Learning Alteryx

Overview of this book

Alteryx, as a leading data blending and advanced data analytics platform, has taken self-service data analytics to the next level. Companies worldwide often struggle to prepare and blend massive datasets, a task that is time-consuming for analysts. Alteryx solves these problems with a repeatable workflow designed to quickly clean, prepare, blend, and join your data in a seamless manner. This book will set you on a self-service data analytics journey that will help you create efficient workflows using Alteryx, without any coding involved. It will empower you and your organization to make well-informed decisions with the help of deeper business insights from the data.

Starting with the fundamentals of using Alteryx, such as data preparation and blending, you will delve into more advanced concepts such as performing predictive analytics. You will also learn how to use Alteryx’s features to share the insights gained with the relevant decision makers. To ensure consistency, we will be using data from the healthcare domain throughout this book. The knowledge you gain from this book will guide you to confidently solve real-life problems related to Business Intelligence. Whether you are a novice with Alteryx or an experienced data analyst keen to explore Alteryx’s self-service analytics features, this book will be the perfect companion for you.

The Alteryx Designer architecture


The Alteryx Designer is an intuitive drag-and-drop user interface in which users drag tools from a Tool Palette onto the canvas. These tools are used to create Alteryx workflows, macros, and applications, which users can run instantly to process data. Alteryx Designer, written primarily in C#, processes workflows using a local instance of the Alteryx Engine. Users may publish their workflows, macros, and applications to the Alteryx Analytics gallery, where others can download and run them. Workflows can be scheduled to run at fixed times or at recurring intervals through an Alteryx Server deployment. Alteryx Designer includes a Scheduler interface for executing scheduled workflows.

The Alteryx Engine, written in C++, runs a workflow built in Alteryx Designer and produces its output. Once the workflow is running, the Engine processes the data sources in memory. If processing surpasses memory limitations, data is written to temporary files on disk, and those files are deleted once processing is complete. We installed Alteryx by selecting the option to install the suite of R tools used for predictive analysis.
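The in-memory processing with temporary files on disk described above can be illustrated with a minimal Python sketch. This is not the Alteryx Engine's implementation: the `SpillBuffer` class, its `max_in_memory` record-count limit, and the use of `pickle` for the temporary files are all assumptions made purely for illustration.

```python
import os
import pickle
import tempfile


class SpillBuffer:
    """Hold records in memory up to a limit, then spill batches to temp files.

    Hypothetical illustration only; names and limits are not Alteryx's.
    """

    def __init__(self, max_in_memory):
        self.max_in_memory = max_in_memory  # assumed record-count limit
        self._memory = []
        self._spill_paths = []

    def add(self, record):
        self._memory.append(record)
        if len(self._memory) > self.max_in_memory:
            self._spill()

    def _spill(self):
        # Write the current in-memory batch to a temporary file and free the memory.
        fd, path = tempfile.mkstemp(suffix=".spill")
        with os.fdopen(fd, "wb") as handle:
            pickle.dump(self._memory, handle)
        self._spill_paths.append(path)
        self._memory = []

    def records(self):
        # Yield spilled batches first (in write order), then what is still in memory.
        for path in self._spill_paths:
            with open(path, "rb") as handle:
                yield from pickle.load(handle)
        yield from self._memory

    def close(self):
        # Delete the temporary files once processing is complete.
        for path in self._spill_paths:
            os.remove(path)
        self._spill_paths = []
        self._memory = []
```

A caller would add records as they are read, iterate over `records()` to process them, and call `close()` at the end so that no temporary files are left behind.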

Alteryx installs the R tools, used for statistical computing and graphics, through the R program and provides a connection between the Alteryx Engine and the R Engine. This connection allows the R tools to function in a workflow. The Alteryx Engine communicates with the R Engine through a command line.

The Alteryx Engine may execute the following tasks depending on the workflow:

  • Read or write input/output files and one or more databases
  • Process external runtime commands
  • Send email to the email server through SMTP
  • Upload or download data from the web

Let's dive a little deeper into how the Alteryx Engine gets deployed across multiple servers. The Alteryx Service, written in C++ with C# wrappers, allows the Alteryx Engine to be deployed across servers and handles the execution, management, and scheduling of workflows. This is accomplished by using a Controller-Worker architecture: the server uses the Controller to manage the jobs scheduled to run, and the Worker performs the work. The Alteryx Persistence tier stores the Alteryx application files and job queues that the Alteryx Service needs to perform its operations.

The Alteryx Service Controller is responsible for managing the service settings and delegating work to the Alteryx Service Workers. When jobs are received from the Scheduler, the Controller looks them up in the persistence layer, where all queued jobs are maintained, and then delegates them to the Workers. This is where the Alteryx Service Worker comes into action: the Worker runs the job and produces the output. How many Workers are needed to run the jobs depends on system performance.
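The delegation flow described above can be sketched in a few lines of Python. This is a simplified model, not the Alteryx Service itself: the class names `PersistenceLayer`, `Controller`, and `Worker`, the job names, and the round-robin choice of Worker are all assumptions made for illustration.

```python
from collections import deque


class PersistenceLayer:
    """Stand-in for the persistence tier: keeps the queue of scheduled jobs."""

    def __init__(self):
        self.queue = deque()

    def enqueue(self, job_name):
        self.queue.append(job_name)


class Worker:
    """Runs a job and produces its output."""

    def __init__(self, name):
        self.name = name

    def run(self, job_name):
        # A real Worker would hand the workflow to the Engine; here we only label it.
        return f"{job_name} completed by {self.name}"


class Controller:
    """Reads queued jobs from the persistence layer and delegates them to Workers."""

    def __init__(self, persistence, workers):
        self.persistence = persistence
        self.workers = workers

    def dispatch_all(self):
        # Round-robin delegation is an assumption made for this sketch.
        results = []
        index = 0
        while self.persistence.queue:
            job = self.persistence.queue.popleft()
            worker = self.workers[index % len(self.workers)]
            results.append(worker.run(job))
            index += 1
        return results


if __name__ == "__main__":
    store = PersistenceLayer()
    for job in ["daily_claims", "weekly_report", "monthly_summary"]:
        store.enqueue(job)
    controller = Controller(store, [Worker("worker-1"), Worker("worker-2")])
    for line in controller.dispatch_all():
        print(line)
```

Keeping the queue in a separate object mirrors the point made in the text: the Controller does not hold the jobs itself, it only reads them from the persistence layer and hands them to whichever Worker is next.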

Note

If the Worker is not on the same machine as the Controller, the Controller's name or IP address and the security token for that Controller must be specified so that the Controller and the Worker can communicate.
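As a rough illustration of this rule, the following Python sketch checks whether a Worker has what it needs to reach its Controller. The `WorkerSettings` fields and the `missing_settings` helper are hypothetical names invented for this example; they do not correspond to actual Alteryx configuration keys.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WorkerSettings:
    # All field names here are hypothetical, chosen only for this sketch.
    same_machine_as_controller: bool
    controller_host: Optional[str] = None   # Controller's name or IP address
    controller_token: Optional[str] = None  # security token for that Controller


def missing_settings(settings: WorkerSettings) -> List[str]:
    """Return the names of required settings that have not been provided."""
    if settings.same_machine_as_controller:
        # A Worker on the Controller's own machine needs no remote details.
        return []
    missing = []
    if not settings.controller_host:
        missing.append("controller_host")
    if not settings.controller_token:
        missing.append("controller_token")
    return missing
```

A remote Worker configured with only a host name would be reported as missing `controller_token`, while a Worker on the Controller's own machine passes the check with no extra settings.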

The Alteryx architecture process flow diagram runs from the drag-and-drop workflow tools through to the results executed by the Alteryx Engine: