Data Engineering with Alteryx

By: Paul Houghton

Overview of this book

Alteryx is a GUI-based development platform for data analytics applications. Data Engineering with Alteryx will help you leverage Alteryx’s code-free aspects, which increase development speed, while still enabling you to make the most of the code-based skills you have. This book will teach you the principles of DataOps and how they can be used with the Alteryx software stack. You’ll build data pipelines with Alteryx Designer and incorporate the error handling and data validation needed for reliable datasets. Next, you’ll take a data pipeline from raw data, transform it into a robust dataset, and publish it to Alteryx Server following a continuous integration process. By the end of this Alteryx book, you’ll be able to build systems for validating datasets, monitoring workflow performance, managing access, and promoting the use of your data sources.
Table of Contents (18 chapters)

  • Part 1: Introduction
  • Part 2: Functional Steps in DataOps
  • Part 3: Governance of DataOps

Using Git and GitHub Actions for continuous integration

Applying standards and maintaining a change history can both be achieved with a Software Version Control (SVC) system such as Git. Version control lets you track each change to a workflow's XML, the text that defines the workflow. Using SVC in addition to the version control that already exists for published Alteryx workflows gives you rich monitoring and collaboration on a project.
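
Because a workflow is stored as XML text, a change between two versions can be viewed as an ordinary text diff, which is essentially what Git records. The following is a minimal sketch of that idea using Python's standard difflib module. The XML strings, the file name workflow.yxmd, and the workflow_diff helper are hypothetical and heavily simplified; a real workflow file contains far more detail.

```python
import difflib

# Two hypothetical, heavily simplified versions of a workflow's XML.
# These strings are illustrative only, not a real Alteryx workflow.
OLD_XML = """<AlteryxDocument>
  <Nodes>
    <Node ToolID="1">
      <Configuration>
        <Expression>[Sales] &gt; 100</Expression>
      </Configuration>
    </Node>
  </Nodes>
</AlteryxDocument>
"""

# The "new" version changes only the filter threshold.
NEW_XML = OLD_XML.replace("[Sales] &gt; 100", "[Sales] &gt; 250")


def workflow_diff(old: str, new: str) -> list:
    """Return a unified diff of two workflow XML strings as a list of lines."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile="workflow.yxmd (previous)",
            tofile="workflow.yxmd (current)",
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Print the changed lines, prefixed with "-" (removed) and "+" (added).
    print("\n".join(workflow_diff(OLD_XML, NEW_XML)))
```

The diff isolates the single changed expression, so a reviewer can see what was edited without opening the workflow in Designer.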

Implementing SVC also enables us to treat analytics as code, which is part of the Delivery pillar of DataOps. This lets us manage both the datasets and the Alteryx workflows that create them, and switch easily between versions should a recovery be needed.
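
As a rough illustration of switching between versions, the sketch below wraps two standard Git commands: git log to list the commits that touched a workflow file, and git checkout to restore that file as it was at a chosen commit. The helper names are hypothetical, and the code assumes Git is installed and that it runs inside an existing repository.

```python
import subprocess


def history_command(path: str) -> list:
    """Build the Git command that lists the commits touching one file."""
    return ["git", "log", "--oneline", "--", path]


def restore_command(commit: str, path: str) -> list:
    """Build the Git command that restores one file as of a given commit."""
    return ["git", "checkout", commit, "--", path]


def run(command: list) -> str:
    """Run a Git command in the current repository and return its output.

    Raises subprocess.CalledProcessError if Git exits with an error.
    """
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout
```

For example, run(history_command("sales_pipeline.yxmd")) would list the recorded versions of a workflow, and run(restore_command("abc1234", "sales_pipeline.yxmd")) would recover the version from that commit. Both the file name and the commit ID here are placeholders.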

Saving workflow changes with Git

To work with Git for version control, you need to understand two parts:

  • The Git control system
  • The repository management process

The Git control system allows for recording and managing...