Data Engineering with Alteryx

By: Paul Houghton

Overview of this book

Alteryx is a GUI-based development platform for data analytic applications. Data Engineering with Alteryx will help you leverage Alteryx’s code-free aspects, which increase development speed, while still letting you make the most of the code-based skills you have. This book will teach you the principles of DataOps and how they can be used with the Alteryx software stack. You’ll build data pipelines with Alteryx Designer and incorporate the error handling and data validation needed for reliable datasets. Next, you’ll take a data pipeline from raw data, transform it into a robust dataset, and publish it to Alteryx Server following a continuous integration process. By the end of this book, you’ll be able to build systems for validating datasets, monitoring workflow performance, managing access, and promoting the use of your data sources.
Table of Contents (18 chapters)
Part 1: Introduction
Part 2: Functional Steps in DataOps
Part 3: Governance of DataOps

Summary

In this chapter, we learned why Alteryx Connect is a good tool for managing our data asset discovery and how that helps our DataOps deployment. We learned how to use the Connect navigation and search functionality to find and understand our data assets. We discovered the community knowledge-sharing functions available in Connect to build a rich understanding of what data resources exist and how we can access them.

We then learned the methods for populating our data dictionary and explored three ways of populating the Alteryx Connect data catalog. The first was to run the loader analytic apps directly in Connect, using the Alteryx Server API to run the workflow in the background. The second was to deploy the Connect loader apps in Alteryx Designer and schedule them with the methods we learned in Chapter 2, Data Engineering with Alteryx. The final technique was to update the Connect data catalog directly with the APIs. This provided...