Data Engineering with Alteryx

By: Paul Houghton

Overview of this book

Alteryx is a GUI-based development platform for data analytic applications. Data Engineering with Alteryx will help you leverage Alteryx's code-free features, which increase development speed, while still letting you make the most of the code-based skills you have. This book will teach you the principles of DataOps and how to apply them with the Alteryx software stack. You'll build data pipelines with Alteryx Designer and incorporate the error handling and data validation needed for reliable datasets. Next, you'll take a data pipeline from raw data, transform it into a robust dataset, and publish it to Alteryx Server following a continuous integration process. By the end of this book, you'll be able to build systems for validating datasets, monitoring workflow performance, managing access, and promoting the use of your data sources.
Table of Contents (18 chapters)
Part 1: Introduction
Part 2: Functional Steps in DataOps
Part 3: Governance of DataOps

What is Alteryx Connect, and how does it help DataOps?

Alteryx Connect is the data catalog and collaboration hub for your data assets. Connect provides a central point for describing datasets and understanding the relationships between them. It also gives departments a shared place to exchange knowledge about their datasets.

One example of a data catalog is Kaggle Datasets (https://www.kaggle.com/datasets). This site is a central repository for discovering Kaggle datasets and provides the context information for each dataset that has been published. The catalog helps all Kaggle users understand what the datasets are and what the fields mean, so that their analyses are comparable.
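To make the idea of a data catalog concrete, here is a minimal sketch in Python of a toy in-memory catalog that stores dataset descriptions and field meanings and supports keyword discovery. It is not Alteryx Connect's or Kaggle's API. The names Catalog, DatasetEntry, and FieldInfo, and the sample dataset sales_orders, are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FieldInfo:
    """Documentation for one column of a dataset (hypothetical structure)."""
    name: str
    dtype: str
    description: str


@dataclass
class DatasetEntry:
    """Catalog record describing one dataset and who owns it."""
    name: str
    owner: str
    description: str
    fields: List[FieldInfo] = field(default_factory=list)


class Catalog:
    """Toy in-memory catalog: one central place to register and find datasets."""

    def __init__(self) -> None:
        self._entries: Dict[str, DatasetEntry] = {}

    def register(self, entry: DatasetEntry) -> None:
        # A later registration with the same name replaces the earlier one.
        self._entries[entry.name] = entry

    def search(self, keyword: str) -> List[str]:
        """Return sorted names of datasets whose metadata mentions the keyword."""
        kw = keyword.lower()
        hits = []
        for entry in self._entries.values():
            texts = [entry.name, entry.description]
            texts += [f.name for f in entry.fields]
            texts += [f.description for f in entry.fields]
            if any(kw in text.lower() for text in texts):
                hits.append(entry.name)
        return sorted(hits)

    def describe_field(self, dataset: str, field_name: str) -> str:
        """Look up the documented meaning of a single field."""
        for f in self._entries[dataset].fields:
            if f.name == field_name:
                return f.description
        raise KeyError(f"{field_name!r} is not documented for {dataset!r}")


# Example usage with made-up metadata.
catalog = Catalog()
catalog.register(DatasetEntry(
    name="sales_orders",
    owner="finance",
    description="One row per confirmed customer order.",
    fields=[FieldInfo("order_total", "float", "Order value in USD, including tax.")],
))
print(catalog.search("order"))
print(catalog.describe_field("sales_orders", "order_total"))
```

The point of the sketch is that descriptions live next to the dataset name in one shared place, so two analysts looking up order_total read the same definition. A real catalog adds persistence, access control, and lineage on top of this.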

The Alteryx Connect product page (https://www.alteryx.com/products/alteryx-connect) describes Connect as follows:

A powerful data catalog combined with advanced analytics empowers everyone to quickly find, manage, understand, and collaborate across departments...