Data Engineering with Alteryx

By: Paul Houghton

Overview of this book

Alteryx is a GUI-based development platform for data analytics applications. Data Engineering with Alteryx will help you leverage Alteryx’s code-free aspects, which increase development speed, while still enabling you to make the most of the code-based skills you have. This book will teach you the principles of DataOps and how they can be used with the Alteryx software stack. You’ll build data pipelines with Alteryx Designer and incorporate the error handling and data validation needed for reliable datasets. Next, you’ll take the data pipeline from raw data, transform it into a robust dataset, and publish it to Alteryx Server following a continuous integration process. By the end of this book, you’ll be able to build systems for validating datasets, monitoring workflow performance, managing access, and promoting the use of your data sources.
Table of Contents (18 chapters)
Part 1: Introduction
Part 2: Functional Steps in DataOps
Part 3: Governance of DataOps

Securing the data environment

When securing Alteryx inside your data environment, you need to consider where your data lives and how it moves around that environment. Ideally, most of your Alteryx environment sits inside a private network, with only the Gallery accessible to your users and nothing exposed to external parties.
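To make the idea of a mostly private environment concrete, here is a minimal Python sketch of the access rule described above. It is only an illustration of the reasoning, not an Alteryx or AWS configuration: the address ranges, the component names other than the Gallery, and the is_allowed helper are all assumptions made up for this example.

```python
from ipaddress import ip_address, ip_network

# Hypothetical address ranges, chosen only for illustration.
PRIVATE_NETWORK = ip_network("10.0.0.0/16")    # where the Alteryx components sit
USER_NETWORK = ip_network("192.168.10.0/24")   # where your own users connect from

# Which source ranges may reach each component (component names are assumed).
ACCESS_RULES = {
    "gallery": [PRIVATE_NETWORK, USER_NETWORK],  # the only user-facing component
    "worker": [PRIVATE_NETWORK],                 # reachable from inside only
    "database": [PRIVATE_NETWORK],               # reachable from inside only
}


def is_allowed(source_ip: str, component: str) -> bool:
    """Return True if source_ip falls in a range permitted for the component."""
    source = ip_address(source_ip)
    # Unknown components have no rules, so access is denied by default.
    return any(source in network for network in ACCESS_RULES.get(component, []))


if __name__ == "__main__":
    print(is_allowed("192.168.10.25", "gallery"))  # a user reaching the Gallery
    print(is_allowed("192.168.10.25", "worker"))   # a user reaching a worker
    print(is_allowed("8.8.8.8", "gallery"))        # an external party
```

In this sketch, a user address is accepted for the Gallery but rejected for the internal components, and an external address is rejected everywhere. In a real deployment, the same intent would be expressed through your cloud provider's network controls rather than application code.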

This section focuses on deploying Alteryx in Amazon Web Services (AWS). I use AWS as an example because it shows the processes we are trying to achieve when deploying an Alteryx Server, but the same plans apply to any other cloud provider. If you want a more in-depth look at working with AWS, you could read Packt's AWS Certified Solutions Architect – Associate Guide, by Gabriel Ramirez and Stuart Scott (https://www.packtpub.com/product/aws-certified-solutions-architect-associate-guide/9781789130669).

When looking to secure the environment, we will look at two scenarios: a single node server and...