Chapter 6: Data Flows in Azure Data Factory

Azure Data Engineering Cookbook

By : Ahmad Osama, Nagaraj Venkatesan

Overview of this book

Data engineering is one of the fastest-growing job areas, as data engineers are the ones who ensure that data is extracted, provisioned, and of the highest quality for data analysis. This book uses various Azure services to implement and maintain infrastructure that extracts data from multiple sources, then transforms and loads it for data analysis. It takes you through different techniques for performing big data engineering using Microsoft Azure data services. It begins by showing you how Azure Blob storage can be used for storing large amounts of unstructured data and how to use it for orchestrating a data workflow. You'll then work with different Cosmos DB APIs and Azure SQL Database. Moving on, you'll discover how to provision an Azure Synapse database and find out how to ingest and analyze data in Azure Synapse. As you advance, you'll cover the design and implementation of batch processing solutions using Azure Data Factory, and understand how to manage, maintain, and secure Azure Data Factory pipelines. You'll also design and implement batch processing solutions using Azure Databricks, and then manage and secure Azure Databricks clusters and jobs. In the concluding chapters, you'll learn how to process streaming data using Azure Stream Analytics and Data Explorer. By the end of this book, you'll have the knowledge you need to orchestrate batch and real-time ETL workflows in Microsoft Azure.

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the color images

Conventions used

Sections

Get in touch

Reviews

Chapter 1: Working with Azure Blob Storage

Technical requirements

Provisioning an Azure storage account using the Azure portal

Provisioning an Azure storage account using PowerShell

Creating containers and uploading files to Azure Blob storage using PowerShell

Managing blobs in Azure Storage using PowerShell

Managing an Azure blob snapshot in Azure Storage using PowerShell

Configuring blob life cycle management for blob objects using the Azure portal

Configuring a firewall for an Azure storage account using the Azure portal

Configuring virtual networks for an Azure storage account using the Azure portal

Configuring a firewall for an Azure storage account using PowerShell

Configuring virtual networks for an Azure storage account using PowerShell

Creating an alert to monitor an Azure storage account

Securing an Azure storage account with SAS using PowerShell

Chapter 2: Working with Relational Databases in Azure

Provisioning and connecting to an Azure SQL database using PowerShell

Provisioning and connecting to an Azure PostgreSQL database using the Azure CLI

Provisioning and connecting to an Azure MySQL database using the Azure CLI

Implementing active geo-replication for an Azure SQL database using PowerShell

Implementing an auto-failover group for an Azure SQL database using PowerShell

Implementing vertical scaling for an Azure SQL database using PowerShell

Implementing an Azure SQL database elastic pool using PowerShell

Monitoring an Azure SQL database using the Azure portal

Chapter 3: Analyzing Data with Azure Synapse Analytics

Technical requirements

Provisioning and connecting to an Azure Synapse SQL pool using PowerShell

Pausing or resuming a Synapse SQL pool using PowerShell

Scaling an Azure Synapse SQL pool instance using PowerShell

Loading data into a SQL pool using PolyBase with T-SQL

Loading data into a SQL pool using the COPY INTO statement

Implementing workload management in an Azure Synapse SQL pool

Optimizing queries using materialized views in Azure Synapse Analytics

Chapter 4: Control Flow Activities in Azure Data Factory

Technical requirements

Implementing control flow activities

Implementing control flow activities – Lookup and If activities

Triggering a pipeline in Azure Data Factory

Chapter 5: Control Flow Transformation and the Copy Data Activity in Azure Data Factory

Technical requirements

Implementing HDInsight Hive and Pig activities

Implementing an Azure Functions activity

Implementing a Data Lake Analytics U-SQL activity

Copying data from Azure Data Lake Gen2 to an Azure Synapse SQL pool using the copy activity

Copying data from Azure Data Lake Gen2 to Azure Cosmos DB using the copy activity

Chapter 6: Data Flows in Azure Data Factory

Technical requirements

Implementing incremental data loading with a mapping data flow

Implementing a wrangling data flow

Chapter 7: Azure Data Factory Integration Runtime

Technical requirements

Configuring a self-hosted IR

Configuring a shared self-hosted IR

Migrating an SSIS package to Azure Data Factory

Executing an SSIS package with an on-premises data store

Chapter 8: Deploying Azure Data Factory Pipelines

Technical requirements

Configuring the development, test, and production environments

Deploying Azure Data Factory pipelines using the Azure portal and ARM templates

Automating Azure Data Factory pipeline deployment using Azure DevOps

Chapter 9: Batch and Streaming Data Processing with Azure Databricks

Technical requirements

Configuring the Azure Databricks environment

Transforming data using Python

Transforming data using Scala

Working with Delta Lake

Processing structured streaming data with Azure Databricks

Why subscribe?

Other Books You May Enjoy

Packt is searching for authors like you

Leave a review - let other readers know what you think
