Azure Data Engineering Cookbook - Second Edition

By: Nagaraj Venkatesan, Ahmad Osama

Overview of this book

The famous quote 'Data is the new oil' seems more true every day, as the key to most organizations' long-term success lies in extracting insights from raw data. One of the major challenges organizations face in extracting value from data is building performant data engineering pipelines for data ingestion, storage, processing, and visualization. This second edition of the immensely successful book by Ahmad Osama covers several recent enhancements in Azure data engineering and shares approximately 80 useful recipes for common scenarios in building data engineering pipelines in Microsoft Azure. You’ll explore recipes from Azure Synapse Analytics workspaces Gen 2 and get to grips with Synapse Spark pools, SQL Serverless pools, Synapse integration pipelines, and Synapse data flows. You’ll also learn Synapse SQL Pool optimization techniques. Besides the Synapse enhancements, you’ll discover helpful tips on managing Azure SQL Database and learn about security, high availability, and performance monitoring. Finally, the book takes you through overall data engineering pipeline management, focusing on monitoring using Log Analytics and tracking data lineage using Azure Purview. By the end of this book, you’ll be able to build robust data engineering pipelines and will have a go-to guide to return to.
Table of Contents (16 chapters)

Securing and Monitoring Data in Azure Data Lake

Data Lake forms the key storage layer for data engineering pipelines, so securing and monitoring Data Lake accounts are key aspects of Data Lake maintenance. This chapter focuses on configuring security controls such as firewalls, encryption, and private links for a Data Lake account. By the end of this chapter, you will have learned how to configure a firewall, virtual network, and private link to secure a Data Lake account, encrypt the Data Lake using Azure Key Vault, and monitor key user actions in the Data Lake.

We will be covering the following recipes in this chapter:

  • Configuring a firewall for an Azure Data Lake account using the Azure portal
  • Configuring virtual networks for an Azure Data Lake account using the Azure portal
  • Configuring private links for an Azure Data Lake account
  • Configuring encryption using Azure Key Vault for Azure Data Lake
  • Accessing Blob storage accounts using managed identities
  • Creating an alert to monitor an Azure Data Lake account
  • Securing an Azure Data Lake account with an SAS using PowerShell
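
As a rough conceptual illustration of the firewall recipe listed above, a storage firewall can be thought of as a default action combined with an allow-list of client IP addresses or ranges. The sketch below models only that idea with the Python standard library. It is not the Azure API, and `is_allowed`, `rules`, and the sample addresses are hypothetical names and values chosen for illustration.

```python
from ipaddress import ip_address, ip_network


def is_allowed(client_ip, allowed_ranges, default_action="Deny"):
    """Return True if a request from client_ip would pass the allow-list.

    Conceptual model only: a default action plus a list of permitted
    IP addresses or CIDR ranges. Names here are illustrative assumptions.
    """
    if default_action == "Allow":
        # With a default of "Allow", no IP filtering is applied.
        return True
    addr = ip_address(client_ip)
    # A bare address such as "198.51.100.7" is treated as a /32 network.
    return any(addr in ip_network(r, strict=False) for r in allowed_ranges)


# Hypothetical allow-list: one CIDR range and one single address.
rules = ["203.0.113.0/24", "198.51.100.7"]

print(is_allowed("203.0.113.42", rules))  # inside the /24 range
print(is_allowed("192.0.2.10", rules))    # not covered by any rule
```

The default-deny choice reflects the usual hardening approach: access is refused unless a rule explicitly permits the caller.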