Hands-On Data Warehousing with Azure Data Factory

By: Christian Cote, Michelle Gutzait, Giuseppe Ciaburro

Overview of this book

ETL is one of the essential techniques in data processing. Given that data is everywhere, ETL remains a vital process for handling data from different sources. Hands-On Data Warehousing with Azure Data Factory starts with the basic concepts of data warehousing and the ETL process. You will learn how Azure Data Factory and SSIS can be used to understand the key components of an ETL solution. You will go through different services offered by Azure that can be used by ADF and SSIS, such as Azure Data Lake Analytics, Machine Learning, and Databricks Spark, with the help of practical examples. You will explore how to design and implement hybrid ETL solutions using different integration services with a step-by-step approach. Once you get to grips with all this, you will use Power BI to interact with data coming from different sources in order to reveal valuable insights. By the end of this book, you will not only know how to build your own ETL solutions but also be able to address the key challenges faced while building them.

Preface

Extract, Transform, and Load (ETL) is one of the essential techniques in data processing. Given that data is everywhere, ETL remains an essential way to handle data from different sources.

This book starts with the basic concepts of data warehousing and ETL. You will learn how Azure Data Factory (ADF) and SQL Server Integration Services (SSIS) can be used to understand the key components of an ETL solution. You will go through different services offered by Azure that can be used by ADF and SSIS, such as Azure Data Lake Analytics, machine learning, and Databricks Spark, with the help of practical examples. You will explore how to design and implement hybrid ETL solutions using different integration services in a step-by-step approach. Once you get to grips with all this, you will use Power BI to interact with data coming from different sources in order to reveal valuable insights.

By the end of this book, you will not only know how to build your own ETL solutions, but will also be able to address the key challenges that are faced while building them.

Who this book is for

This book is for you if you are a software professional who develops and implements ETL solutions using Microsoft SQL Server or Azure Cloud. It will be an added advantage if you are a software engineer, DW/ETL architect, or ETL developer and know how to create a new ETL implementation or enhance an existing one with Azure Data Factory or SSIS.

What this book covers

Chapter 1, The Modern Data Warehouse, covers the various storage options available in Microsoft Azure that will help us to set up our Azure data factory.

Chapter 2, Getting Started with Our First Data Factory, uses the data factory to move data from Azure SQL to Azure storage.

Chapter 3, SSIS Lift and Shift, digs further into the various services available in Azure, as well as how we can integrate an existing SSIS solution into the factory.

Chapter 4, Azure Data Lake, primarily focuses on the components of the Azure Data Lake and provides a basic implementation of those components.

Chapter 5, Machine Learning on the Cloud, introduces the different machine learning algorithms and the tools that Microsoft Azure Machine Learning Studio provides to handle them.

Chapter 6, Introduction to Azure Databricks, shows how Azure Data Factory can trigger a Databricks notebook.

Chapter 7, Reporting on the Modern Data Warehouse, explains how we can integrate this data into a Power BI report.

To get the most out of this book

Download the example code files

You can download the example code files for this book from your account at www.packtpub.com. If you purchased this book elsewhere, you can visit www.packtpub.com/support and register to have the files emailed directly to you.

You can download the code files by following these steps:

  1. Log in or register at www.packtpub.com.
  2. Select the SUPPORT tab.
  3. Click on Code Downloads & Errata.
  4. Enter the name of the book in the Search box and follow the onscreen instructions.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

  • WinRAR/7-Zip for Windows
  • Zipeg/iZip/UnRarX for Mac
  • 7-Zip/PeaZip for Linux
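
If you prefer to script the extraction step instead of using one of the tools above, the following is a minimal sketch in Python using only the standard library. It assumes the downloaded bundle is a regular .zip archive; the function name extract_code_bundle and the file names in the usage note are hypothetical and are not part of the book's code.

```python
import zipfile
from pathlib import Path


def extract_code_bundle(archive_path, target_dir):
    """Extract a downloaded .zip code bundle and return the extracted file paths.

    Hypothetical helper for illustration; it assumes the bundle is a plain
    .zip archive and is not part of the book's code.
    """
    archive = Path(archive_path)
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)

    with zipfile.ZipFile(archive) as bundle:
        # testzip() returns the name of the first corrupt member, or None.
        corrupt = bundle.testzip()
        if corrupt is not None:
            raise zipfile.BadZipFile(f"corrupt member in archive: {corrupt}")
        bundle.extractall(target)
        names = bundle.namelist()

    # Directory entries end with "/"; report only regular files.
    return sorted(target / name for name in names if not name.endswith("/"))
```

For example, calling extract_code_bundle("code-bundle.zip", "code-bundle") would unpack a hypothetical code-bundle.zip into a code-bundle folder and return the paths of the extracted files. Substitute the name of the archive you actually downloaded.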

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Hands-On-Data-Warehousing-with-Azure-Data-Factory. In case there's an update to the code, it will be updated on the existing GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Download the color images

We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: https://www.packtpub.com/sites/default/files/downloads/HandsOnDataWarehousingwithAzureDataFactory_ColorImages.pdf.

Conventions used

There are a number of text conventions used throughout this book.

CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "When we click on it, the adfv2book blade opens."

A block of code is set as follows:

SELECT [CustomerID] 
      ,[CustomerName] 
      ,[CustomerCategoryName] 
      ,[PrimaryContact] 
      ,[AlternateContact] 
      ,[PhoneNumber] 
FROM [Website].[Customers] 

Bold: Indicates a new term, an important word, or words that you see onscreen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: "Select Databases and choose SQL Database, as shown in the following screenshot."

Note

Warnings or important notes appear like this.

Tip

Tips and tricks appear like this.

Get in touch

Feedback from our readers is always welcome.

General feedback: Email [email protected] and mention the book title in the subject of your message. If you have questions about any aspect of this book, please email us at [email protected].

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/submit-errata, select your book, click on the Errata Submission Form link, and enter the details.

Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Reviews

Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!

For more information about Packt, please visit packtpub.com.