Hands-On Data Warehousing with Azure Data Factory

By: Christian Cote, Michelle Gutzait, Giuseppe Ciaburro

Overview of this book

ETL is one of the essential techniques in data processing. Because data is everywhere, ETL will remain a vital process for handling data from different sources. Hands-On Data Warehousing with Azure Data Factory starts with the basic concepts of data warehousing and the ETL process. You will learn how Azure Data Factory (ADF) and SSIS can be used to understand the key components of an ETL solution. With the help of practical examples, you will go through the different Azure services that ADF and SSIS can use, such as Azure Data Lake Analytics, Machine Learning, and Databricks Spark. You will explore how to design and implement hybrid ETL solutions using different integration services with a step-by-step approach. Once you get to grips with all this, you will use Power BI to interact with data coming from different sources in order to reveal valuable insights. By the end of this book, you will have learned not only how to build your own ETL solutions but also how to address the key challenges faced while building them.

SQL Azure database


We'll now set up the database that our factory will copy data from. We'll use the Wide World Importers sample database, which can be downloaded as a BACPAC file from: https://github.com/Microsoft/sql-server-samples/releases/tag/wide-world-importers-v1.0.

A BACPAC is a file that contains a database's structures and data, similar to a database backup. The difference is that a BACPAC is only a snapshot of the database at a specific point in time. A database backup offers much more: the database can be restored to within the last few seconds. A backup can also be incremental, that is, contain only the data and structure changes made since the last backup, whereas a BACPAC always contains all the data.
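To make the "structures and data" description concrete, here is a minimal Python sketch that looks inside a BACPAC. It relies on the fact that a BACPAC is a ZIP archive in which model.xml describes the database structures and entries under Data/ hold the table rows. The function name summarize_bacpac and the local file name WideWorldImporters-Standard.bacpac are assumptions made for illustration; adjust the path to wherever you saved the download.

```python
import zipfile
from pathlib import Path


def summarize_bacpac(path):
    """Return (has_schema, data_file_count) for a BACPAC file.

    Assumes the usual BACPAC layout: a ZIP archive with model.xml
    (database structures) and table data stored under Data/.
    """
    with zipfile.ZipFile(path) as archive:
        names = archive.namelist()
    has_schema = "model.xml" in names
    # Count files (not directory entries) stored under Data/.
    data_files = [
        name for name in names
        if name.startswith("Data/") and not name.endswith("/")
    ]
    return has_schema, len(data_files)


if __name__ == "__main__":
    # Hypothetical local path; not stated in the text.
    bacpac = Path("WideWorldImporters-Standard.bacpac")
    if bacpac.exists():
        has_schema, data_file_count = summarize_bacpac(bacpac)
        print(f"schema present: {has_schema}, data files: {data_file_count}")
    else:
        print(f"{bacpac} not found; download it first")
```

Because the archive always carries the full schema and every row, the count of data files grows with the whole database rather than with recent changes, which is the contrast with an incremental backup drawn above.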

The version we're using is the standard one, as shown in the following screenshot:

We'll now upload the BACPAC to the storage that we created in the previous section:

  1. Open Microsoft Azure Storage Explorer and right-click on the adfv2book storage account to create a container called database-bacpac, as shown in the following screenshot:

  2. Now, click Upload...