Hands-On Data Warehousing with Azure Data Factory

By: Christian Cote, Michelle Gutzait, Giuseppe Ciaburro

Overview of this book

ETL is one of the essential techniques in data processing. Given that data is everywhere, ETL will always be a vital process for handling data from different sources. Hands-On Data Warehousing with Azure Data Factory starts with the basic concepts of data warehousing and the ETL process. You will learn how Azure Data Factory and SSIS can be used to understand the key components of an ETL solution. With the help of practical examples, you will go through the different Azure services that can be used by ADF and SSIS, such as Azure Data Lake Analytics, Machine Learning, and Databricks Spark. You will explore how to design and implement hybrid ETL solutions using different integration services with a step-by-step approach. Once you get to grips with all this, you will use Power BI to interact with data coming from different sources in order to reveal valuable insights. By the end of this book, you will have learned not only how to build your own ETL solutions but also how to address the key challenges faced while building them.
Table of Contents (12 chapters)

SSIS in ADF


SQL Server Integration Services (SSIS) has been Microsoft's preferred ETL tool for more than a decade. Many enterprises have used SSIS to load their on-premises data warehouses since its inception in SQL Server 2005.

In the last couple of years, IT departments have had to deal with different kinds of data and the specific toolsets needed to process them. Since 2015, the Azure Feature Pack (https://docs.microsoft.com/en-us/sql/integration-services/azure-feature-pack-for-integration-services-ssis?view=sql-server-2017) has allowed SSIS to access cloud data from on-premises ETL servers. However, issues occur when most of the ETL runs in the cloud and SSIS is only a small part of the chain. Until now, it was also very complex to use ADF V1 as the orchestrator in the cloud with some SSIS package calls in the pipeline.

The following sections describe how on-premises SSIS can be leveraged to interact with cloud data in ADF V2.

Sample setup

The first thing...