Hands-On Data Warehousing with Azure Data Factory

By: Christian Cote, Michelle Gutzait, Giuseppe Ciaburro

Overview of this book

ETL is one of the essential techniques in data processing. Because data is everywhere, ETL will remain a vital process for handling data from different sources. Hands-On Data Warehousing with Azure Data Factory starts with the basic concepts of data warehousing and the ETL process. You will learn how Azure Data Factory (ADF) and SSIS can be used to understand the key components of an ETL solution. With the help of practical examples, you will go through the different Azure services that ADF and SSIS can use, such as Azure Data Lake Analytics, Machine Learning, and Databricks Spark. You will explore how to design and implement hybrid ETL solutions using different integration services with a step-by-step approach. Once you get to grips with all this, you will use Power BI to interact with data coming from different sources in order to reveal valuable insights. By the end of this book, you will have learned not only how to build your own ETL solutions but also how to address the key challenges faced while building them.
Table of Contents (12 chapters)

Types of blobs

There are several types of blobs, each suited to a different use. The next sections briefly describe the various blob types. For more information, please see the following link:

Block blobs

Block blobs are used by most Azure data transfers. They can store application files (CSV, ZIP, and so on), tables used by NoSQL applications, and queues used by streaming services such as Azure ML (short for Azure Machine Learning). Throughout the examples in this book, we'll use block blobs as storage for some of our data transfers.
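As an illustration of storing an application file in a block blob, here is a minimal Python sketch. It assumes the `azure-storage-blob` package and a client that behaves like its `ContainerClient`; the helper names `rows_to_csv_bytes` and `upload_csv_as_block_blob` are hypothetical and are not taken from the book.

```python
import csv
import io


def rows_to_csv_bytes(rows):
    """Serialize an iterable of rows (lists of values) to UTF-8 CSV bytes."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue().encode("utf-8")


def upload_csv_as_block_blob(container_client, blob_name, rows):
    """Upload rows as a CSV block blob and return the number of bytes sent.

    Assumption: container_client behaves like
    azure.storage.blob.ContainerClient (only upload_blob is called).
    """
    payload = rows_to_csv_bytes(rows)
    container_client.upload_blob(
        name=blob_name,
        data=payload,
        blob_type="BlockBlob",  # block blob is also the SDK default
        overwrite=True,         # replace an existing blob with the same name
    )
    return len(payload)
```

With the real SDK, a client could be obtained with `ContainerClient.from_connection_string(conn_str, container_name)`, where both arguments come from your own storage account. Passing the client in as a parameter keeps the helper easy to exercise with a stand-in object.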

Page blobs

Page blobs are used for large file storage. Azure Virtual Machines (VMs) use this type of blob to store their disk images.
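One practical detail, not covered in the text above, is that a page blob is organized in 512-byte pages, so its size has to be a multiple of 512 bytes. The following sketch shows how a size could be rounded up before creating a page blob; the helper name `aligned_page_blob_size` is hypothetical.

```python
PAGE_SIZE = 512  # page blobs are addressed in 512-byte pages


def aligned_page_blob_size(content_length):
    """Round content_length up to the next multiple of PAGE_SIZE."""
    if content_length < 0:
        raise ValueError("content_length must be non-negative")
    # Ceiling division, then scale back up to bytes.
    return -(-content_length // PAGE_SIZE) * PAGE_SIZE
```

A disk image whose length is not already aligned would typically be padded up to the returned size before upload.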

Replication of storage

Replication of storage determines how blobs are copied to protect their contents in case of hardware failure. When we create the storage account that holds our blobs, one of the options we have to select is the replication type:

  • LRS (short for Locally Redundant Storage): This storage replicates...