Hands-On Data Warehousing with Azure Data Factory

By: Christian Cote, Michelle Gutzait, Giuseppe Ciaburro

Overview of this book

ETL is one of the essential techniques in data processing. Because data is everywhere, ETL will remain a vital process for handling data from different sources. Hands-On Data Warehousing with Azure Data Factory starts with the basic concepts of data warehousing and the ETL process. You will learn how to use Azure Data Factory (ADF) and SSIS, and come to understand the key components of an ETL solution. With the help of practical examples, you will go through the different Azure services that ADF and SSIS can use, such as Azure Data Lake Analytics, Machine Learning, and Databricks Spark. You will explore, step by step, how to design and implement hybrid ETL solutions using different integration services. Once you get to grips with all this, you will use Power BI to interact with data coming from different sources in order to reveal valuable insights. By the end of this book, you will have learned not only how to build your own ETL solutions but also how to address the key challenges faced while building them.
Table of Contents (12 chapters)

Machine learning tasks


When we first apply artificial intelligence to data analysis, the first problem we face is choosing the most appropriate algorithm for a specific problem. A survey of the available algorithms quickly shows that the choice is not obvious and requires careful investigation.

A first step is to specify the task that our machine learning algorithm will have to perform. Here we can rest assured: there are only a handful of tasks to consider, even though different approaches and algorithms are available for each of them.

In fact, although machine learning algorithms can take the same data as input, what they aim to achieve differs. They can generally be classified into a few groups based on the tasks they were designed to solve. The typical machine learning tasks are as follows:

  • Regression
  • Classification
  • Clustering
  • Dimensionality...
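
To make the distinction between tasks concrete, the following minimal sketch applies the first three tasks in the list to the same toy input. Everything in it is an assumption made for illustration: the data (`hours`, `cost`, `label`), the function names, and the deliberately simple methods (a least-squares line, a nearest-centroid classifier, and a two-cluster k-means in one dimension). It is not taken from the book and is not how Azure Machine Learning implements these tasks.

```python
from statistics import mean

# Hypothetical toy data: one input feature and two possible targets.
hours = [1.0, 2.0, 3.0, 8.0, 9.0, 10.0]
cost = [2.0, 4.0, 6.0, 16.0, 18.0, 20.0]                # continuous target
label = ["low", "low", "low", "high", "high", "high"]   # categorical target


def fit_line(xs, ys):
    """Regression: least-squares line y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return slope, y_bar - slope * x_bar


def fit_centroids(xs, labels):
    """Classification (training): one centroid (mean) per known class."""
    return {c: mean(x for x, l in zip(xs, labels) if l == c) for c in set(labels)}


def classify(centroids, x):
    """Classification (prediction): pick the class with the nearest centroid."""
    return min(centroids, key=lambda c: abs(centroids[c] - x))


def two_means_1d(xs, iterations=10):
    """Clustering: split unlabeled points into two groups (k-means, k=2)."""
    centers = [min(xs), max(xs)]  # crude starting centers
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for x in xs:
            nearest = 0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
            groups[nearest].append(x)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return groups


if __name__ == "__main__":
    slope, intercept = fit_line(hours, cost)
    print("regression predicts a number:", slope * 5.0 + intercept)

    centroids = fit_centroids(hours, label)
    print("classification predicts a label:", classify(centroids, 4.0))

    print("clustering finds groups without labels:", two_means_1d(hours))
```

The same `hours` values feed all three functions. What changes is the goal: regression needs a numeric target, classification needs known labels, and clustering uses neither.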