Mastering SQL Server 2017

By: Miloš Radivojević, Dejan Sarka, William Durkin, Christian Cote, Matija Lah

Overview of this book

Microsoft SQL Server 2017 uses the power of R and Python for machine learning and supports container-based deployment on Windows and Linux. By learning how to use the features of SQL Server 2017 effectively, you can build scalable apps and easily perform data integration and transformation. You'll start by brushing up on the features of SQL Server 2017. This Learning Path will then demonstrate how you can use Query Store, columnstore indexes, and In-Memory OLTP in your apps. You'll also learn to integrate Python code in SQL Server and to implement graph databases for development and testing. Next, you'll get up to speed with designing and building SQL Server Integration Services (SSIS) data warehouse packages using SQL Server Data Tools. Toward the concluding chapters, you'll discover how to develop SSIS packages designed to maintain a data warehouse using the data flow and other control flow tasks. By the end of this Learning Path, you'll be equipped with the skills you need to design efficient, high-performance database applications with confidence. This Learning Path includes content from the following Packt books: SQL Server 2017 Developer's Guide by Miloš Radivojević, Dejan Sarka, et al., and SQL Server 2017 Integration Services Cookbook by Christian Cote, Dejan Sarka, et al.

Leveraging an HDInsight big data cluster

So far, we've managed blob data using SSIS. In this case, the data was at rest and SSIS was used to manipulate it; in Azure parlance, SSIS was the orchestration service. As stated in the introduction, SSIS can only be used on-premises and, so far, on a single machine.

The goal of this recipe is to use Azure HDInsight computation services. These services allow us to rent powerful resources as a cluster of machines. These machines can run Linux or Windows, according to user choice, but be aware that Windows will be deprecated for the newest version of HDInsight. Such clusters, however fast and powerful, are very expensive to use. This is to be expected; we're talking about a potentially large amount of hardware here.

For this reason, unless we want to have these computing resources running continuously...