SQL Server 2017 Integration Services Cookbook

By: Christian Cote, Dejan Sarka, David Peter Hansen, Matija Lah, Samuel Lester, Christo Olivier

Overview of this book

SQL Server Integration Services (SSIS) is a tool that facilitates data extraction, consolidation, and loading (ETL), SQL Server coding enhancements, data warehousing, and customizations. With the help of the recipes in this book, you’ll gain complete hands-on experience of SSIS 2017 as well as the new 2016 features, and of design and development improvements including SCD, tuning, and customizations. At the start, you’ll learn to install and set up SSIS as well as other SQL Server resources to make optimal use of this business intelligence tool. We’ll begin by taking you through the new features in SSIS 2016/2017 and implementing the features necessary for a modern, scalable ETL solution that fits the modern data warehouse. Through the course of the chapters, you will learn how to design and build SSIS data warehouse packages using SQL Server Data Tools. Additionally, you’ll learn to develop SSIS packages designed to maintain a data warehouse using the data flow and other control flow tasks. You’ll also work through many recipes on cleansing data and see the end result after applying different transformations. Real-world scenarios are also covered, along with how to handle various issues that you might face when designing your packages. By the end of this book, you’ll know the key concepts needed to perform data integration and transformation. You’ll have explored on-premises big data integration processes to create a classic data warehouse, and will know how to extend the toolbox with custom tasks and transforms.

Preface

SQL Server Integration Services (SSIS) is a tool that facilitates data extraction, consolidation, and loading (ETL), SQL Server coding enhancements, data warehousing, and customizations. With the help of the recipes in this book, you'll gain hands-on experience of SSIS 2017 as well as the new 2016 features, and of design and development improvements including SCD, tuning, and customizations. At the start, you'll learn to install and set up SSIS as well as other SQL Server resources to make optimal use of this business intelligence tool. We'll begin by taking you through the new features in SSIS 2016/2017 and implementing the features necessary for a modern, scalable ETL solution that fits the modern data warehouse. Through the course of the book, you will learn how to design and build SSIS data warehouse packages using SQL Server Data Tools. Additionally, you'll learn how to develop SSIS packages designed to maintain a data warehouse using the data flow and other control flow tasks. You'll also go through many recipes on cleansing data and see the end result after applying different transformations. Real-world scenarios are also covered, along with how to handle various issues that you might face when designing your packages. By the end of this book, you'll know the key concepts needed to perform data integration and transformation. You'll have explored on-premises big data integration processes to create a classic data warehouse, and will know how to extend the toolbox with custom tasks and transforms.

What this book covers

Chapter 1, SSIS Setup, contains recipes describing the step-by-step setup of SQL Server 2016 to get the features that are used in the book.

Chapter 2, What Is New in SSIS 2016, contains recipes that cover the evolution of SSIS over time and what's new in SSIS 2016. This chapter is a detailed overview of the new features in Integration Services 2016.

Chapter 3, Key Components of a Modern ETL Solution, explains how ETL has evolved over the past few years and which components are necessary to build a modern, scalable ETL solution that fits the modern data warehouse. This chapter also describes what each catalog view provides and shows how you can use some of them to archive SSIS execution statistics.

Chapter 4, Data Warehouse Loading Techniques, describes many patterns used when loading a data warehouse or ODS. You will learn how to effectively load a data warehouse, process a tabular model, and maintain data partitions and modern data refresh rates.

Chapter 5, Dealing with Data Quality, focuses on how SSIS can be leveraged to validate and load data. You will learn how to identify invalid data, cleanse data, and load valid data into the data warehouse.

Chapter 6, SSIS Performance and Scalability, covers how to monitor SSIS package execution. It also provides solutions for scaling out processes by using parallelism. You will learn how to identify bottlenecks and how to resolve them using various techniques.

Chapter 7, Unleash the Power of SSIS Script Task and Component, covers how to use scripting with SSIS. You will learn how script tasks and script components help in many situations to overcome the limitations of stock toolbox tasks and transforms.

Chapter 8, SSIS and Advanced Analytics, covers how SSIS can be used to prepare the data you need for further analysis. Here, you will learn how you can make use of SQL Server Analysis Services (SSAS) and R models in the SSIS data flow.

Chapter 9, On-Premises and Azure Big Data Integration, describes the Azure feature pack that allows SSIS to integrate Azure data from blob storage and HDInsight clusters. You will learn how to use Azure feature pack components to add flexibility to your SSIS solution architecture and how on-premises big data can be manipulated via SSIS.

Chapter 10, Extending SSIS Tasks and Transformations, covers extending and customizing the toolbox using custom-developed tasks and transforms, as well as security features. You will learn the pros and cons of creating custom tasks to extend the SSIS toolbox and how to secure your deployment.

Chapter 11, Scale Out with SSIS 2017, covers scaling out SSIS package executions on multiple servers. You will learn how SSIS 2017 can scale out to multiple workers to enhance execution scalability.

What you need for this book

This book was written using SQL Server 2016, and all the examples and functions should work with it. Other tools you may need are Visual Studio 2015, SQL Server Data Tools 16 or higher, and SQL Server Management Studio 17 or later.

In addition, you will need the Hortonworks Sandbox running on Docker for Windows and a Microsoft Azure account.

The last chapter of this book has been written using SQL Server 2017.

Who this book is for

This book is ideal for software engineers, DW/ETL architects, and ETL developers who need to create a new, or enhance an existing, ETL implementation with SQL Server 2017 Integration Services. This book would also be good for individuals who develop ETL solutions that use SSIS and are keen to learn the new features and capabilities in SSIS 2017.

Sections

In this book, you will find several headings that appear frequently (Getting ready, How to do it, How it works, There's more, and See also). To give clear instructions on how to complete a recipe, we use these sections as follows:

Getting ready

This section tells you what to expect in the recipe, and describes how to set up any software or any preliminary settings required for the recipe.

How to do it...

This section contains the steps required to follow the recipe.

How it works...

This section usually consists of a detailed explanation of what happened in the previous section.

There's more...

This section consists of additional information that makes the reader more knowledgeable about the recipe.

See also

This section provides helpful links to other useful information for the recipe.

Conventions

In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.

Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "The last characters CI and AS are for case insensitive and accent sensitive, respectively."

A block of code is set as follows:

USE DQS_STAGING_DATA; 
SELECT CustomerKey, FullName, StreetAddress, City, StateProvince, CountryRegion, EmailAddress, BirthDate, Occupation;

New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "Click on the Sign in visible at the right (top) to log into Visual Studio Dev Essentials."

Note

Warnings or important notes appear in a box like this.

Tip

Tips and tricks appear like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book: what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of. To send us general feedback, simply e-mail [email protected], and mention the book's title in the subject of your message. If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you. You can download the code files by following these steps:

  1. Log in or register on our website using your e-mail address and password.
  2. Hover the mouse pointer over the SUPPORT tab at the top.
  3. Click on Code Downloads & Errata.
  4. Enter the name of the book in the Search box.
  5. Select the book for which you're looking to download the code files.
  6. Choose from the drop-down menu where you purchased this book.
  7. Click on Code Download.

You can also download the code files by clicking on the Code Files button on the book's webpage at the Packt Publishing website. This page can be accessed by entering the book's name in the Search box. Please note that you need to be logged in to your Packt account. Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

  • WinRAR / 7-Zip for Windows
  • Zipeg / iZip / UnRarX for Mac
  • 7-Zip / PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/SQL-Server-2017-Integration-Services-Cookbook. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Downloading the color images of this book

We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from https://www.packtpub.com/sites/default/files/downloads/SQLServer2017IntegrationServicesCookbook_ColorImages.pdf.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books (maybe a mistake in the text or the code), we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title. To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy. Please contact us at [email protected] with a link to the suspected pirated material. We appreciate your help in protecting our authors and our ability to bring you valuable content.

Questions

If you have a problem with any aspect of this book, you can contact us at [email protected], and we will do our best to address the problem.