Book Image

Microsoft SQL Server 2012 Integration Services: An Expert Cookbook

Book Image

Microsoft SQL Server 2012 Integration Services: An Expert Cookbook

Overview of this book

SQL Server Integration Services (SSIS) is a leading tool in the data warehouse industry - used for performing extraction, transformation, and load operations. This book is aligned with the most common methodology associated with SSIS known as Extract Transform and Load (ETL); ETL is responsible for the extraction of data from several sources, their cleansing, customization, and loading into a central repository normally called Data Warehouse or Data Mart.Microsoft SQL Server 2012 Integration Services: An Expert Cookbook covers all the aspects of SSIS 2012 with lots of real-world scenarios to help readers understand usages of SSIS in every environment. Written by two SQL Server MVPs who have in-depth knowledge of SSIS having worked with it for many years.This book starts by creating simple data transfer packages with wizards and illustrates how to create more complex data transfer packages, troubleshoot packages, make robust SSIS packages, and how to boost the performance of data consolidation with SSIS. It then covers data flow transformations and advanced transformations for data cleansing, fuzzy and term extraction in detail. The book then dives deep into making a dynamic package with the help of expressions and variables, and performance tuning and consideration.
Table of Contents (23 chapters)
Microsoft SQL Server 2012 Integration Services: An Expert Cookbook
About the Authors
About the Reviewers

Getting started with SSDT

This recipe is an overview of SQL Server Data Tools (SSDT), where a user will spend most of his/her time while developing and maintaining SSIS projects.

This version is based on Visual Studio 2010, and the whole structure that supports the process of developing such projects has been significantly improved. Working with SSDT is not only easier for advanced users who require more flexibility, but also for beginners who can enjoy some new and interesting user interfaces to help them take their first steps with SSDT. Previous versions of SSIS used Business Intelligence Development Studio (BIDS) as their development environment.

How to do it...

Open SQL Server Data Tools (SSDT) through the shortcut placed under Microsoft SQL Server 2012 or Open Microsoft Visual Studio 2010 under the Microsoft Visual Studio 2010 Start menu folders.

Once SSDT is open, a start page will be seen by default. The Start Page window contains useful information about the SSDT environment such as recently opened projects, links to create or open an existing project, and is also a useful area with several resources and the latest news to help stay up to date about several Microsoft platforms such as Windows, Web, Cloud, and so on.

Now that SSDT is already opened, let's create a new SSIS project from the Start Page window in order to understand the basic steps as well as the remaining windows placed in the SSIS project example.

  1. Click on New Project… and a Windows dialog will appear.

  2. Under Installed Templates, expand Business Intelligence and click on Integration Services. In the center pane, select Integration Services Project.

  3. Name the project as R02_Getting Started with SSDT. Name the solution as Ch01_Getting Start with SQL Server Integration Services in C:\SSIS and click on OK. An empty SSIS project will be created using the Project Deployment Model approach (default) with an empty package included.

  4. In the Solution Explorer pane , right-click on the SSIS Package folder, and choose Add Existing Package.

  5. In the Add Copy of Existing Package dialog box, set Package location to File System and choose the package path from the file that you saved in the previous recipe from this address: C:\SSIS\Ch01\Ch01R01_ImportExportWizard.dtsx.

  6. The new package will be added under the SSIS Packages folder, double-click on the package name in Solution Explorer to open it in Package Designer.

  7. Double-click on Preparation SQL Task 1 and the Execute SQL Task Editor dialog will open. Verify the SQL Statement property with a click on the ellipsis button in front of SQL Statement, and then close the editor.

  8. Double-click on Data Flow Task 1 , and you will be redirected to the Data Flow tab, there are three source or destination combinations in Data Flow.

  9. Double-click on the Source-Department component and the OLE DB Source Editor will open, verify the table name there.

  10. Double-click on Destination-Department, and in the OLE DB Destination Editor , verify the connection and table name.

The next recipe will explain the process of creating a new SSIS Package in more detail, and for that reason this recipe will focus on how we could get more value from SSDT to make the development and maintenance easier and faster.

How it works...

Now that the SSDT is open with an empty package, let's describe some of the windows that you should be familiar with, as shown in the next screenshot:

By default, SSDT creates a new and empty SSIS Package named package.dtsx. A package is a collection of SSIS objects including connection managers, tasks and components.

  • Package design area ( 1 )

    Control Flow is the most important tab; it's where a developer "explains" to SSIS what the package will do. The remaining tabs such as Data Flow (see recipe), Parameters (see Chapter 11, Event Handling and Logging), Event Handlers (see Chapter 10, Debugging, Troubleshooting, and Migrating Packages to 2012), the Package Explorer and Progress bar (available just at runtime) are also important and will be described in later recipes.

  • Solution Ex lorer ( 2 )

    The Solution Explorer section contains projects and their files.

    Each project consists of Project Parameters, Connection Managers, SSIS Packages, and the Miscellaneous folder.

    Project Parameters are parameters which are public for all packages in the project. We will discuss parameters in later chapters.

    The Connection Managers folder in the Solution Explorer consists of shared connection managers which are shared between all packages in a project.

    All SSIS Packages will be listed under the SSIS Packages folder.

    The Miscellaneous folder can consist of any other files that are relevant to projects and packages, files such as documentation files, screenshots, and so on.

  • Properties panel ( 3 )

    In this panel, it's possible to read and edit the properties of each selected object in the Design area or Solution Explorer.

  • SSIS Toolbox ( 4 )

    In the SSIS Package, there are tasks and components which will be available depending on the tab selected in the Package design area. When the tab selected is Control Flow, the SSIS Toolbox will be grouped into four areas. The groups of tasks are the Favorites, Common, Containers, and Other Tasks. With these tasks, it's possible to control and inform SSIS about what should be done during execution. An interesting tip is that you can add tasks to the Favorites area anytime you like by right-clicking on each task and selecting Move to Favorites.

    Note that the SSIS Toolbox is completely different on the Data Flow tab; we will talk about it in later recipes.

  • Connection Managers ( 5 )

    Connection Managers are connections from the SSIS Package's components to source or destination data providers. There are different types of connection managers, some of them which are much in use are OLE DB Connection manager, Flat File Connection Manager and so on.

    Each connection manager can be used in one or more components in the SSIS Package to work with underlying data provider. Some data providers require the installation of special drivers to have connections to their data source.

    Each connection manager which is relevant to the current package will be listed in the Connection Manager's pane. Some connections are bold, these are referenced from a shared project's connection manager.

    We will discuss more about connections in the next recipes.

  • Variables Pane ( 6 )

    Each task in SSIS Package can send information to other tasks and it is possible by resorting to Variables. Package variables, their data types, their scope, and other properties exist in this pane, which will be described in greater detail in later chapters.