Book Image

SAP Data Services 4.x Cookbook

Book Image

SAP Data Services 4.x Cookbook

Overview of this book

Want to cost effectively deliver trusted information to all of your crucial business functions? SAP Data Services delivers one enterprise-class solution for data integration, data quality, data profiling, and text data processing. It boosts productivity with a single solution for data quality and data integration. SAP Data Services also enables you to move, improve, govern, and unlock big data. This book will lead you through the SAP Data Services environment to efficiently develop ETL processes. To begin with, you’ll learn to install, configure, and prepare the ETL development environment. You will get familiarized with the concepts of developing ETL processes with SAP Data Services. Starting from smallest unit of work- the data flow, the chapters will lead you to the highest organizational unit—the Data Services job, revealing the advanced techniques of ETL design. You will learn to import XML files by creating and implementing real-time jobs. It will then guide you through the ETL development patterns that enable the most effective performance when extracting, transforming, and loading data. You will also find out how to create validation functions and transforms. Finally, the book will show you the benefits of data quality management with the help of another SAP solution—Information Steward.
Table of Contents (19 chapters)
SAP Data Services 4.x Cookbook
About the Author
About the Reviewers

Preparing a database environment

This recipe will lead you through the further steps of preparing the working environment, such as preparing a database environment to be utilized by ETL processes as a source and staging and targeting systems for the migrated and transformed data.

Getting ready

To start the ETL development, we need to think about three things: the system that we will source the data from, our staging area (for initial extracts and as a preliminary storage for data during subsequent transformation steps), and finally, the data warehouse itself, to which the data will be eventually delivered.

How to do it…

Throughout the book, we will use a 64-bit environment, so ensure that you download and install the 64-bit versions of software components. Perform the following steps:

  1. Let's start by preparing our source system. For quick deployment, we will choose the Microsoft SQL Server 2012 Express database, which is available for download at

  2. Click on the Download button and select the SQLEXPRWT_x64_ENU.exe file in the list of files that are available for download. This package contains everything required for the installation and configuration of the database server: the SQL Server Express database engine and the SQL Server Management Studio tool.

  3. After the download is complete, run the executable file and follow the instructions on the screen. The installation of SQL Server 2012 Express is extremely straightforward, and all options can be set to their default values. There is no need to create any default databases during or after the installation as we will do it a bit later.

How it works…

After you have completed the installation, you should be able to run the SQL Server Management Studio application and connect to your database engine using the settings provided during the installation process.

If you have done everything correctly, you should see the "green" state of your Database Engine connection in the Object Explorer window of SQL Server Management Studio, as shown in the following screenshot:

We need an "empty" installation of MS SQL Server 2012 Express because we will create all the databases we need manually in the next steps of this chapter. This database engine installation will host all our source, stage, and target relational data structures. This option allows us to easily build a test environment that is perfect for learning purposes in order to become familiar with ETL development using SAP Data Services.

In a real-life scenario, your source databases, staging area database, and DWH database/appliance will most likely reside on separate server hosts, and they may sometimes be from different vendors. So, the role of SAP Data Services is to link them together in order to migrate data from one system to another.