Book Image

SAP Data Services 4.x Cookbook

Book Image

SAP Data Services 4.x Cookbook

Overview of this book

Want to cost effectively deliver trusted information to all of your crucial business functions? SAP Data Services delivers one enterprise-class solution for data integration, data quality, data profiling, and text data processing. It boosts productivity with a single solution for data quality and data integration. SAP Data Services also enables you to move, improve, govern, and unlock big data. This book will lead you through the SAP Data Services environment to efficiently develop ETL processes. To begin with, you’ll learn to install, configure, and prepare the ETL development environment. You will get familiarized with the concepts of developing ETL processes with SAP Data Services. Starting from smallest unit of work- the data flow, the chapters will lead you to the highest organizational unit—the Data Services job, revealing the advanced techniques of ETL design. You will learn to import XML files by creating and implementing real-time jobs. It will then guide you through the ETL development patterns that enable the most effective performance when extracting, transforming, and loading data. You will also find out how to create validation functions and transforms. Finally, the book will show you the benefits of data quality management with the help of another SAP solution—Information Steward.
Table of Contents (19 chapters)
SAP Data Services 4.x Cookbook
Credits
About the Author
About the Reviewers
www.PacktPub.com
Preface
Index

Creating a source system database


In this section, we will create our source database, which will play the role of an operational database that we will pull data from with the help of Data Services in order to transform the data and deliver it to a data warehouse.

How to do it…

Luckily for us, there are plenty of different flavors of ready-to-use databases on the Web nowadays. Let's pick one of the most popular ones: Adventure Works OLTP for SQL Server 2012, which is available for download on the CodePlex website. Perform the following steps:

  1. Use the following link to see the list of the files available for download:

    https://msftdbprodsamples.codeplex.com/releases/view/55330

  2. Click on the AdventureWorks2012 Data File link, which should download the AdventureWorks2012_Data.mdf data file.

  3. When the download is complete, copy the file into the C:\AdventureWorks\ directory (create it before copying if necessary).

The next step is to map this database file to our database engine, which will create our source database. To do this, perform the following steps:

  1. Start SQL Server Management Studio.

  2. Click on the New Query button, which will open a new session connection to a master database.

  3. In the SQL Query window, type the following command and press F5 to execute it:

    CREATE DATABASE AdventureWorks_OLTP ON 
    (FILENAME = 'C:\AdventureWorks\AdventureWorks2012_Data.mdf') 
    FOR ATTACH_REBUILD_LOG;
  4. After a successful command execution and upon refreshing the database list (using F5), you should be able to see the AdventureWorks_OLTP database in the list of the available databases in the Object Explorer window of SQL Server Management Studio.

    Tip

    Downloading the example code

    You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

How it works…

In a typical scenario, every SQL Server database consists of two data files: a database file and a transaction log file. A database file contains actual data structures and data, while a transaction log file keeps the transactional changes applied to the data.

As we only downloaded the data file, we had to execute the CREATE DATABASE command with a special ATTACH_REBUILD_LOG clause, which automatically creates a missing transaction log file so that the database could be successfully deployed and opened.

Now, our source database is ready to be used by Data Services in order to access, browse, and extract data from it.

There's more…

There are different ways to deploy test databases. This mainly depends on which RDBMS system you use. Sometimes, you may find a package of SQL scripts that contains the commands required to create all the database structures and commands used to insert data into these structures. This option may be useful if you have problems with attaching the downloaded mdf data file to your database engine or, for example, if you find the SQL scripts created for SQL Server RDBMS but have to apply them to the Oracle DB. With slight modifications to the command, you can run them in order to create an Oracle database.

Explaining RDBMS technologies lies beyond the scope of this book. So, if you are looking for more information regarding how a specific RDBMS system works, refer to the official documentation.

What has to be said here is that from the perspective of using Data Services, it does not matter which source system or target systems you use. Data Services not only supports the majority of them, but it also creates its own representation of the source and target objects; this way, they all look the same to Data Services users and abide by the same rules within the Data Services environment. So, you really do not have to be a DBA or database developer to easily connect to any RDBMS from Data Services. All that is required is a knowledge of the SQL language to understand the principle of methods that Data Services uses when extracting and loading data or creating database objects for you.