Sign In Start Free Trial
Account

Add to playlist

Create a Playlist

Modal Close icon
You need to login to use this feature.
  • Book Overview & Buying SAP Data Services 4.x Cookbook
  • Table Of Contents Toc
SAP Data Services 4.x Cookbook

SAP Data Services 4.x Cookbook

By : Shomnikov
4 (1)
close
close
SAP Data Services 4.x Cookbook

SAP Data Services 4.x Cookbook

4 (1)
By: Shomnikov

Overview of this book

Want to cost effectively deliver trusted information to all of your crucial business functions? SAP Data Services delivers one enterprise-class solution for data integration, data quality, data profiling, and text data processing. It boosts productivity with a single solution for data quality and data integration. SAP Data Services also enables you to move, improve, govern, and unlock big data. This book will lead you through the SAP Data Services environment to efficiently develop ETL processes. To begin with, you’ll learn to install, configure, and prepare the ETL development environment. You will get familiarized with the concepts of developing ETL processes with SAP Data Services. Starting from smallest unit of work- the data flow, the chapters will lead you to the highest organizational unit—the Data Services job, revealing the advanced techniques of ETL design. You will learn to import XML files by creating and implementing real-time jobs. It will then guide you through the ETL development patterns that enable the most effective performance when extracting, transforming, and loading data. You will also find out how to create validation functions and transforms. Finally, the book will show you the benefits of data quality management with the help of another SAP solution—Information Steward.
Table of Contents (14 chapters)
close
close
11
11. Working with SAP Applications
13
Index

Defining and creating staging area structures

In this recipe, we will talk about ETL data structures that will be used in this book. Staging structures are important storage areas where extracted data is kept before it gets transformed or stored between the transformation steps. The staging area in general can be used to create backup copies of data or to run analytical queries on the data in order to validate the transformations made or the extract processes. Staging data structures can be quite different, as you will see. Which one to use depends on the tasks you are trying to accomplish, your project requirements, and the architecture of the environment used.

How to do it…

The most popular data structures that could be used in the staging area are flat files and RDBMS tables.

Flat files

One of the perks of using Data Services against the handcoded ETL solution is that Data Services allows you to easily read from and write information to a flat file.

Create the C:\AW\ folder, which will be used throughout this book to store flat files.

Note

Inserting data into a flat file is faster than inserting data into an RDBMS table. So, during ETL development, flat files are often used to reach two goals simultaneously: creating a backup copy of the data snapshot and providing you with the storage location for your preliminary data before you apply the next set of transformation rules.

Another common use of flat files is the ability to exchange data between systems that cannot communicate with each other in any other way.

Lastly, it is very cost-effective to store flat files (OS disk storage space is cheaper than DB storage space).

The main disadvantage of the flat files storage method is that the modification of data in a flat file can sometimes be a real pain, not to mention that it is much slower than modifying data in a relational DB table.

RDBMS tables

These ETL data structures will be used more often than others to stage the data that is going through the ETL transformation process.

Let's create two separate databases for relational tables, which will play the role of the ETL staging area in our future examples:

  1. Open SQL Server Management Studio.
  2. Right-click on the Databases icon and select the New Database… option.
  3. On the next screen, input ODS as the database name, and specify 100 MB as the initial size value of the database file and 10 MB as that of the transactional log file:
    RDBMS tables
  4. Repeat the last two steps to create another dataset called STAGE.

How it works…

Let's recap. The ETL staging area is a location to store the preliminary results of our ETL transformations and also a landing zone for the extracts from the source system.

Yes, Data Services allows you to extract data and perform all transformations in the memory before loading to the target system. However, as you will see in later chapters, the ETL process, which does everything in one "go", can be complex and difficult to maintain. Plus, if something goes wrong along the way, all the changes that the process has already performed will be lost and you may have to start the extraction/transformation process again. This obviously creates extra workload on a source system because you have to query it again in order to get the data. Finally, big does not mean effective. We will show you how splitting your ETL process into smaller pieces helps you to create a well-performing sequence of dataflow.

The ODS database will be used as a landing zone for the data coming from source systems. The structure of the tables here will be identical to the structure of the source system tables.

The STAGE database will hold the relational tables used to store data between the data transformation steps.

We will also store some data extracted from a source database in a flat file format to demonstrate the ability of Data Services to work with them and show the convenience of this data storage method in the ETL system.

CONTINUE READING
83
Tech Concepts
36
Programming languages
73
Tech Tools
Icon Unlimited access to the largest independent learning library in tech of over 8,000 expert-authored tech books and videos.
Icon Innovative learning tools, including AI book assistants, code context explainers, and text-to-speech.
Icon 50+ new titles added per month and exclusive early access to books as they are being written.
SAP Data Services 4.x Cookbook
notes
bookmark Notes and Bookmarks search Search in title playlist Add to playlist download Download options font-size Font size

Change the font size

margin-width Margin width

Change margin width

day-mode Day/Sepia/Night Modes

Change background colour

Close icon Search
Country selected

Close icon Your notes and bookmarks

Confirmation

Modal Close icon
claim successful

Buy this book with your credits?

Modal Close icon
Are you sure you want to buy this book with one of your credits?
Close
YES, BUY

Submit Your Feedback

Modal Close icon
Modal Close icon
Modal Close icon