Pentaho Data Integration Quick Start Guide

By: María Carina Roldán

Overview of this book

Pentaho Data Integration (PDI) is an intuitive graphical environment with drag-and-drop design and powerful Extract-Transform-Load (ETL) capabilities. Given its power and flexibility, initial attempts to use the tool can be difficult or confusing. This book shortens the learning curve: it provides the guidance you need to become productive, covering the main features of PDI. It demonstrates the interactive features of the graphical designer and takes you through the main ETL capabilities that the tool offers. By the end of the book, you will be able to use PDI for extracting, transforming, and loading the types of data you encounter on a daily basis.

Combining the execution of jobs and transformations


In real projects, you don't run isolated transformations or jobs. Instead, you combine their execution to create a flow of tasks. In particular, you can run jobs or transformations from a job, and you can also iterate the execution of transformations and jobs by simulating a loop. In this section, you will learn how to implement some of these combinations. The sample jobs and transformations relate to the datamart introduced in Chapter 5, Loading Data.

Executing transformations from a job

To demonstrate how to execute a transformation from a job, we will create a job with the following purpose: it will find the maximum date in the Injuries fact table, and then load the fact table, using that date to filter the rows to insert.
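Outside of PDI, the logic of this job can be sketched in a few lines of Python. This is only an illustration of the idea (read the latest loaded date, then insert only newer rows), not how PDI implements it. The table names ft_injuries and stg_injuries, the column names, and the use of SQLite are all assumptions made for the example; they are not the book's schema.

```python
import sqlite3


def max_loaded_date(conn):
    """Return the latest date already in the fact table, or None if it is empty."""
    # ft_injuries and injury_date are hypothetical names used for illustration.
    row = conn.execute("SELECT MAX(injury_date) FROM ft_injuries").fetchone()
    return row[0]


def load_new_rows(conn, since):
    """Insert staging rows dated after `since`; load everything if `since` is None."""
    sql = (
        "INSERT INTO ft_injuries (injury_date, injured) "
        "SELECT injury_date, injured FROM stg_injuries"
    )
    params = ()
    if since is not None:
        # ISO-formatted dates (YYYY-MM-DD) compare correctly as text.
        sql += " WHERE injury_date > ?"
        params = (since,)
    cur = conn.execute(sql, params)
    conn.commit()
    return cur.rowcount


def build_sample_db():
    """Create an in-memory database with a small fact table and staging table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ft_injuries (injury_date TEXT, injured INTEGER)")
    conn.execute("CREATE TABLE stg_injuries (injury_date TEXT, injured INTEGER)")
    conn.executemany(
        "INSERT INTO ft_injuries VALUES (?, ?)",
        [("2018-01-10", 3), ("2018-01-31", 1)],
    )
    conn.executemany(
        "INSERT INTO stg_injuries VALUES (?, ?)",
        [("2018-01-15", 2), ("2018-02-01", 4), ("2018-02-10", 5)],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = build_sample_db()
    since = max_loaded_date(conn)        # step 1: find the maximum date
    inserted = load_new_rows(conn, since)  # step 2: load only newer rows
    print(f"Loaded {inserted} rows dated after {since}")
```

The two functions mirror the two tasks the job chains together: the first produces a value, and the second consumes it as a filter. In PDI, each task would typically be its own transformation, with the job controlling the order of execution.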

Note

Before continuing, make sure that you delete the data inserted in the previous chapter from the Injuries fact table. This will allow you to follow the next exercises exactly as they...