Book Image

Pentaho Data Integration Quick Start Guide

By : María Carina Roldán
Book Image

Pentaho Data Integration Quick Start Guide

By: María Carina Roldán

Overview of this book

Pentaho Data Integration(PDI) is an intuitive and graphical environment packed with drag and drop design and powerful Extract-Transform-Load (ETL) capabilities. Given its power and flexibility, initial attempts to use the Pentaho Data Integration tool can be difficult or confusing. This book is the ideal solution. This book reduces your learning curve with PDI. It provides the guidance needed to make you productive, covering the main features of Pentaho Data Integration. It demonstrates the interactive features of the graphical designer, and takes you through the main ETL capabilities that the tool offers. By the end of the book, you will be able to use PDI for extracting, transforming, and loading the types of data you encounter on a daily basis.
Table of Contents (15 chapters)

Transforming data in different ways


So far, we have seen how to create a PDI dataset mainly using data coming from files or databases. Once you have the data, there are many things you can do with it depending on your particular needs. One very common requirement is to create new fields where the values are based on the values of existent fields.

The set of operations covered in this section is not a full list of the available options, but includes the most common ones, and will inspire you when you come to implement others. 

Note

The files that we will use in this section were built with data downloaded from www.numbeo.com, a site containing information about living conditions in cities and countries worldwide.

Note

For learning the topics in this chapter, you are free to create your own data. However, if you want to reproduce the exercises exactly as they are explained, you will need the afore mentioned files from www.numbeo.com.

Note

Before continuing, make sure you download the set of data...