Pentaho Data Integration Quick Start Guide

By: María Carina Roldán

Overview of this book

Pentaho Data Integration (PDI) is an intuitive graphical environment that combines drag-and-drop design with powerful Extract-Transform-Load (ETL) capabilities. Given its power and flexibility, first attempts to use the tool can be difficult or confusing. This book shortens the learning curve: it covers the main features of PDI, demonstrates the interactive features of the graphical designer, and takes you through the main ETL capabilities that the tool offers. By the end of the book, you will be able to use PDI to extract, transform, and load the types of data you encounter on a daily basis.
Table of Contents (15 chapters)

Installing PDI

The following instructions install the PDI Community Edition (CE), whichever operating system you are using:

  • Make sure that you have JRE 8.0 installed.


If you don't have JRE 8.0 installed, download it from your Java vendor's website and install it before proceeding. Make sure that the JAVA_HOME system variable is set.
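The Java prerequisite can be checked with a short script before moving on. The following is a minimal sketch, not part of PDI: the function name check_java_environment is hypothetical, and the script only confirms that JAVA_HOME is set to an existing folder and that a java executable is reachable on PATH. It does not verify that the installed version is 8.0.

```python
import os
import shutil
from pathlib import Path


def check_java_environment(env=None):
    """Return a list of problems found; an empty list means both checks passed."""
    env = os.environ if env is None else env
    problems = []

    # JAVA_HOME must be set and must point to a folder that exists.
    java_home = env.get("JAVA_HOME")
    if not java_home:
        problems.append("JAVA_HOME is not set")
    elif not Path(java_home).is_dir():
        problems.append(f"JAVA_HOME points to a missing folder: {java_home}")

    # shutil.which searches the given PATH string for an executable named "java".
    if shutil.which("java", path=env.get("PATH", "")) is None:
        problems.append("no 'java' executable found on PATH")

    return problems


if __name__ == "__main__":
    issues = check_java_environment()
    for issue in issues:
        print("Problem:", issue)
    if not issues:
        print("JAVA_HOME is set and java is on PATH")
```

Running the script with no arguments inspects the current environment. Passing a dictionary to the function makes it easy to try the checks against other settings.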

PDI on

  • Download the available ZIP file, which works on all platforms.
  • Unzip the downloaded file into a folder of your choice (for example, c:/software/pdi or /home/pdi_user/pdi).
  • Browse your disk and look for the PDI folder that was just created. You will see a folder named data-integration, with several subfolders (lib, plugins, samples, and more) and a number of scripts (spoon.bat, pan.bat, and others), which we will soon learn how to use.
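The unzip-and-inspect steps above can also be scripted. The sketch below is illustrative only: the helper names unzip_pdi and missing_entries are hypothetical, and it assumes the archive unpacks to a data-integration folder directly under the chosen target folder. It checks only for the subfolders and scripts named in the text, not the full contents of a real PDI release.

```python
import zipfile
from pathlib import Path

# Names taken from the steps above; a real PDI archive contains more than these.
EXPECTED_SUBFOLDERS = ("lib", "plugins", "samples")
EXPECTED_SCRIPTS = ("spoon.bat", "pan.bat")


def unzip_pdi(zip_path, target_dir):
    """Extract the archive into target_dir and return the data-integration path."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    # Assumption: the archive's top-level folder is named data-integration.
    return target / "data-integration"


def missing_entries(pdi_home):
    """List the expected subfolders and scripts that are absent from pdi_home."""
    home = Path(pdi_home)
    missing = [name for name in EXPECTED_SUBFOLDERS if not (home / name).is_dir()]
    missing += [name for name in EXPECTED_SCRIPTS if not (home / name).is_file()]
    return missing
```

For example, calling unzip_pdi with the path of the downloaded ZIP file and /home/pdi_user/pdi as the target, then passing the returned path to missing_entries, should give an empty list if the folder looks as described.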