Pentaho Data Integration Quick Start Guide

By: María Carina Roldán

Overview of this book

Pentaho Data Integration (PDI) is an intuitive, graphical environment with drag-and-drop design and powerful Extract-Transform-Load (ETL) capabilities. Given its power and flexibility, initial attempts to use the tool can be difficult or confusing. This book shortens that learning curve: it provides the guidance you need to become productive, covers the main features of PDI, demonstrates the interactive features of the graphical designer, and takes you through the main ETL capabilities that the tool offers. By the end of the book, you will be able to use PDI to extract, transform, and load the types of data you encounter on a daily basis.

Inserting and updating data in database tables


The dataset that you create in PDI can be inserted into tables or used to update existing data. In this section, you will learn how to perform these two kinds of operations. For the tutorials, we will use the Sports database that we used in the previous chapters.

Inserting data

To insert new data into a table in a relational database, PDI offers a couple of steps, of which the Table output step is the simplest. Here, we will read a file with information about new injuries and use this step to insert it into the injuries_phases table. The file is available in the bundle material for this chapter, and it looks as follows:

person_id;injury_type;injury_side;injury_date
812;elbow;left;2018-05-19
813;shoulder;left;2018-05-15
119;calf;both;2018-05-20
370;wrist;;2018-05-08
241;other-excused;;2018-05-26
790;shoulder;;2018-06-30
941;knee;right;2018-07-01
151;knee;right;2018-07-11
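
To make the goal of the load concrete, here is a minimal Python sketch, written outside PDI, of what inserting these rows amounts to. It is an illustration under stated assumptions, not the way PDI works internally: an in-memory SQLite database stands in for the Sports database, the injuries_phases table is assumed to have exactly the four columns named in the file header, and empty injury_side values are assumed to be stored as NULL. The function name load_injuries is hypothetical.

```python
import csv
import io
import sqlite3

# Sample rows copied from the file shown above.
SAMPLE = """person_id;injury_type;injury_side;injury_date
812;elbow;left;2018-05-19
813;shoulder;left;2018-05-15
119;calf;both;2018-05-20
370;wrist;;2018-05-08
241;other-excused;;2018-05-26
790;shoulder;;2018-06-30
941;knee;right;2018-07-01
151;knee;right;2018-07-11
"""


def load_injuries(conn, text):
    """Insert every row of the semicolon-delimited text; return the row count."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    rows = [
        (
            int(r["person_id"]),
            r["injury_type"],
            r["injury_side"] or None,  # assumption: empty side is stored as NULL
            r["injury_date"],
        )
        for r in reader
    ]
    conn.executemany(
        "INSERT INTO injuries_phases "
        "(person_id, injury_type, injury_side, injury_date) "
        "VALUES (?, ?, ?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)


# Assumed schema: the real injuries_phases table may have other columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE injuries_phases ("
    "person_id INTEGER, injury_type TEXT, injury_side TEXT, injury_date TEXT)"
)
inserted = load_injuries(conn, SAMPLE)
print(inserted)
```

Each line of the file becomes one INSERT against the target table, with the file's fields mapped to table columns by name. In PDI, the same result is obtained without writing code.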

The instructions for inserting the file are as follows...