Instant Pentaho Data Integration Kitchen

By: Sergio Ramazzina

Overview of this book

Pentaho Data Integration (PDI) is a modern, powerful, and easy-to-use ETL system that lets you develop ETL processes with simplicity. This book gives you the experience and skills you need to run processes from the command line or schedule them, through extensive descriptions and a good set of samples. Instant Pentaho Data Integration Kitchen How-to will help you understand the correct way to work with the PDI command-line tools. We start with a recap of how transformations and jobs are designed using Spoon, then configure memory requirements so that your processes run properly from the command line, and then move on to a set of recipes that show the different ways to start PDI processes. We dive into the flags that control the logging system, specifying the logging output and the log verbosity. Throughout, we focus on the knowledge you need to run ETL processes with the command-line tools easily and proficiently.

Managing PDI processes return code (Simple)


This recipe covers an important aspect of running jobs: getting the return code that results from their execution. It gives some advice on how to obtain the code so that you can use it to determine whether everything went as expected. The recipe works the same way for both Kitchen and Pan; the only difference is the name of the script file used to start the process.

Getting ready

To get ready for this recipe, you need to check that the JAVA_HOME environment variable is set properly and then configure your environment variables so that the Kitchen script can start from anywhere without specifying the complete path to your PDI home directory. For details about these checks, refer to the recipe Executing PDI jobs from a filesystem (Simple).
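
The two checks described above can be sketched as a small shell function. This is only an illustration under stated assumptions: `check_pdi_env` is a hypothetical helper name, not something shipped with PDI, and it assumes the Kitchen script is reachable under the name `kitchen.sh`.

```shell
# check_pdi_env is a hypothetical helper (not part of PDI).
# It returns 0 when both prerequisites of this recipe appear to be met.
check_pdi_env() {
    # JAVA_HOME must be set and must point to an existing directory.
    if [ -z "${JAVA_HOME:-}" ] || [ ! -d "$JAVA_HOME" ]; then
        echo "JAVA_HOME is not set or is not a directory" >&2
        return 1
    fi
    # kitchen.sh must be resolvable through PATH, so that it can be
    # started from anywhere without the full path to the PDI home.
    if ! command -v kitchen.sh >/dev/null 2>&1; then
        echo "kitchen.sh was not found on PATH" >&2
        return 2
    fi
    return 0
}
```

Calling `check_pdi_env` at the top of a launcher script lets the script stop early with a readable message instead of failing later inside Kitchen.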

How to do it...

  1. Every time you start a PDI process, whether a job or a transformation, the script that started it receives a return code from PDI that indicates whether the process terminated successfully.

  2. In case the process terminated unsuccessfully, the code gives you an overall indication of what happened.

  3. The Pentaho wiki and several blogs (for example, an interesting article on Diethard Steiner's blog at http://diethardsteiner.blogspot.it/2013/03/pentaho-kettle-pdi-get-pan-and-kitchen.html) provide a summary table that describes these return codes and their meanings. The table is reproduced for reference in the There's more… section.

  4. To display this code on Linux/Mac, edit the kitchen.sh script and add the following command at the end of the script. This command prints the exit code of the last called process:

    echo $?

  5. To display this code on a Windows platform, edit the Kitchen.bat script and add the following command at the end of the script. This command prints the exit code of the last called process:

    echo %ERRORLEVEL%

  6. You can do the same with the Pan scripts. As soon as the script terminates, it displays the exit code. You can try it out by adding the command to your scripts and then calling one of your sample jobs or transformations. Having this code is very useful because when you call the Kitchen or Pan scripts from inside another script, the caller can take action if something goes wrong. This means that we can design an error handling strategy.
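
As a minimal sketch of such an error handling strategy, a calling script can capture the return code itself and branch on it. The function name `run_pdi` is hypothetical and only serves this illustration; it runs whatever command line it is given, so it works for both Kitchen and Pan invocations.

```shell
# run_pdi is a hypothetical wrapper (not part of PDI).
# It runs the command line passed as arguments and reports the outcome.
run_pdi() {
    "$@"
    pdi_rc=$?    # capture the exit code immediately, before any other command
    if [ "$pdi_rc" -eq 0 ]; then
        echo "PDI process finished successfully"
    else
        echo "PDI process failed with exit code $pdi_rc" >&2
    fi
    # Propagate the original code so that outer callers can react as well.
    return "$pdi_rc"
}
```

A caller could then write something like `run_pdi kitchen.sh -file=/path/to/sample_job.kjb || exit 1`, where the job path is a placeholder. Capturing the code in the caller avoids editing the PDI scripts, and it sidesteps the fact that a display command appended as the last line of a shell script becomes the command whose status the script returns.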

There's more...

A summary of all the exit codes is a useful reference when you interpret the result of a run.

A summary of Kitchen/Pan exit codes

The following table summarizes all the exit codes with a brief explanation of their meanings:

Code  Description
0     The job/transformation ran without a problem.
1     An error occurred during processing.
2     An unexpected error occurred during loading/running of the job/transformation. Basically, it can be an error in the XML format, an error in reading the file, or it can denote that there are problems with the repository connection.
3     Unable to connect to a database, open a file, or other initialization errors.
7     The job/transformation couldn't be loaded from XML or the repository; basically, it could be that one of the plugins in the plugins/ folder is not written correctly or is incompatible.
8     An error occurred while loading steps or plugins (an error in loading one of the plugins mostly).
9     Command line usage printing.
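
For use inside a calling script, the table can be turned into a small lookup function. This is a sketch only: `describe_pdi_exit_code` is a hypothetical name, the messages are shortened versions of the descriptions above, and any code not listed in the table is reported as unknown.

```shell
# describe_pdi_exit_code is a hypothetical helper (not part of PDI).
# It prints a short description for a Kitchen/Pan exit code.
describe_pdi_exit_code() {
    case "$1" in
        0) echo "The job/transformation ran without a problem." ;;
        1) echo "An error occurred during processing." ;;
        2) echo "An unexpected error occurred during loading/running of the job/transformation." ;;
        3) echo "Unable to connect to a database, open a file, or other initialization errors." ;;
        7) echo "The job/transformation couldn't be loaded from XML or the repository." ;;
        8) echo "An error occurred while loading steps or plugins." ;;
        9) echo "Command line usage printing." ;;
        *) echo "Unknown exit code: $1" ;;
    esac
}
```

Called right after a Kitchen or Pan run, for example as `describe_pdi_exit_code "$?"`, it gives log files a readable message next to the numeric code.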