Pentaho 3.2 Data Integration: Beginner's Guide

Overview of this book

Pentaho Data Integration (a.k.a. Kettle) is a full-featured open source ETL (Extract, Transform, and Load) solution. Although PDI is a feature-rich tool, effectively capturing, manipulating, cleansing, transferring, and loading data can get complicated.

This book is full of practical examples that will help you to take advantage of Pentaho Data Integration's graphical, drag-and-drop design environment. You will quickly get started with Pentaho Data Integration by following the step-by-step guidance in this book. The useful tips in this book will encourage you to exploit powerful features of Pentaho Data Integration and perform ETL operations with ease.

Starting with the installation of the PDI software, this book will teach you all the key PDI concepts. Each chapter introduces new features, allowing you to gradually get involved with the tool. First, you will learn to work with plain files, and to do all kinds of data manipulation. Then, the book gives you a primer on databases and teaches you how to work with databases inside PDI. Not only that, you'll be given an introduction to data warehouse concepts and you will learn to load data in a data warehouse. After that, you will learn to implement simple and complex processes.

Once you've learned all the basics, you will build a simple datamart that will serve to reinforce all the concepts learned through the book.

Running transformations and jobs stored in files


In order to run a transformation or job stored as a .ktr / .kjb file, follow these steps:

  1. Open a terminal window.

  2. Go to the Kettle installation directory.

  3. Run the proper command according to the following table:

    Running a ...     Windows                              Unix-like system
    transformation    pan.bat /file:<ktr file name>        pan.sh /file:<ktr file name>
    job               kitchen.bat /file:<kjb file name>    kitchen.sh /file:<kjb file name>

When specifying the .ktr/.kjb filename, you must include the full path. If the name contains spaces, surround it with double quotes.
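The choice of command and the quoting rule can be sketched as a small helper for a Unix-like shell. This is only an illustration: the function name kettle_command and the sample path (which contains spaces) are hypothetical, and the function prints the command line instead of running Pan or Kitchen.

```shell
#!/bin/sh
# Hypothetical helper: print the Kettle command for a .ktr or .kjb file.
# It only builds the command line; it does not run pan.sh or kitchen.sh.
kettle_command() {
    file="$1"
    case "$file" in
        *.ktr) tool="pan.sh" ;;      # transformations are run with Pan
        *.kjb) tool="kitchen.sh" ;;  # jobs are run with Kitchen
        *) echo "unsupported file type: $file" >&2; return 1 ;;
    esac
    # Double quotes keep a path that contains spaces as a single argument.
    printf '%s /file:"%s"\n' "$tool" "$file"
}

# Sample path is an assumption made for this illustration.
kettle_command "/home/pdi_labs/my first job.kjb"
```

The printed line can then be typed from the Kettle installation directory, as in step 2.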

Here are some examples:

  • Suppose that you work with Windows and that your Kettle installation directory is c:\pdi-ce. In order to execute a transformation stored in the file c:\pdi_labs\hello.ktr, you have to type the following commands:

    C:
    cd \pdi-ce
    pan.bat /file:"c:\pdi_labs\hello.ktr"
  • Suppose that you work with a Unix-like system and that your Kettle installation directory is /home/yourself/pdi-ce. In order to execute a job stored in the file /home/pdi_labs/hellojob.kjb, you have to type the following commands:

    cd /home/yourself/pdi-ce
    kitchen.sh /file:"/home/pdi_labs/hellojob.kjb"

    Note

    If you have a repository with auto login (refer to Appendix A), add /norep to the command. This prevents PDI from logging in to the repository.
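
As a sketch of the note above, the following script builds the Unix job command with /norep appended. It reuses the installation directory and job file from the Unix example in this section; the variable names are hypothetical, and the script prints the commands (a dry run) instead of executing them.

```shell
#!/bin/sh
# Paths taken from the Unix example in this section.
KETTLE_DIR="/home/yourself/pdi-ce"
JOB_FILE="/home/pdi_labs/hellojob.kjb"

# Append /norep so that PDI does not log in to the repository.
CMD="kitchen.sh /file:\"$JOB_FILE\" /norep"

# Dry run: print the commands that would be typed in the terminal.
echo "cd $KETTLE_DIR"
echo "$CMD"
```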