Pentaho Data Integration Beginner's Guide - Second Edition

By: María Carina Roldán

Overview of this book

Capturing, manipulating, cleansing, transferring, and loading data effectively are prime requirements in every IT organization. Achieving these tasks requires people devoted to developing extensive software programs, or investing in ETL or data integration tools that can simplify this work. Pentaho Data Integration is a full-featured open source ETL solution that allows you to meet these requirements. Pentaho Data Integration has an intuitive, graphical, drag-and-drop design environment, and its ETL capabilities are powerful. However, getting started with Pentaho Data Integration can be difficult or confusing. "Pentaho Data Integration Beginner's Guide - Second Edition" provides the guidance needed to overcome that difficulty, covering the key features of Pentaho Data Integration.

"Pentaho Data Integration Beginner's Guide - Second Edition" starts with the installation of the Pentaho Data Integration software and then moves on to cover all the key Pentaho Data Integration concepts. Each chapter introduces new features, allowing you to gradually get involved with the tool. First, you will learn to do all kinds of data manipulation and work with plain files. Then, the book gives you a primer on databases and teaches you how to work with databases inside Pentaho Data Integration. Moreover, you will be introduced to data warehouse concepts and you will learn how to load data into a data warehouse. After that, you will learn to implement simple and complex processes. Finally, you will have the opportunity to apply and reinforce all of these concepts through the implementation of a simple datamart. With "Pentaho Data Integration Beginner's Guide - Second Edition", you will learn everything you need to know in order to meet your data manipulation requirements.
Table of Contents (26 chapters)
Pentaho Data Integration Beginner's Guide
Credits
About the Author
About the Reviewers
www.PacktPub.com
Preface
Best Practices
Index

Migrating from a file-based system to a repository-based system and vice versa


No matter which storage system you are using, file-based or database repository, you may want to move your work to the other system, either just to try it out or to take advantage of its benefits, mentioned at the beginning of this appendix. The following summarizes the procedure for migrating from a file-based configuration to a database repository:

PDI element: Transformations or jobs

Procedure for migrating from file to repository: Select File | Import from an XML file, browse to locate the .ktr/.kjb file to import, and open it. Once the file has been imported, you can save it into the repository as usual.

PDI element: Database connections, partition schemas, slave servers, and clusters

Procedure for migrating from file to repository: When you import a job or transformation that uses a database connection from an XML file, the connection is imported as well. The same applies to partition schemas, slave servers, and clusters.

There is also a command-line tool that allows you to bulk import jobs and transformations into a repository. This is the Import tool, which you can find in the PDI installation directory as import.sh (Linux/macOS) or import.bat (Windows).

Note

For examples and a full description of the use of the Import utility, you can visit the website http://wiki.pentaho.com/display/EAI/Import+User+Documentation.
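As a rough illustration, a bulk import could look like the following sketch. The repository name, credentials, and paths shown here are placeholders, and the option names should be verified against the documentation referenced above for your PDI version:

```shell
# Hypothetical example: bulk import all .ktr/.kjb files found in a local
# directory into the /public folder of a repository named my-repo.
# Repository name, user, password, and paths are placeholders; verify the
# exact option names against the Import tool documentation for your version.
./import.sh -rep=my-repo -user=admin -pass=password \
            -dir=/public -filedir=/home/pdi/etl_files \
            -comment="Initial bulk import"
```

Running the tool without arguments typically prints the list of supported options, which is a quick way to confirm the syntax for your installation.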

The following summarizes the procedure for migrating from a database repository to a file-based configuration:

PDI element: Single transformation or job

Procedure for migrating from repository to file: Open the job or transformation, select File | Export to an XML file, browse the disk to find the folder where you want to save it, and save it. Once it has been exported, it will be available to work with under the file-based storage method, or to import into another repository.

PDI element: All transformations saved in a folder

Procedure for migrating from repository to file: In the Repository explorer, right-click on the name of the folder and select Export transformations. You will be asked to select the directory where the folder, all its subfolders, and their transformations will be exported. If you right-click on the name of the repository or the root folder in the transformation tree, you can export all the transformations.

PDI element: All jobs saved in a folder

Procedure for migrating from repository to file: In the Repository explorer, right-click on the name of the folder and select Export Jobs. You will be asked to select the directory where the folder, all its subfolders, and their jobs will be exported. If you right-click on the name of the repository or the root folder in the job tree, you can export all the jobs.

PDI element: Database connections, partition schemas, slave servers, and clusters

Procedure for migrating from repository to file: When you export a job or transformation that uses a database connection to an XML file, the connection is exported as well (it is saved as part of the .ktr/.kjb file). The same applies to partition schemas, slave servers, and clusters.

Note

You have to be logged into the repository in order to perform any of the operations explained above.

If you share a database connection, a partition schema, a slave server, or a cluster, it will be available from both a file-based and a repository-based configuration, as shared elements are always saved in the shared.xml file in the Kettle home directory.
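As a rough sketch, a shared database connection stored in shared.xml (typically found in the Kettle home directory, for example ~/.kettle/shared.xml) looks similar to the following. The element names mirror those used inside .ktr/.kjb files; all names and values shown here are illustrative, and the exact set of child elements may vary by PDI version and database type:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sharedobjects>
  <!-- A shared database connection; all names and values are hypothetical -->
  <connection>
    <name>sales_dw</name>
    <server>localhost</server>
    <type>MYSQL</type>
    <access>Native</access>
    <database>sales</database>
    <port>3306</port>
    <username>etl_user</username>
    <!-- Kettle stores passwords obfuscated, not in plain text -->
  </connection>
</sharedobjects>
```

Because this file lives in the Kettle home directory rather than inside any .ktr/.kjb file or repository, every job and transformation you open, file-based or repository-based, can see the elements defined in it.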