Getting Started with Talend Open Studio for Data Integration

By: Jonathan Bowen

Overview of this book

Talend Open Studio for Data Integration (TOS) is an open source graphical development environment for creating custom integrations between systems. It comes with over 600 pre-built connectors that make it quick and easy to connect databases, transform files, load data, move, copy and rename files, and connect individual components in order to define complex integration processes. "Getting Started with Talend Open Studio for Data Integration" illustrates common uses and scenarios in a simple, practical manner and, building on knowledge as the book progresses, works towards more complex integration solutions. TOS is a code generator and so does a lot of the "heavy lifting" for you. As such, it is a suitable tool for experienced developers and non-developers alike. You'll start by learning how to construct some common integration tasks, such as transforming files and extracting data from a database. These building blocks form a "toolkit" of techniques that you will learn how to apply in many different situations. By the end of the book, once-complex integrations will appear easy and you will be your organization's integration expert! Best of all, TOS makes integrating systems fun!

Enriching data with lookups

So far, we have looked at integration scenarios where we have transformed files from one format to another, but in all cases the data we needed in the output file was contained, in some form, in the input file. However, it is commonplace in real-life scenarios that we need to transform data to the requirements of one system, but the originating system does not actually contain the data we need. It's time to improvise!

In this section, we'll create a job that passes data from one component to another but, on the way, uses lookup data to replace some of the values. Imagine that we need to transform some customer data. Our original file is simple, containing the following fields:

  • Company name

  • Address

  • City

  • State

  • Zip code
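
To make the lookup idea concrete before building the job, here is a minimal Java sketch of what "replacing a field from lookup data" means for a record with the five fields listed above. It is not the code TOS generates. The class name, the method names, the sample row, and the lookup entries are all hypothetical, and it assumes the lookup maps a full state name to a two-letter code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; TOS generates its own Java code for a job.
class StateLookupSketch {

    // Assumed lookup data: full state name -> two-letter code (sample entries only).
    static Map<String, String> buildLookup() {
        Map<String, String> lookup = new HashMap<>();
        lookup.put("California", "CA");
        lookup.put("New York", "NY");
        lookup.put("Texas", "TX");
        return lookup;
    }

    // Field order: company name, address, city, state, zip code.
    // Returns a copy of the record with the state field (index 3) replaced
    // by its lookup value. States with no match are left unchanged here.
    static String[] enrich(String[] fields, Map<String, String> lookup) {
        String[] out = fields.clone();
        String state = fields[3].trim();
        out[3] = lookup.getOrDefault(state, fields[3]);
        return out;
    }

    public static void main(String[] args) {
        // Made-up row, not taken from the book's sample file.
        String[] row = {"Acme Ltd", "1 Example Street", "Sacramento", "California", "00000"};
        String[] result = enrich(row, buildLookup());
        System.out.println(String.join(",", result));
    }
}
```

Leaving unmatched states unchanged is just one choice made for this sketch; a job could equally reject such rows or route them to a separate output.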

The following is a sample file:

Name this file corporate-addresses.csv and drop it into your DataIn folder.

The output file required by the receiving system is exactly the same as this, with one exception. Its state field is only two characters long (as...