Getting Started with Talend Open Studio for Data Integration

By: Jonathan Bowen

Overview of this book

Talend Open Studio for Data Integration (TOS) is an open source graphical development environment for creating custom integrations between systems. It comes with over 600 pre-built connectors that make it quick and easy to connect databases, transform files, load data, move, copy, and rename files, and connect individual components in order to define complex integration processes. "Getting Started with Talend Open Studio for Data Integration" illustrates common uses and scenarios in a simple, practical manner and, building on knowledge as the book progresses, works towards more complex integration solutions. TOS is a code generator and so does a lot of the "heavy lifting" for you. As such, it is a suitable tool for experienced developers and non-developers alike. You'll start by learning how to construct some common integration tasks, such as transforming files and extracting data from a database. These building blocks form a "toolkit" of techniques that you will learn how to apply in many different situations. By the end of the book, once-complex integrations will appear easy and you will be your organization's integration expert! Best of all, TOS makes integrating systems fun!
Table of Contents (22 chapters)
Getting Started with Talend Open Studio for Data Integration
Credits
Foreword
Foreword
About the Author
Acknowledgement
About the Reviewers
www.PacktPub.com
Preface
Index

Maps and expressions


In most integration scenarios, we are unlikely to find that all data fields can be passed from one system to another without any modification. Because different systems model the same objects in different ways, there is often a need not only to change the file format, but also to change the data model and content in some way.

For our next job design, we'll do another CSV to XML transformation, but this time the data models of the input and the output (and hence the schemas) will be different. We'll use the Studio's mapping component and Expression editor to help us deal with these differences.

To start off, let's look at our two data models to examine the differences. Our CSV file is a customer data file and has the following fields:

  • Customer ID

  • First Name

  • Last Name

  • Address1

  • Address2

  • Town City

  • County

  • Postcode

  • Telephone

We know that all of these fields have a string data type and that all fields are mandatory, except for Address2.
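As a rough illustration of the input data model just described, the following Java sketch holds one CSV row as nine string fields and reports any mandatory field that is empty. It is not code generated by the Studio; the class name `CustomerRow`, the camel-case field names, and the method `missingMandatoryFields` are all hypothetical, chosen only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of one row of the customer CSV file (an assumption, not Studio output). */
final class CustomerRow {

    // Field names follow the list above; the camel-case spellings are assumptions.
    static final List<String> FIELDS = Arrays.asList(
            "customerId", "firstName", "lastName", "address1", "address2",
            "townCity", "county", "postcode", "telephone");

    // Address2 is the only field that is not mandatory.
    static final String OPTIONAL_FIELD = "address2";

    // Every field has a string data type, so a String-to-String map is enough.
    private final Map<String, String> values = new LinkedHashMap<>();

    /** Builds a row from column values supplied in the same order as FIELDS. */
    CustomerRow(String... columns) {
        if (columns.length != FIELDS.size()) {
            throw new IllegalArgumentException(
                    "Expected " + FIELDS.size() + " columns but got " + columns.length);
        }
        for (int i = 0; i < columns.length; i++) {
            values.put(FIELDS.get(i), columns[i]);
        }
    }

    /** Returns the value of the named field, or null if the name is unknown. */
    String get(String field) {
        return values.get(field);
    }

    /** Returns the names of mandatory fields that are null or blank, in schema order. */
    List<String> missingMandatoryFields() {
        List<String> missing = new ArrayList<>();
        for (String field : FIELDS) {
            String value = values.get(field);
            boolean empty = value == null || value.trim().isEmpty();
            if (empty && !field.equals(OPTIONAL_FIELD)) {
                missing.add(field);
            }
        }
        return missing;
    }
}
```

With this sketch, a row whose Address2 column is empty produces an empty list from `missingMandatoryFields()`, whereas a row with an empty Postcode column is reported as missing `postcode`.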

The XML file we want to produce has a similar,...