Getting Started with Talend Open Studio for Data Integration

By: Jonathan Bowen

Overview of this book

Talend Open Studio for Data Integration (TOS) is an open source graphical development environment for creating custom integrations between systems. It comes with over 600 pre-built connectors that make it quick and easy to connect databases, transform files, load data, move, copy and rename files, and connect individual components in order to define complex integration processes. "Getting Started with Talend Open Studio for Data Integration" illustrates common uses and scenarios in a simple, practical manner and, building on knowledge as the book progresses, works towards more complex integration solutions. TOS is a code generator and so does a lot of the "heavy lifting" for you. As such, it is a suitable tool for experienced developers and non-developers alike. You'll start by learning how to construct some common integration tasks, such as transforming files and extracting data from a database. These building blocks form a "toolkit" of techniques that you will learn how to apply in many different situations. By the end of the book, once-complex integrations will appear easy and you will be your organization's integration expert! Best of all, TOS makes integrating systems fun!

Filtering data


As data passes through an integration process, we often need to filter it. Data from a source system may be fine in terms of its format, yet broader in scope than the receiving system needs. For example, suppose our financial system exports all invoices due to our customers and we want to send each customer a list of invoices. We would not send the full list to every customer; instead, we would send each customer a filtered list containing only their own invoices.
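
As a rough illustration of the per-customer filter described above, the plain-Java sketch below keeps only the invoices that belong to one customer. It is not code generated by the Studio; the `Invoice` class, its field names, and the sample customer IDs are hypothetical and exist only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class InvoiceFilterSketch {

    // Hypothetical invoice row; the field names are assumptions for illustration.
    static final class Invoice {
        final String customerId;
        final String invoiceNumber;
        final double amountDue;

        Invoice(String customerId, String invoiceNumber, double amountDue) {
            this.customerId = customerId;
            this.invoiceNumber = invoiceNumber;
            this.amountDue = amountDue;
        }
    }

    // Returns only the invoices belonging to the given customer, in input order.
    static List<Invoice> invoicesFor(String customerId, List<Invoice> allInvoices) {
        List<Invoice> filtered = new ArrayList<>();
        for (Invoice invoice : allInvoices) {
            if (invoice.customerId.equals(customerId)) {
                filtered.add(invoice);
            }
        }
        return filtered;
    }

    public static void main(String[] args) {
        // Made-up sample export covering two customers.
        List<Invoice> export = Arrays.asList(
                new Invoice("C001", "INV-1", 120.00),
                new Invoice("C002", "INV-2", 75.50),
                new Invoice("C001", "INV-3", 310.25));

        // Customer C001 receives a list containing only their own invoices.
        for (Invoice invoice : invoicesFor("C001", export)) {
            System.out.println(invoice.invoiceNumber + " " + invoice.amountDue);
        }
    }
}
```

The scope of the data is narrowed while each row's format is left untouched, which is the distinction the paragraph above draws between filtering and mapping.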

We have seen in previous examples that the tMap component has filtering capabilities, but the Studio also provides a dedicated filtering component with some extra features for finer control, which is useful when there is no requirement for data mapping. We will look at three examples of how to use the filter component in your integration jobs:

  • A straightforward filter

  • The same filter, but also capturing the rejected records

  • Finally, how to split a file based on filters
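
The three patterns listed above can be sketched conceptually in plain Java. This is only an illustration of the ideas (accept, reject, split) under stated assumptions, not the Studio's filter component or the code it generates. The class name, the `FilterResult` holder, the `"unmatched"` output name, and the sample rows are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class FilterPatternsSketch {

    // Holds both outputs of a filter: rows that passed and rows that did not.
    static final class FilterResult {
        final List<String[]> accepted = new ArrayList<>();
        final List<String[]> rejected = new ArrayList<>();
    }

    // Patterns 1 and 2: apply one condition and keep the rejected rows
    // instead of discarding them. A simple filter just ignores `rejected`.
    static FilterResult filter(List<String[]> rows, Predicate<String[]> condition) {
        FilterResult result = new FilterResult();
        for (String[] row : rows) {
            if (condition.test(row)) {
                result.accepted.add(row);
            } else {
                result.rejected.add(row);
            }
        }
        return result;
    }

    // Pattern 3: split rows into named outputs. Each row goes to the first
    // filter it matches; rows matching none go to the "unmatched" output
    // (an assumed name, so no filter should itself be called "unmatched").
    static Map<String, List<String[]>> split(List<String[]> rows,
                                             Map<String, Predicate<String[]>> filters) {
        Map<String, List<String[]>> outputs = new LinkedHashMap<>();
        for (String name : filters.keySet()) {
            outputs.put(name, new ArrayList<>());
        }
        outputs.put("unmatched", new ArrayList<>());

        for (String[] row : rows) {
            String target = "unmatched";
            for (Map.Entry<String, Predicate<String[]>> entry : filters.entrySet()) {
                if (entry.getValue().test(row)) {
                    target = entry.getKey();
                    break;
                }
            }
            outputs.get(target).add(row);
        }
        return outputs;
    }

    public static void main(String[] args) {
        // Made-up rows: customer ID and country code.
        List<String[]> rows = Arrays.asList(
                new String[] {"C001", "UK"},
                new String[] {"C002", "FR"},
                new String[] {"C003", "UK"});

        FilterResult result = filter(rows, row -> row[1].equals("UK"));
        System.out.println("accepted=" + result.accepted.size()
                + " rejected=" + result.rejected.size());
    }
}
```

Keeping the rejected rows as a second output, rather than dropping them, is what allows the same filter to feed an error file or a follow-up process.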

Simple filter

Let's start...