Getting Started with Talend Open Studio for Data Integration

By: Jonathan Bowen

Overview of this book

Talend Open Studio for Data Integration (TOS) is an open source graphical development environment for creating custom integrations between systems. It comes with over 600 pre-built connectors that make it quick and easy to connect databases, transform files, load data, and move, copy, and rename files, and to link individual components together to define complex integration processes. "Getting Started with Talend Open Studio for Data Integration" illustrates common uses and scenarios in a simple, practical manner and, building on knowledge as the book progresses, works towards more complex integration solutions. TOS is a code generator and so does a lot of the "heavy lifting" for you. As such, it is a suitable tool for experienced developers and non-developers alike. You'll start by learning how to construct some common integration tasks: transforming files and extracting data from a database, for example. These building blocks form a "toolkit" of techniques that you will learn how to apply in many different situations. By the end of the book, integrations that once seemed complex will appear easy and you will be your organization's integration expert! Best of all, TOS makes integrating systems fun!

Credits

Foreword

About the Author

Acknowledgement

About the Reviewers

www.PacktPub.com

Preface

Knowing Talend Open Studio

What Talend Open Studio is

Installing Talend Open Studio

Other useful software

Sample jobs and data

Summary

Working with Talend Open Studio

Studio definitions

Starting the Studio

Tour of the Studio

Creating a new project

Creating an example job

Metadata

Summary

Transforming Files

Transforming XML to CSV

Transforming CSV to XML

Maps and expressions

Advanced XML output for complex XML structures

Working with multi-schema XML files

Enriching data with lookups

Extracting data from Excel files

Summary

Working with Databases

Database metadata

Extracting data from a database

Extracts from multiple tables

Writing data to a database

Database to database transfer

Modifying data in a database

Dynamic database lookup

Summary

Filtering, Sorting, and Other Processing Techniques

Filtering data

Sorting data

Aggregating data

Normalizing and denormalizing data

Extracting delimited fields

Find and replace

Sampling rows

Summary

Managing Files

Managing local files

FTP file operations

Summary

Job Orchestration

Run If

Iterating and looping

Duplicating and merging dataflows

Summary

Managing Jobs

Job versions

Exporting and importing jobs

Scheduling jobs

Summary

Global Variables and Contexts

Global variables

Contexts

Summary

Worked Examples

Product catalog

Product inventory data

Order file processing

Order status updates

Automating processes

Summary

Installing Sample Jobs and Data

Downloading job and data files

Resources

Index

Extracting delimited fields

As we have seen, some systems may store data in a denormalized form and, in the previous section, we saw how we could normalize the data. In essence, we were turning the values held in a single column into separate rows. However, with some data, we may wish to split a field not into rows, but into individual columns. For example, suppose a system stores its employee data with the following schema:

[employee_id] | [name]

And the name field holds the first name and last name of the employee in the following format:

[last_name], [first_name]

An example file is shown as follows:

Note

Note that the schema does not have three fields, but that the second field contains the first and last name, separated by a comma.

Our objective in this example is to manipulate the data so that it maps to a three-field schema:

[employee_id] | [last_name] | [first_name]
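
Outside the Studio, the same transformation can be sketched in a few lines of Python. This is only an illustration of the idea, not the job that the walk-through builds: the function name split_name and the sample rows are hypothetical, and we assume that each record has already been read into an (employee_id, name) pair and that the name always contains a comma.

```python
def split_name(record):
    """Turn (employee_id, "last, first") into (employee_id, last, first)."""
    employee_id, name = record
    # Split on the first comma only, then trim the surrounding whitespace.
    last_name, first_name = (part.strip() for part in name.split(",", 1))
    return employee_id, last_name, first_name


# Hypothetical sample rows; the original example file is not reproduced here.
rows = [("1001", "Smith, Jane"), ("1002", "Jones, Tom")]
result = [split_name(row) for row in rows]

for row in result:
    print(" | ".join(row))
```

Splitting on the first comma only means that any later commas stay in the first name field rather than producing extra columns.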

Follow the walk-through given:

Create a new job and name it ExtractDelimitedFields.
Create a File delimited metadata item for our input file...
