In this section, you will learn how to get data from plain files (for example, .txt and CSV files). We will start by explaining how to read and configure such files, and then we will explain how PDI allows you to read multiple files at once, compressed files, and files stored in remote locations.
In the previous chapter, we experimented with reading a simple file, but this time we will go into detail on getting and properly configuring a simple file's metadata.
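To make the idea of a file's metadata more concrete, here is a minimal Python sketch of what configuring metadata amounts to: taking the field names from the header row and deciding on a data type for each column. It is only an illustration and not how PDI works internally. The sample text, its column names, and the helper functions `infer_type` and `infer_metadata` are all hypothetical, and the type labels simply mirror PDI's Integer, Number, and String types.

```python
import csv
import io

# Hypothetical sample: the column names and values are invented for illustration
# and are not taken from the Airbnb survey files used in the book.
SAMPLE = """room_id,room_type,price,latitude
101,Entire home/apt,120.0,52.37
102,Private room,65.5,52.36
"""


def infer_type(values):
    """Return 'Integer', 'Number', or 'String' for a column of text values."""

    def all_parse(cast):
        # True only if every value in the column can be converted with `cast`.
        try:
            for value in values:
                cast(value)
            return True
        except ValueError:
            return False

    if all_parse(int):
        return "Integer"
    if all_parse(float):
        return "Number"
    return "String"


def infer_metadata(text, delimiter=","):
    """Map each header name to a type inferred from the data rows."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)          # the first row holds the field names
    rows = list(reader)            # the remaining rows hold the data
    columns = list(zip(*rows))     # transpose rows into columns
    return {name: infer_type(col) for name, col in zip(header, columns)}


metadata = infer_metadata(SAMPLE)
for name, kind in metadata.items():
    print(f"{name}: {kind}")
```

A guess like this is only a starting point. In practice you still review each field and adjust its type, format, and length by hand, which is what the rest of this section walks through.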
Note
For this and some of the future exercises in this book, we will use .csv files with surveys of the Airbnb website. The sample data can be downloaded from http://tomslee.net/airbnb-data-collection-get-the-data.
For this exercise, we will read and configure a file with data about a survey carried out in Amsterdam. The file looks as follows:
Sample file
This time, we will use a Text file input step, which is much more flexible than the CSV file input that you are familiar with:
- Create a...