Until now, we have been enriching our dataset with new data. Now we will do the exact opposite: we will discard unwanted information. We already know how to keep a subset of fields and discard the rest: we do it by using the Select values step. Now it's time to keep only the rows that we are interested in.
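To make the difference between the two operations concrete, here is a minimal conceptual sketch in Python. It is not how PDI implements either step; the sample rows, the field names (`location`, `rooms`, `city`), and the helper functions `select_values` and `filter_rows` are all made up for illustration.

```python
# Hypothetical sample rows; the field names are illustrative only.
rows = [
    {"location": "A", "rooms": 2, "city": "X"},
    {"location": "B", "rooms": 4, "city": "Y"},
    {"location": "C", "rooms": 5, "city": "Z"},
]


def select_values(rows, fields):
    """Keep only the named fields in every row; no row is dropped."""
    return [{field: row[field] for field in fields} for row in rows]


def filter_rows(rows, condition):
    """Keep only the rows for which condition(row) is true; no field is dropped."""
    return [row for row in rows if condition(row)]


# Fewer fields, same number of rows.
fewer_fields = select_values(rows, ["location", "rooms"])

# Fewer rows, same fields in each surviving row.
fewer_rows = filter_rows(rows, lambda row: row["rooms"] > 3)
```

In this sketch, selecting values narrows every row to the chosen fields, while filtering leaves each surviving row intact and drops the rest.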
To demonstrate how to filter rows with PDI, we will work again with the survey files. This time, we will read a set of files, and will keep only the locations with more than three rooms. The main step we will be using is the Filter rows step. Go through the following steps:
- Create a transformation and use a Text file input step to read the files containing the surveys carried out in 2015.
Note
You are free to read a different set of files, but if you read this set, you will be able to compare your results with the results shown in the following screenshots.
- After the Text file input step, add a Filter rows step. You will find it in the Flow folder.
- In...