In the previous section, we learned how to work with individual fields, for example, by creating new ones or modifying existing ones. Those operations were applied row by row. In this section, we will not look at individual rows; instead, we will learn to observe and work on the dataset as a unit.
Sorting the dataset is a very useful and common task. Sorting is easy to do in PDI, and we will demonstrate it with a simple transformation. We will take the survey files that we used in the previous chapter and sort the data by the neighborhood and room_type columns, and then by the reviews column in descending order. To do this, go through the following steps:
- Open any of the transformations created in the last chapter that read files with surveys. Save the transformation with a different name.
- Drag a Sort rows step from the Transform folder and create a hop from the Text file input toward this new step.
- Double-click the step and...
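
To make the intended ordering concrete, here is a minimal Python sketch of the same sort outside PDI: ascending by neighborhood and room_type, then descending by reviews within each group. It does not show how the Sort rows step works internally. The sample rows and the sort_surveys helper are hypothetical and exist only for illustration; the real survey files contain other values and more fields.

```python
from operator import itemgetter

# Hypothetical sample rows (made up for illustration, not taken from the survey files).
rows = [
    {"neighborhood": "South", "room_type": "Private room", "reviews": 12},
    {"neighborhood": "North", "room_type": "Entire home/apt", "reviews": 3},
    {"neighborhood": "South", "room_type": "Entire home/apt", "reviews": 40},
    {"neighborhood": "South", "room_type": "Private room", "reviews": 57},
    {"neighborhood": "North", "room_type": "Entire home/apt", "reviews": 21},
]


def sort_surveys(rows):
    """Sort by neighborhood and room_type (ascending), then reviews (descending)."""
    # Python's sort is stable, so we sort by the least significant key first
    # and by the most significant keys last.
    by_reviews = sorted(rows, key=itemgetter("reviews"), reverse=True)
    return sorted(by_reviews, key=itemgetter("neighborhood", "room_type"))


sorted_rows = sort_surveys(rows)
for row in sorted_rows:
    print(row["neighborhood"], row["room_type"], row["reviews"], sep=" | ")
```

Within each neighborhood and room_type pair, the rows with the most reviews come first, which is the order we expect to see in the preview of the sorted dataset.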