Data Engineering with Python
The data pipeline you build will do the following:
The final data pipeline will look like the following screenshot:
Figure 11.3 – The final version of the data pipeline
We will build the data pipeline processor group by processor group. The first processor group will read the data lake.
In the first section of this book, you read files with NiFi, and you will do the same here. This processor group consists of three processors (GetFile, EvaluateJsonPath, and UpdateCounter) and an output port. Drag the processors and the port to the canvas. In the following sections, you will configure them.
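Before configuring the processors, it can help to see the same three steps in plain Python. The following is only a rough sketch of the idea, not NiFi code and not the configuration used in this chapter: the function name `read_data_lake`, the `*.json` file pattern, the `files_read` counter name, and the field names passed in are all hypothetical and chosen for illustration.

```python
import json
from collections import Counter
from pathlib import Path


def read_data_lake(folder, fields):
    """Read every JSON file in `folder` and pull selected fields out of it.

    `fields` is a list of top-level keys to extract (hypothetical names;
    use whatever your records actually contain).
    """
    counters = Counter()
    flowfiles = []
    # Step 1 (what GetFile does): pick up the files in the folder.
    for path in sorted(Path(folder).glob("*.json")):
        record = json.loads(path.read_text(encoding="utf-8"))
        # Step 2 (what EvaluateJsonPath does): copy selected JSON values
        # into attributes that travel with the record.
        attributes = {name: record.get(name) for name in fields}
        attributes["filename"] = path.name
        # Step 3 (what UpdateCounter does): keep a running count.
        counters["files_read"] += 1
        flowfiles.append((attributes, record))
    return flowfiles, counters
```

Calling `read_data_lake("/path/to/datalake", ["userid"])` would return the extracted attributes alongside each record, plus a counter of how many files were processed. In NiFi, the output port plays the role of the return value by handing the results to the next processor group.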
The GetFile processor reads files from a folder, in this case, our data lake. If you were reading a data lake in Hadoop, you would...