Building a production data pipeline
The pipeline will perform the following steps:
- Read files from the data lake.
- Insert the files into staging.
- Validate the staging data.
- Move staging to the warehouse.
We will build the data pipeline processor group by processor group. The first processor group will read the data lake.
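Before building the pipeline in NiFi, it can help to see the four stages as plain code. The following is only a minimal sketch under stated assumptions, not the NiFi implementation: it assumes the data lake is a directory of JSON files, uses SQLite tables named `staging` and `warehouse` as stand-ins for the real databases, and assumes each record has `userid` and `name` fields. All of these names are hypothetical and chosen for illustration.

```python
import json
import sqlite3
from pathlib import Path


def read_data_lake(lake_dir):
    """Stage 1: read every JSON file in the (assumed) data lake directory."""
    records = []
    for path in sorted(Path(lake_dir).glob("*.json")):
        with path.open(encoding="utf-8") as f:
            records.append(json.load(f))
    return records


def insert_into_staging(conn, records):
    """Stage 2: load the raw records into a staging table (hypothetical schema)."""
    conn.execute("CREATE TABLE IF NOT EXISTS staging (userid INTEGER, name TEXT)")
    conn.executemany(
        "INSERT INTO staging (userid, name) VALUES (?, ?)",
        [(r.get("userid"), r.get("name")) for r in records],
    )
    conn.commit()


def validate_staging(conn):
    """Stage 3: a simple check that staging is non-empty and has no NULL fields."""
    (total,) = conn.execute("SELECT COUNT(*) FROM staging").fetchone()
    (nulls,) = conn.execute(
        "SELECT COUNT(*) FROM staging WHERE userid IS NULL OR name IS NULL"
    ).fetchone()
    return total > 0 and nulls == 0


def move_to_warehouse(conn):
    """Stage 4: copy validated rows to the warehouse table and clear staging."""
    conn.execute("CREATE TABLE IF NOT EXISTS warehouse (userid INTEGER, name TEXT)")
    conn.execute("INSERT INTO warehouse SELECT userid, name FROM staging")
    conn.execute("DELETE FROM staging")
    conn.commit()


def run_pipeline(lake_dir, conn):
    """Run the four stages in order; stop before the warehouse if validation fails."""
    insert_into_staging(conn, read_data_lake(lake_dir))
    if not validate_staging(conn):
        raise ValueError("staging data failed validation")
    move_to_warehouse(conn)
    return conn.execute("SELECT COUNT(*) FROM warehouse").fetchone()[0]
```

Calling `run_pipeline("path/to/lake", sqlite3.connect(":memory:"))` returns the number of rows that reached the warehouse table. The point of the sketch is the ordering: data moves to the warehouse only after the staging copy passes validation.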
Reading the data lake
In the first section of this book, you read files with NiFi, and you will do the same here. This processor group will consist of three processors, including
UpdateCounter, and an output port. Drag the processors and port to the canvas. In the following sections, you will configure them.
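As a rough mental model of what this processor group does, the Python sketch below reads files, keeps a running count, and hands each record downstream. It is an illustration only and not how NiFi works internally: it assumes the data lake is a directory of JSON files, the function `stream_lake_files` and the counter name `files_read` are hypothetical, a `Counter` stands in for the counter that UpdateCounter increments, and the generator plays the role of the output port.

```python
import json
from collections import Counter
from pathlib import Path


def stream_lake_files(lake_dir, counters, counter_name="files_read"):
    """Yield each parsed JSON record and count how many files were read."""
    for path in sorted(Path(lake_dir).glob("*.json")):
        with path.open(encoding="utf-8") as f:
            record = json.load(f)
        # Stand-in for UpdateCounter: bump a named counter once per file.
        counters[counter_name] += 1
        # Stand-in for the output port: pass the record to the next group.
        yield record
```

Because the function is a generator, the counter only advances as the downstream consumer pulls records, for example with `for record in stream_lake_files("path/to/lake", Counter()): ...`.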