Verifying curated data in the silver layer
In the previous section, we ran electroniz_curation_pipeline four times, each time with a different hourly folder. If every run completed successfully, we can safely infer that the silver layer of the Electroniz lakehouse is now functional.
The last step of the curation process is to verify the curated data. We will use another notebook that already contains the validation code, so you simply need to run it step by step, as follows:
- The verification code is available in curation_verification_notebook.ipynb.
- Import the curation_verification_notebook.ipynb notebook into Azure Databricks. The steps are very similar to what was done previously for the curation notebook (electroniz_curation_notebook.ipynb):
  - On the Databricks workspace, click on Workspace. Then, click on Users.
  - Click on the arrow beside your username and click on Import.
  - Choose URL.
  - Use the following URL: https://github.com/PacktPublishing/Data-Engineering...
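The import steps above use the Databricks UI. As an optional alternative, the same import can be scripted against the Databricks Workspace API (POST /api/2.0/workspace/import). The following is a minimal sketch, not part of the original walkthrough: it assumes you have already downloaded the notebook file locally and created a personal access token. The host, token, local filename, and workspace path in the usage comments are placeholders, and the function names are hypothetical.

```python
import base64
import json
import urllib.request


def build_import_payload(notebook_bytes: bytes, workspace_path: str) -> dict:
    """Build the JSON body for POST /api/2.0/workspace/import.

    The API expects the notebook content as a base64-encoded string.
    """
    return {
        "path": workspace_path,   # target location in the workspace
        "format": "JUPYTER",      # the source file is an .ipynb notebook
        "content": base64.b64encode(notebook_bytes).decode("ascii"),
        "overwrite": False,       # fail rather than replace an existing notebook
    }


def import_notebook(host: str, token: str, payload: dict) -> int:
    """Send the import request and return the HTTP status code."""
    request = urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.0/workspace/import",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# Example usage (placeholders, adjust to your own workspace):
# with open("curation_verification_notebook.ipynb", "rb") as f:
#     payload = build_import_payload(
#         f.read(), "/Users/<your-username>/curation_verification_notebook"
#     )
# import_notebook("https://<your-workspace>.azuredatabricks.net", "<token>", payload)
```

Building the payload separately from sending it makes it easy to inspect what will be uploaded before any request is made.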