Building ETL Pipelines with Python
When it comes to scalability, the orchestration of the data pipeline takes precedence. In the previous chapter, we showed how CI/CD and design strategies can be leveraged with external tools to maintain data integrity and keep pipeline deployments smooth. In this chapter, we will explore how to orchestrate your ETL pipelines as the complexity and size of your data grow.
We’ll explore important metrics for tracking your pipelines’ health, such as latency, error rates, and data quality indicators, as well as various logging strategies that help you build a pipeline that is not only robust but also easy to debug when errors inevitably arise.
Specifically, this chapter will go through the following: