Data Ingestion with Python Cookbook

By: Gláucia Esppenchutz

Overview of this book

Data Ingestion with Python Cookbook offers a practical approach to designing and implementing data ingestion pipelines. It presents real-world examples with the most widely recognized open source tools on the market to answer commonly asked questions and overcome challenges. You’ll be introduced to designing and working with or without data schemas, as well as creating monitored pipelines with Airflow and data observability principles, all while following industry best practices. The book also addresses challenges associated with reading different data sources and data formats. As you progress through the book, you’ll gain a broader understanding of error logging best practices, troubleshooting techniques, data orchestration, monitoring, and storing logs for further consultation. By the end of the book, you’ll have a fully automated setup that enables you to start ingesting and monitoring your data pipeline effortlessly, facilitating seamless integration with subsequent stages of the ETL process.

Table of Contents (17 chapters)
Part 1: Fundamentals of Data Ingestion
Part 2: Structuring the Ingestion Pipeline

Creating standardized logs

Now that we know the best practices for inserting logs and using log levels, we can add more relevant information to our logs to help us monitor our code. Information such as the date and time, or the module or function being executed, helps us determine where an issue occurred or where improvements are required.
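
As a minimal sketch of what this extra information looks like in practice, the following snippet attaches a format string to a logger using Python's standard logging module. The logger name ingestion_example, the read_source function, and the pipe-separated layout are hypothetical and chosen only for illustration; they are not taken from the recipe's code.

```python
import io
import logging

# Hypothetical layout: timestamp, level, module, function, then the message.
LOG_FORMAT = "%(asctime)s | %(levelname)s | %(module)s | %(funcName)s | %(message)s"


def build_logger(stream) -> logging.Logger:
    """Return a logger that writes formatted records to the given stream."""
    logger = logging.getLogger("ingestion_example")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.propagate = False  # keep the output on our own handler only
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(LOG_FORMAT, datefmt="%Y-%m-%d %H:%M:%S"))
    logger.addHandler(handler)
    return logger


def read_source(logger: logging.Logger) -> None:
    """Hypothetical pipeline step; its name ends up in %(funcName)s."""
    logger.info("Reading source file")


if __name__ == "__main__":
    buffer = io.StringIO()
    read_source(build_logger(buffer))
    print(buffer.getvalue(), end="")
```

Each record then carries the timestamp, the level, the module, and the name of the function that emitted it, which is usually enough to locate where a pipeline step failed.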

Creating standardized formatting for application logs or (in our case) data pipeline logs makes the debugging process more manageable, and there are a variety of ways to do this. One way is to create a .ini or .conf file that holds the log formatting configuration and then apply it across our wider Python code.
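
To make the configuration-file idea concrete, here is a rough sketch that loads an .ini-style configuration with logging.config.fileConfig from the standard library. The file name logging.ini, the handler name console, the formatter name standard, and the logger name pipeline are assumptions made for this example, and the file is written to a temporary directory only to keep the snippet self-contained. In a real project it would live alongside the code.

```python
import logging
import logging.config
import tempfile
from pathlib import Path

# Hypothetical contents of a logging.ini file. The section and key names
# follow the layout that logging.config.fileConfig expects.
LOGGING_INI = """
[loggers]
keys=root

[handlers]
keys=console

[formatters]
keys=standard

[logger_root]
level=INFO
handlers=console

[handler_console]
class=StreamHandler
level=INFO
formatter=standard
args=(sys.stdout,)

[formatter_standard]
format=%(asctime)s | %(levelname)s | %(name)s | %(funcName)s | %(message)s
datefmt=%Y-%m-%d %H:%M:%S
"""


def configure_logging(ini_text: str = LOGGING_INI) -> None:
    """Write the configuration to a temporary .ini file and load it."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        ini_path = Path(tmp_dir) / "logging.ini"  # hypothetical file name
        ini_path.write_text(ini_text)
        # Keep loggers created before this call enabled.
        logging.config.fileConfig(str(ini_path), disable_existing_loggers=False)


if __name__ == "__main__":
    configure_logging()
    logging.getLogger("pipeline").info("Ingestion started")
```

Because the format lives in one file, every module that calls logging.getLogger after the configuration is loaded emits records with the same layout, and changing that layout does not require touching the pipeline code.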

In this recipe, we will learn how to create a configuration file that will dictate how the logs will be formatted across the code and shown in the execution output.

Getting ready

Let’s use the same code as in the previous recipe, Using log-level types, but with more improvements!

You can use the following...