Python Automation Cookbook - Second Edition

By: Jaime Buelta

Overview of this book

In this updated and extended version of Python Automation Cookbook, each chapter now comprises the newest recipes and is revised to align with Python 3.8 and higher. The book includes three new chapters that focus on using Python for test automation, machine learning projects, and for working with messy data. This edition will enable you to develop a sharp understanding of the fundamentals required to automate business processes through real-world tasks, such as developing your first web scraping application, analyzing information to generate spreadsheet reports with graphs, and communicating with automatically generated emails. Once you grasp the basics, you will acquire the practical knowledge to create stunning graphs and charts using Matplotlib, generate rich graphics with relevant information, automate marketing campaigns, build machine learning projects, and execute debugging techniques. By the end of this book, you will be proficient in identifying monotonous tasks and resolving process inefficiencies to produce superior and reliable systems.

Process data in parallel

The processing presented in the previous recipe works well, but it handles the files one at a time. That may be fine for a small number of files, but it is inefficient when there are huge numbers of them. Sequential processing uses only a single CPU core at any moment, which is a poor fit for this kind of number-crunching task.

In this recipe, we will see how to process the files in parallel, making use of all the cores of the computer to speed up the process and greatly increase the throughput.
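As a rough illustration of the idea, the following minimal sketch spreads a per-file task across worker processes using concurrent.futures from the standard library. It is not necessarily the approach the recipe itself takes. The names count_sales and count_sales_in_parallel are hypothetical, and counting SALE lines is only a stand-in for whatever per-file processing is needed.

```python
import sys
from concurrent.futures import ProcessPoolExecutor


def count_sales(path):
    """Hypothetical per-file step: count the SALE lines in one log file."""
    with open(path, encoding="utf-8") as log_file:
        return sum(1 for line in log_file if " - SALE - " in line)


def count_sales_in_parallel(paths, max_workers=None):
    """Run count_sales over many files, one task per file."""
    paths = [str(path) for path in paths]
    if not paths:
        return {}
    # With max_workers=None the executor sizes the pool from the
    # number of processors it detects on the machine.
    with ProcessPoolExecutor(max_workers=max_workers) as executor:
        # map() yields results in the same order as the input paths.
        return dict(zip(paths, executor.map(count_sales, paths)))


if __name__ == "__main__":
    # Assumed usage: python parallel_sketch.py logs/*.log
    for path, total in count_sales_in_parallel(sys.argv[1:]).items():
        print(f"{path}: {total} sales")
```

Each file is an independent task, so the work divides cleanly across processes. The `if __name__ == "__main__":` guard matters here because some platforms start worker processes by re-importing the main module.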

Getting ready

We will use the resulting CSV file from the previous recipe, which receives logs in the following format and transforms them:

[<Timestamp>] - SALE - PRODUCT: <product id> - PRICE: <price>

Each line will represent a sale log.
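To make the format concrete, here is a small sketch that splits one such line into its three fields. It deliberately swaps in the standard library's re module instead of the parse module, so that it runs without installing anything. The function name parse_sale_line and the sample values in the usage note are made up for illustration, and every field is kept as a plain string because the exact timestamp and price formats are not shown here.

```python
import re

# Mirrors the layout: [<Timestamp>] - SALE - PRODUCT: <product id> - PRICE: <price>
SALE_PATTERN = re.compile(
    r"\[(?P<timestamp>.+?)\] - SALE - "
    r"PRODUCT: (?P<product_id>.+?) - PRICE: (?P<price>.+)"
)


def parse_sale_line(line):
    """Return the fields of a sale log line as strings, or None if it does not match."""
    match = SALE_PATTERN.fullmatch(line.strip())
    if match is None:
        return None
    return match.groupdict()
```

For example, calling parse_sale_line on the invented line "[2018-12-06T10:15:00] - SALE - PRODUCT: 1345 - PRICE: 9.99" returns a dictionary with the keys timestamp, product_id, and price, while a line that does not follow the layout returns None.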

We will use the parse module and the delorean module. We should install the modules, adding them to our requirements.txt file as follows:

$ echo "parse==1.14.0" >>...