Learning Concurrency in Python

By: Elliot Forbes

Overview of this book

Python is a very high-level, general-purpose language that is used heavily in fields such as data science and research, and it is one of the top choices for general-purpose programming around the world. It features a wide range of powerful high- and low-level libraries and frameworks that complement its delightful syntax and enable Python programmers to create. This book introduces some of the most popular of these libraries and frameworks and goes in depth into how you can leverage them for your own highly concurrent, high-performance Python programs. We'll cover the fundamental concepts of concurrency needed to write your own concurrent and parallel software systems in Python. The book will guide you down the path to mastering Python concurrency, giving you all the necessary hardware and theoretical knowledge. We'll cover concepts such as debugging and exception handling, as well as some of the most popular libraries and frameworks that allow you to create event-driven and reactive systems. By the end of the book, you'll have learned the techniques to write incredibly efficient concurrent systems that follow best practices.

Improving our crawler


Now that we've had an in-depth look at both ThreadPoolExecutors and ProcessPoolExecutors, it's time to put these newly learned concepts into practice. In Chapter 5, Communication between Threads, we started developing a multithreaded web crawler that could crawl every available link on a given website.

Note

The full source code for this Python web crawler can be found at this link: https://github.com/elliotforbes/python-crawler.

It didn't, however, output the results in the most readable format, and the code could be improved using ThreadPoolExecutors. So, let's have a look at implementing both more readable code and more readable results.

The plan

Before we get started, we need to define a general plan for how we are going to improve our crawler.

New improvements

A few examples of the improvements we might wish to make are as follows:

  • We want to refactor our code to use ThreadPoolExecutors
  • We want to output the results of a crawl in a more readable format such...
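To make the first improvement concrete, here is a minimal sketch of what a ThreadPoolExecutor-based crawl loop might look like. It is not the code from the crawler repository: the names `crawl`, `fetch_links`, `fake_fetch_links`, and `FAKE_SITE` are hypothetical, and the "website" is an in-memory dictionary so the example runs without network access. The printed report at the end is only a placeholder layout for readable output.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

# Hypothetical in-memory "website": each URL maps to the links found on it.
# A real crawler would download and parse pages instead.
FAKE_SITE = {
    "/": ["/about", "/blog"],
    "/about": ["/"],
    "/blog": ["/blog/post-1", "/about"],
    "/blog/post-1": [],
}


def fake_fetch_links(url):
    """Stand-in for a real page fetch: return the links found on `url`."""
    return FAKE_SITE[url]


def crawl(start_url, fetch_links, max_workers=4):
    """Crawl level by level, fetching each level's pages in a thread pool.

    Returns (results, errors): results maps each crawled URL to a sorted
    list of its links; errors maps each failed URL to an error message.
    """
    results, errors = {}, {}
    seen = {start_url}          # only touched by the main thread
    frontier = [start_url]      # URLs to fetch in the current level

    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        while frontier:
            # Submit one task per URL and remember which future is which.
            futures = {executor.submit(fetch_links, url): url
                       for url in frontier}
            frontier = []
            for future in as_completed(futures):
                url = futures[future]
                try:
                    links = future.result()
                except Exception as exc:
                    # Record the failure and keep crawling other pages.
                    errors[url] = str(exc)
                    continue
                results[url] = sorted(links)
                for link in links:
                    if link not in seen:
                        seen.add(link)
                        frontier.append(link)
    return results, errors


if __name__ == "__main__":
    results, errors = crawl("/", fake_fetch_links)
    # Placeholder report layout: one page per block, links indented.
    for url in sorted(results):
        print(f"{url} ({len(results[url])} links)")
        for link in results[url]:
            print(f"    -> {link}")
```

Passing the fetch function in as a parameter keeps the thread pool logic separate from the networking code, so the same loop can be exercised with a fake site in tests and with a real HTTP fetcher in the crawler.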