Not everything should be made concurrent

Not all programs are created equal: some can be made parallel or concurrent relatively easily, while others are inherently sequential and cannot be executed concurrently or in parallel. An extreme example of the former is the class of embarrassingly parallel programs, which can be divided into distinct parallel tasks with little or no dependency or need for communication between them.

Embarrassingly parallel

A common example of an embarrassingly parallel program is the 3D video rendering handled by a graphics processing unit, where each frame or pixel can be processed with no interdependency. Password cracking is another embarrassingly parallel task that can easily be distributed across CPU cores. In a later chapter, we will tackle a number of similar problems, including image processing and web scraping, which can be made concurrent/parallel intuitively, resulting in significantly improved execution times.
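
To make this concrete, here is a minimal sketch of an embarrassingly parallel workload (the function process_pixel() is illustrative, not taken from the book's code): every input is processed independently, so the work can be handed to a pool of worker processes with no coordination beyond collecting the results.

# A minimal sketch of an embarrassingly parallel workload.
# Each call to process_pixel() is independent of every other call,
# so the inputs can simply be mapped across a pool of processes.

from multiprocessing import Pool

def process_pixel(value):
    # Stand-in for per-pixel work (e.g., applying a filter).
    return value * 2 % 256

if __name__ == '__main__':
    pixels = list(range(100_000))
    with Pool() as pool:
        # No task depends on another, so the calls may run in any
        # order across cores; map() still returns results in input order.
        result = pool.map(process_pixel, pixels)
    print(result[:10])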

Inherently sequential

In contrast to embarrassingly parallel tasks, the execution of some tasks depends heavily on the results of others. In other words, those tasks are not independent, and thus cannot be made parallel or concurrent. Furthermore, if we tried to introduce concurrency into such programs, it could cost us more execution time to produce the same results. Let's go back to our prime-checking example from earlier; the following is the output that we saw:

> python example1.py
Result 1: [10000000000037, 10000000000051, 10000000000099, 10000000000129, 10000000000183, 10000000000259, 10000000000267, 10000000000273, 10000000000279, 10000000000283, 10000000000313, 10000000000343, 10000000000391, 10000000000411, 10000000000433, 10000000000453]
Took: 3.41 seconds.
Result 2: [10000000000183, 10000000000037, 10000000000129, 10000000000273, 10000000000259, 10000000000343, 10000000000051, 10000000000267, 10000000000279, 10000000000099, 10000000000283, 10000000000313, 10000000000391, 10000000000433, 10000000000411, 10000000000453]
Took: 2.33 seconds.

Pay close attention, and you will see that the two results are not identical: the primes in the second result list are out of order. This is a direct consequence of using concurrency in our second method: we split the tasks into different groups to be executed simultaneously, and the order of the results we obtained is simply the order in which each task finished executing.

Since tasks across different groups were executed simultaneously, some tasks that came after others in the input list were nevertheless executed first. For example, the number 10000000000183 came after the number 10000000000129 in our input list, but was processed earlier, and therefore appears before it in our output list. In fact, if you execute the program again and again, the ordering of the second result will vary on almost every run.

Evidently, this situation is not desirable if the result we'd like to obtain needs to be in the order of the original input. Of course, in this example we can simply restore the order by sorting the result, as sketched below, but the sort costs extra execution time, which might make the concurrent approach even more expensive than the original sequential one.
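
As a minimal illustration (result_2 is a stand-in name for the out-of-order list produced by the concurrent run): since the input numbers were in ascending order, a numeric sort restores that order, but only by adding work after the concurrent phase has finished.

# result_2 stands in for the out-of-order list from the concurrent run.
# The inputs were ascending, so a numeric sort recovers the input order,
# at the cost of extra work after the concurrent phase.
result_2 = [10000000000183, 10000000000037, 10000000000129]
result_2.sort()
print(result_2)   # [10000000000037, 10000000000129, 10000000000183]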

A concept that is commonly used to illustrate the innate sequentiality of some tasks is pregnancy: adding more women will never shorten a pregnancy; nine women cannot deliver a baby in one month. Whereas for parallel or concurrent tasks an increase in the number of processing entities improves the execution time, adding more processors to inherently sequential tasks will not. Famous examples of inherent sequentiality include iterative algorithms: Newton's method, iterative solutions to the three-body problem, and iterative numerical approximation methods in general.
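
As a brief illustration of that dependency (a minimal sketch, not from the book's examples), consider Newton's method for approximating the square root of 2: each new estimate is computed from the previous one, so the iterations form a chain that cannot be split across workers.

# A minimal sketch: Newton's method for approximating sqrt(2).
# Each estimate depends on the one before it, so no two iterations
# can run at the same time.
x = 1.0
for _ in range(10):
    x = (x + 2 / x) / 2   # the next estimate needs the current one
print(x)   # ~1.4142135623730951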

Example 2 – inherently sequential tasks

Let us consider a quick example:

Computing f^1000(3), with f(x) = x^2 - x + 1, and f^(n+1)(x) = f(f^n(x)).

With complicated functions like f (where it is relatively difficult to find a general form of f^n(x)), the only obviously reasonable way to compute f^1000(3) or similar values is to iteratively compute f^2(3) = f(f(3)), f^3(3) = f(f^2(3)), ..., f^999(3) = f(f^998(3)), and, finally, f^1000(3) = f(f^999(3)).

Since it will take significant time to actually compute f^1000(3), even when using a computer, we will only consider f^20(3) in our code (my laptop actually started heating up after f^25(3)):

# Chapter01/example2.py

from timeit import default_timer as timer

def f(x):
    return x * x - x + 1

# sequential
start = timer()
result = 3
for i in range(20):
    result = f(result)

print('Result is very large. Only printing the last 5 digits:', result % 100000)
print('Sequential took: %.2f seconds.' % (timer() - start))

Run the script (python example2.py); the following is the output I received:

> python example2.py
Result is very large. Only printing the last 5 digits: 35443
Sequential took: 0.10 seconds.

Now, if we were to attempt to apply concurrency to this script, the only candidate for parallelization would be the for loop. One attempt might look as follows:

# Chapter01/example2.py

import concurrent.futures

# concurrent
def concurrent_f(x):
    # the argument x is ignored; every call reads and overwrites
    # the shared global variable result
    global result
    result = f(result)

result = 3
start = timer()

with concurrent.futures.ThreadPoolExecutor(max_workers=20) as executor:
    futures = [executor.submit(concurrent_f, i) for i in range(20)]

_ = concurrent.futures.as_completed(futures)

print('Result is very large. Only printing the last 5 digits:', result % 100000)
print('Concurrent took: %.2f seconds.' % (timer() - start))

The output I received is as follows:

> python example2.py
Result is very large. Only printing the last 5 digits: 35443
Concurrent took: 0.19 seconds.

Even though both methods produced the same result, the concurrent method took almost twice as long as the sequential one. This is because every time a new thread from ThreadPoolExecutor was spawned, the call to concurrent_f() inside that thread had to wait for the variable result to be fully processed by the previous thread, so the program as a whole still executed sequentially.

So, while there was no actual concurrency in the second method, the overhead of spawning new threads made its execution time significantly worse. This is one example of an inherently sequential task, to which concurrency or parallelism should not be applied in the hope of improving execution time.
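
The overhead itself is easy to observe in isolation. The following sketch (my own, not from the book's example files) times the same 20 calls to f() made directly and made through a ThreadPoolExecutor; the threaded version will typically be slower even though it performs identical work, because it additionally pays for thread creation and task scheduling.

# A sketch of measuring the overhead of dispatching work to threads.
import concurrent.futures
from timeit import default_timer as timer

def f(x):
    return x * x - x + 1

# 20 direct calls, no threads involved
start = timer()
for i in range(20):
    f(i)
print('Direct calls took: %.6f seconds.' % (timer() - start))

# the same 20 calls dispatched to a thread pool
start = timer()
with concurrent.futures.ThreadPoolExecutor(max_workers=20) as executor:
    futures = [executor.submit(f, i) for i in range(20)]
    for future in futures:
        future.result()
print('Thread dispatch took: %.6f seconds.' % (timer() - start))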

I/O bound

Another way to think about sequentiality is the computer science concept of a condition called I/O bound, in which the time it takes to complete a computation is determined mainly by the time spent waiting for input/output (I/O) operations to complete. This condition arises when data arrives more slowly than it can be consumed; in short, more time is spent waiting for data than processing it.

In an I/O bound state, the CPU must stall its operation, waiting for data to arrive. This means that even if the CPU becomes faster at processing data, processes tend not to speed up in proportion to the increased CPU speed, since they become all the more I/O bound. With faster computation speed being the primary goal of new computer and processor designs, I/O bound states are undesirable, yet they are becoming more and more common in programs.
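
To make the condition concrete, here is a minimal sketch in which time.sleep() stands in for a slow I/O device (the function read_chunk() is illustrative, not a real API): the processing step is trivial, so the total running time is dominated by the waiting, and a faster CPU would barely change it.

# A sketch of an I/O-bound task: time.sleep() stands in for a slow
# read from disk or the network. The processing step is trivial, so
# total time is dominated by waiting, and a faster CPU barely helps.
import time
from timeit import default_timer as timer

def read_chunk():
    time.sleep(0.1)   # simulated I/O wait
    return b'x' * 1024

start = timer()
total = 0
for _ in range(10):
    data = read_chunk()   # ~0.1 s of waiting per chunk
    total += len(data)    # negligible CPU work
print('Processed %d bytes in %.2f seconds.' % (total, timer() - start))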

As you have seen, there are a number of situations in which applying concurrent programming results in decreased processing speed; these situations should be avoided. It is therefore important not to see concurrency as a golden ticket that unconditionally produces better execution times, and to understand the differences between the structures of programs that benefit from concurrency and those that do not.