Mastering Concurrency in Python

By: Quan Nguyen

Overview of this book

Python is one of the most popular programming languages, with numerous libraries and frameworks that facilitate high-performance computing. Concurrency and parallelism in Python are essential when it comes to multiprocessing and multithreading; they behave differently, but their common aim is to reduce execution time. This book serves as a comprehensive introduction to various advanced concepts in concurrent engineering and programming. Mastering Concurrency in Python starts by introducing the concepts and principles of concurrency, from Amdahl's Law to multithreaded programming, followed by multiprocessing, web scraping, and asynchronous I/O, together with common problems that engineers and programmers face in concurrent programming. Next, the book covers a number of advanced concepts in Python concurrency and how they interact with the Python ecosystem, including the Global Interpreter Lock (GIL). Finally, you'll learn how to solve real-world concurrency problems through examples. By the end of the book, you will have gained extensive theoretical knowledge of concurrency and the ways in which concurrency is supported by the Python language.

A brief overview of mastering concurrency in Python

Python is one of the most popular programming languages out there, and for good reason. The language comes with numerous libraries and frameworks that facilitate high-performance computing, whether it be software development, web development, data analysis, or machine learning. Yet, there have been discussions among developers criticizing Python, which often revolve around the Global Interpreter Lock (GIL) and the difficulty of implementing concurrent and parallel programs that it leads to.

While concurrency and parallelism do behave differently in Python than in other common programming languages, it is still possible for programmers to implement Python programs that run concurrently or in parallel, and achieve significant speedup for their programs.

Mastering Concurrency in Python will serve as a comprehensive introduction to various advanced concepts in concurrent engineering and programming in Python. This book will also provide a detailed overview of how concurrency and parallelism are being used in real-world applications. It is a perfect blend of theoretical analyses and practical examples, which will give you a full understanding of the theories and techniques regarding concurrent programming in Python.

This book will be divided into six main sections. It will start with the idea behind concurrency and concurrent programming—the history, how it is being used in the industry today, and finally, a mathematical analysis of the speedup that concurrency can potentially provide. Additionally, the last section in this chapter (which is our next section) will cover instructions for how to follow the coding examples in this book, including setting up a Python environment on your own computer, downloading/cloning the code included in this book from GitHub, and running each example from your computer.

The next three sections will cover three of the main implementation approaches in concurrent programming: threads, processes, and asynchronous I/O, respectively. These sections will include theoretical concepts and principles for each of these approaches, the syntax and various functionalities that the Python language provides to support them, discussions of best practices for their advanced usage, and hands-on projects that directly apply these concepts to solve real-world problems.

Section five will introduce readers to some of the most common problems that engineers and programmers face in concurrent programming: deadlock, starvation, and race conditions. Readers will learn about the theoretical foundations and causes for each problem, analyze and replicate each of them in Python, and finally implement potential solutions. The last chapter in this section will discuss the aforementioned GIL, which is specific to the Python language. It will cover the GIL's integral role in the Python ecosystem, some challenges that the GIL poses for concurrent programming, and how to implement effective workarounds.

In the last section of the book, we will be working on various advanced applications of concurrent Python programming. These applications will include the design of lock-free and lock-based concurrent data structures, memory models and operations on atomic types, and how to build a server that supports concurrent request processing from scratch. The section will also cover the best practices for testing, debugging, and scheduling concurrent Python applications.

Throughout this book, you will be building essential skills for working with concurrent programs, simply by following the discussions, the example code, and the hands-on projects. You will understand the fundamentals of the most important concepts in concurrent programming, how to implement them in Python programs, and how to apply that knowledge to advanced applications. By the end of Mastering Concurrency in Python, you will have a unique combination of extensive theoretical knowledge of concurrency and practical know-how of its various applications in the Python language.

Why Python?

As mentioned previously, one of the difficulties that developers face while working with concurrency in the Python programming language (specifically CPython, the reference implementation of Python, written in C) is its GIL. The GIL is a mutex that protects access to Python objects, preventing multiple threads from executing Python bytecode at once. This lock is necessary mainly because CPython's memory management is not thread-safe. CPython uses reference counting to implement its memory management: every object carries a count of the references pointing to it and is freed when that count drops to zero. Without a lock, multiple threads could modify the same reference count simultaneously; this situation is undesirable, as it can corrupt the count and lead to memory being leaked or freed while it is still in use. To address this problem, the GIL is, as the name suggests, a lock that allows only one thread at a time to execute Python code and access Python objects. However, this also means that, to implement multithreaded programs in CPython, developers need to be aware of the GIL and work around it. That is why many developers have problems implementing concurrent systems in Python.
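The reference counting described above can be observed directly from Python. The following is a minimal sketch, assuming CPython; the variable names are illustrative only, and the exact numbers reported by sys.getrefcount vary between interpreter versions, so only the differences between the readings are meaningful.

```python
import sys

# A fresh list object with a single name bound to it.
data = []

# sys.getrefcount may include a temporary reference created by the call
# itself, so we compare readings with each other rather than with fixed values.
baseline = sys.getrefcount(data)

alias = data                          # bind a second name to the same object
after_alias = sys.getrefcount(data)   # one more reference than before

del alias                             # remove the second name again
after_del = sys.getrefcount(data)     # back to the baseline

print("baseline:", baseline)
print("with alias:", after_alias)
print("after del:", after_del)
```

Each of these increments and decrements is a small read-modify-write on the object itself, which is the kind of update the GIL keeps threads from interleaving.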

So, why use Python for concurrency at all? Even though the GIL prevents multithreaded CPython programs from taking full advantage of multiprocessor systems in certain situations, most blocking or long-running operations, such as I/O, image processing, and NumPy number crunching, happen outside the GIL. Therefore, the GIL only becomes a potential bottleneck for multithreaded programs that spend significant time inside the GIL, that is, executing Python bytecode. As you will see in future chapters, multithreading is only one form of concurrent programming, and, while the GIL poses some challenges for multithreaded CPython programs that allow more than one thread to access shared resources, other forms of concurrent programming do not have this problem. For example, multiprocessing applications that do not share any common resources among processes, for tasks such as I/O, image processing, or NumPy number crunching, can work seamlessly with the GIL, because each process runs its own interpreter with its own lock. We will discuss the GIL and its place in the Python ecosystem in greater depth in Chapter 15, The Global Interpreter Lock.
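To make the point about blocking operations concrete, here is a small sketch rather than a benchmark. It assumes that time.sleep is an acceptable stand-in for a blocking I/O call (CPython releases the GIL while a thread sleeps); the helper names blocking_task, run_sequential, and run_threaded are hypothetical and not taken from the book's code.

```python
import threading
import time

def blocking_task(delay):
    # Stand-in for a blocking I/O call; the GIL is released while sleeping.
    time.sleep(delay)

def run_sequential(n_tasks, delay):
    # Run the tasks one after another and return the elapsed wall-clock time.
    start = time.perf_counter()
    for _ in range(n_tasks):
        blocking_task(delay)
    return time.perf_counter() - start

def run_threaded(n_tasks, delay):
    # Run each task in its own thread and return the elapsed wall-clock time.
    start = time.perf_counter()
    threads = [
        threading.Thread(target=blocking_task, args=(delay,))
        for _ in range(n_tasks)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start

sequential_time = run_sequential(4, 0.1)
threaded_time = run_threaded(4, 0.1)
print(f"sequential: {sequential_time:.2f}s, threaded: {threaded_time:.2f}s")
```

Because the waits overlap, the threaded run should finish in roughly the time of a single task. Replacing the sleep with a pure-Python computation would be expected to remove most of that gain, since the threads would then compete for the GIL.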

Aside from that, Python has been gaining popularity in the programming community. Due to its user-friendly syntax and overall readability, more and more people have found it relatively straightforward to use Python in their development, whether they are beginners learning a new programming language, intermediate users looking for the advanced functionalities of Python, or experienced programmers using Python to solve complex problems. It is estimated that developing a program in Python can be up to 10 times faster than developing it in C/C++.

The large number of developers using Python has resulted in a strong, ever-growing support community. Libraries and packages in Python are being developed and released every day, tackling different problems and technologies. Currently, the Python language supports an incredibly wide range of programming—namely, software development, desktop GUIs, video game design, web and internet development, and scientific and numeric computing. In recent years, Python has also been growing as one of the top tools in data science, big data, and machine learning, competing with the long-time player in the field, R.

The sheer number of development tools available in Python has encouraged more developers to start programming with Python, making Python even more popular and easy to use; I call this the virtuous circle of Python. David Robinson, chief data scientist at DataCamp, wrote a blog post (https://stackoverflow.blog/2017/09/06/incredible-growth-python/) about the incredible growth of Python, and called it the most popular programming language.

However, Python is slow, or at least slower than other popular programming languages. This is due to the fact that Python is a dynamically typed, interpreted language, where values are stored not in dense buffers, but in scattered objects. This is the price of Python's readability and user-friendliness. Luckily, there are various options for making your Python program run faster, and concurrency is one of the most complex of them; that is what we are going to master throughout this book.
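The difference between scattered objects and a dense buffer can be illustrated with the standard library alone. This is a rough sketch, assuming CPython: the byte counts come from sys.getsizeof and depend on the interpreter build, so the absolute numbers are only indicative.

```python
import sys
from array import array

n = 1000
values = [float(i) for i in range(n)]

# A list holds pointers to separately allocated float objects,
# so its footprint is the pointer table plus every float object.
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

# An array of C doubles stores the raw 8-byte values in one contiguous buffer.
dense = array("d", values)
array_bytes = sys.getsizeof(dense)

print("list of floats:", list_bytes, "bytes")
print("array of doubles:", array_bytes, "bytes")
```

The list is typically several times larger than the array, and reaching each value requires following a pointer to a boxed object, which is part of the overhead the paragraph above refers to.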
