Book Image

Python Parallel Programming Cookbook

By : Giancarlo Zaccone
Book Image

Python Parallel Programming Cookbook

By: Giancarlo Zaccone

Overview of this book

This book will teach you parallel programming techniques using examples in Python and will help you explore the many ways in which you can write code that allows more than one process to happen at once. Starting with introducing you to the world of parallel computing, it moves on to cover the fundamentals in Python. This is followed by exploring the thread-based parallelism model using the Python threading module by synchronizing threads and using locks, mutex, semaphores queues, GIL, and the thread pool. Next you will be taught about process-based parallelism where you will synchronize processes using message passing along with learning about the performance of MPI Python Modules. You will then go on to learn the asynchronous parallel programming model using the Python asyncio module along with handling exceptions. Moving on, you will discover distributed computing with Python, and learn how to install a broker, use Celery Python Module, and create a worker. You will understand anche Pycsp, the Scoop framework, and disk modules in Python. Further on, you will learnGPU programming withPython using the PyCUDA module along with evaluating performance limitations.
Table of Contents (13 chapters)
Python Parallel Programming Cookbook
Credits
About the Author
About the Reviewers
www.PacktPub.com
Preface
Index

Preface

The study of computer science should cover not only the principles on which computational processing is based, but should also reflect the current state of knowledge of these fields. Today, the technology requires that professionals from all branches of computer science know both the software and hardware whose interaction at all levels is the key to understanding the basics of computational processing.

For this reason, in this book, a special focus is given on the relationship between hardware architectures and software.

Until recently, programmers could rely on the work of the hardware designers, compilers, and chip manufacturers to make their software programs faster or more efficient without the need for changes.

This era is over. So now, if a program is to run faster, it must become a parallel program.

Although the goal of many researchers is to ensure that programmers are not aware of the parallel nature of the hardware for which they write their programs, it will take many years before this actually becomes possible. Nowadays, most programmers need to thoroughly understand the link between hardware and software so that the programs can be run efficiently on modern computer architectures.

To introduce the concepts of parallel programming, the Python programming language has been adopted. Python is fun and easy to use, and its popularity has grown steadily in recent years. Python was developed more than 10 years ago by Guido van Rossum, who derived Python's syntax simplicity and ease of use largely from ABC, which is a teaching language that was developed in the 80s.

In addition to this specific context, Python was created to solve real-life problems, and it borrows a wide variety of typical characteristics of programming languages, such as C ++, Java, and Scheme. This is one of its most remarkable features, which has led to its broad appeal among professional software developers, the scientific research industry, and computer science educators. One of the reasons why Python is liked so much is because it provides the best balance between the practical and conceptual approaches. It is an interpreted language, so you can start doing things immediately without getting lost in the problems of compilation and linking. Python also provides an extensive software library that can be used in all sorts of tasks ranging from the Web, graphics, and of course, parallel computing. This practical aspect is a great way to engage readers and allow them to carry out projects that are important in this book.

This book contains a wide variety of examples that are inspired by many situations, and these offer you the opportunity to solve real-life problems. This book examines the principles of software design for parallel architectures, insisting on the importance of clarity of the programs and avoiding the use of complex terminology in favor of clear and direct examples. Each topic is presented as part of a complete, working Python program, which is followed by the output of the program in question.

The modular organization of the various chapters provides a proven path to move from the simplest arguments to the most advanced ones, but this is also suitable for those who only want to learn a few specific issues.

I hope that the settings and content of this book are able to provide you with a useful contribution for your better understanding and dissemination of parallel programming techniques.

What this book covers

Chapter 1, Getting Started with Parallel Computing and Python, gives you an overview of parallel programming architectures and programming models. This chapter introduces the Python programming language, the characteristics of the language, its ease of use and learning, extensibility, and richness of software libraries and applications. It also shows you how to make Python a valuable tool for any application, and also, of course, for parallel computing.

Chapter 2, Thread-based Parallelism, discusses thread parallelism using the threading Python module. Through complete programming examples, you will learn how to synchronize and manipulate threads to implement your multithreading applications.

Chapter 3, Process-based Parallelism, will guide through the process-based approach to parallelize a program. A complete set of examples will show you how to use the multiprocessing Python module. Also, this chapter will explain how to perform communication through processes, using the message passing parallel programming paradigm via the mpi4py Python module.

Chapter 4, Asynchronous Programming, explains the asynchronous model for concurrent programming. In some ways, it is simpler than the threaded one because there is a single instruction stream and tasks explicitly relinquish control instead of being suspended arbitrarily. This chapter will show you how to use the Python asyncio module to organize each task as a sequence of smaller steps that must be executed in an asynchronous manner.

Chapter 5, Distributed Python, introduces you to distributed computing. It is the process of aggregating several computing units logically and may even be geographically distributed to collaboratively run a single computational task in a transparent and coherent way. This chapter will present some of the solutions proposed by Python for the implementation of these architectures using the OO approach, Celery, SCOOP, and remote procedure calls, such as Pyro4 and RPyC. It will also include different approaches, such as PyCSP, and finally, Disco, which is the Python version of the MapReduce algorithm.

Chapter 6, GPU Programming with Python, describes the modern Graphics Processing Units (GPUs) that provide breakthrough performance for numerical computing at the cost of increased programming complexity. In fact, the programming models for GPUs require the programmer to manually manage the data transfer between a CPU and GPU. This chapter will teach you, through the programming examples and use cases, how to exploit the computing power provided by the GPU cards, using the powerful Python modules: PyCUDA, NumbaPro, and PyOpenlCL.

What you need for this book

All the examples of this book can be tested in a Windows 7 32-bit machine. Also, a Linux environment will be useful.

The Python versions needed to run the examples are:

  • Python 3.3 (for the first five chapters)

  • Python 2.7 (only for Chapter 6, GPU Programming with Python)

The following modules (all of which are freely downloadable) are required:

  • mpich-3.1.4

  • pip 6.1.1

  • mpi4py1.3.1

  • asyncio 3.4.3

  • Celery 3.1.18

  • Numpy 1.9.2

  • Flower 0.8.32 (optional)

  • SCOOP 0.7.2

  • Pyro 4.4.36

  • PyCSP 0.9.0

  • DISCO 0.5.2

  • RPyC 3.3.0

  • PyCUDA 2015.1.2

  • CUDA Toolkit 4.2.9 (at least)

  • NVIDIA GPU SDK 4.2.9 (at least)

  • NVIDIA GPU driver

  • Microsoft Visual Studio 2008 C++ Express Edition (at least)

  • Anaconda Python Distribution

  • NumbaPro compiler

  • PyOpenCL 2015.1

  • Win32 OpenCL Driver 15.1 (at least)

Who this book is for

This book is intended for software developers who want to use parallel programming techniques to write powerful and efficient code. After reading this book, you will be able to master the basics and the advanced features of parallel computing. The Python programming language is easy to use and allows nonexperts to deal with and easily understand the topics exposed in this book.

Sections

This book contains the following sections:

Getting ready

This section tells us what to expect in the recipe and describes how to set up any software or any preliminary settings needed for the recipe.

How to do it…

This section characterizes the steps that are to be followed to "cook" the recipe.

How it works…

This section usually consists a brief and detailed explanation of what happened in the previous section.

There's more…

This section consists of additional information about the recipe in order to make the reader more anxious about the recipe.

See also

This section may contain references to the recipe.

Conventions

In this book, you will find a number of styles of text that distinguish between different kinds of information. Here are some examples of these styles, and an explanation of their meaning.

Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "To execute this first example, we need the program helloPythonWithThreads.py."

A block of code is set as follows:

print ("Hello Python Parallel Cookbook!!")
closeInput = raw_input("Press ENTER to exit")
print "Closing calledProcess"

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

@asyncio.coroutine
def factorial(number):
do Something

@asyncio.coroutine

Any command-line input or output is written as follows:

C:\>mpiexec -n 4 python virtualTopology.py

New terms and important words are shown in bold. Words that you see on the screen, in menus or dialog boxes for example, appear in the text like this: "Open an admin Command Prompt by right-clicking on the command prompt icon and select Run as administrator."

Note

Warnings or important notes appear in a box like this.

Tip

Tips and tricks appear like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or may have disliked. Reader feedback is important for us to develop titles that you really get the most out of.

To send us general feedback, simply send an e-mail to , and mention the book title via the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide on www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you would report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the errata submission form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded on our website, or added to any list of existing errata, under the Errata section of that title. Any existing errata can be viewed by selecting your title from http://www.packtpub.com/support.

Piracy

Piracy of copyright material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works, in any form, on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at with a link to the suspected pirated material.

We appreciate your help in protecting our authors, and our ability to bring you valuable content.

Questions

You can contact us at if you are having a problem with any aspect of the book, and we will do our best to address it.