Python Parallel Programming Cookbook

By: Zaccone

Overview of this book

This book teaches parallel programming techniques through examples in Python and helps you explore the many ways to write code that lets more than one process run at once. It starts by introducing the world of parallel computing, then covers the fundamentals in Python. Next comes the thread-based parallelism model using the Python threading module, including thread synchronization with locks, mutexes, semaphores, queues, the GIL, and the thread pool. You will then learn about process-based parallelism, where you synchronize processes using message passing and examine the performance of MPI Python modules. After that, you will learn the asynchronous parallel programming model using the Python asyncio module, along with exception handling. Moving on, you will discover distributed computing with Python and learn how to install a broker, use the Celery Python module, and create a worker. You will also learn about PyCSP, the Scoop framework, and disk modules in Python. Finally, you will learn GPU programming with Python using the PyCUDA module, along with evaluating performance limitations.
Table of Contents (8 chapters)

Kernel invocations with GPUArray


In the previous recipe, we saw how to invoke a kernel function using the class:

pycuda.compiler.SourceModule(kernel_source, nvcc="nvcc", options=None, other_options)

It creates a module from the CUDA source code passed as kernel_source. To compile that code, it invokes the NVIDIA nvcc compiler with the given options.

However, PyCUDA introduces the class pycuda.gpuarray.GPUArray that provides a high-level interface to perform calculations with CUDA:

class pycuda.gpuarray.GPUArray(shape, dtype, *, allocator=None, order="C")

This class works much like numpy.ndarray, except that it stores its data and performs its computations on the compute device. The shape and dtype arguments work exactly as in NumPy.
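As a point of comparison, the following minimal sketch shows what shape, dtype, and order mean on the NumPy side. It uses only NumPy, so it runs without a GPU; the variable name host is chosen here purely for illustration.

```python
import numpy as np

# shape, dtype, and order carry the same meaning for numpy.ndarray
# as they do in the GPUArray signature shown above.
host = np.zeros(shape=(2, 3), dtype=np.float32, order="C")

print(host.shape)   # (2, 3)
print(host.dtype)   # float32
print(host.nbytes)  # 24  (6 elements * 4 bytes each)
```

A GPUArray exposes the same shape, dtype, and nbytes attributes, but its buffer lives in device memory rather than host memory.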

All the arithmetic methods in GPUArray support the broadcasting of scalars. Creating a GPUArray is straightforward. One way is to create a NumPy array and convert it, as shown in the following code:

>>> import pycuda.gpuarray as gpuarray
>>> from numpy.random import...
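
The same idea can be sketched as a small self-contained script. This is an illustration under stated assumptions rather than the book's own listing: the function name scale_and_shift and its NumPy fallback are hypothetical, added here so the script still runs on a machine without PyCUDA or a CUDA device. The PyCUDA calls used are gpuarray.to_gpu, which copies a NumPy array to the device, and GPUArray.get, which copies the result back.

```python
import numpy as np


def scale_and_shift(host_array, factor=2.0, offset=1.0):
    """Compute x * factor + offset, on the GPU when one is usable.

    The function name and the CPU fallback are illustrative assumptions.
    """
    try:
        import pycuda.autoinit  # noqa: F401  (initializes a CUDA context)
        import pycuda.gpuarray as gpuarray
    except Exception:
        # PyCUDA or a CUDA device is unavailable: same arithmetic in NumPy.
        return host_array * factor + offset

    device_array = gpuarray.to_gpu(host_array)   # host -> device copy
    result = device_array * factor + offset      # scalars are broadcast
    return result.get()                          # device -> host copy


data = np.arange(4, dtype=np.float32)
print(scale_and_shift(data))
```

Both branches apply the same elementwise expression, which is the point of the GPUArray interface: the arithmetic reads like NumPy code while the data stays on the device until get() is called.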