Distributed Computing with Python

Overview of this book

CPU-intensive data processing tasks have become crucial considering the complexity of the various big data applications that are used today. Reducing the CPU utilization per process is very important to improve the overall speed of applications. This book will teach you how to perform parallel execution of computations by distributing them across multiple processors in a single machine, thus improving the overall performance of a big data processing task. We will cover synchronous and asynchronous models, shared memory and file systems, communication between various processes, synchronization, and more.

Where to go next


Building small- to medium-sized distributed applications in Python, as we saw, is not particularly difficult. Once a distributed system grows larger, however, the design and development effort it requires tends to grow super-linearly.

In these cases, a more solid foundation in the theory of distributed systems becomes necessary. There are a number of resources available both online and offline. Most large universities offer courses on this subject, and a number of them are freely available online.

One good example is the ETH course on Principles of Distributed Computing (http://dcg.ethz.ch/lectures/podc_allstars/index.html), which covers a number of fundamentals, such as synchronization, consensus, and eventual consistency (along with the famous CAP theorem).

Having said that, beginners should not feel discouraged. Even a few lines of code built on a simple framework, such as Python-RQ, can yield an astounding gain in performance!