Distributed Computing with Python

Overview of this book

CPU-intensive data processing tasks have become crucial considering the complexity of the various big data applications that are used today. Reducing the CPU utilization per process is very important to improve the overall speed of applications. This book will teach you how to perform parallel execution of computations by distributing them across multiple processors in a single machine, thus improving the overall performance of a big data processing task. We will cover synchronous and asynchronous models, shared memory and file systems, communication between various processes, synchronization, and more.

The cloud and the HPC world
Chapter 5, Python in the Cloud, gave you a quick tour of the cloud in general and Amazon Web Services in particular. This is a hot topic nowadays, and the reason for this is simple: with relatively little upfront investment and virtually no wait, one can rent a few virtual machines together with, optionally, a database server and a data store. If the application needs more power, one can simply scale up the underlying infrastructure with the press of a button (and the swipe of a credit card).

Things, unfortunately, are never as simple as vendor brochures like to depict, especially when outsourcing a critical piece of infrastructure to a third party whose interests might not be perfectly aligned with ours.

A solid piece of advice is to always plan for the worst and keep automatic backups of the whole application and its software stack locally (or at the very least, on a separate provider). Ideally (but not that practically), one would have a scaled-down, but up...