Distributed Computing with Python

Overview of this book

CPU-intensive data processing has become crucial given the complexity of today's big data applications. Reducing the CPU utilization per process is important for improving the overall speed of applications. This book teaches you how to run computations in parallel by distributing them across multiple processors in a single machine, thus improving the overall performance of a big data processing task. We will cover synchronous and asynchronous models, shared memory and file systems, communication between processes, synchronization, and more.

Chapter 6. Python on an HPC Cluster

In this chapter, we look at yet another way of deploying our distributed Python applications: using a High Performance Computing (HPC) cluster (also called a supercomputer), such as the multimillion-dollar (or multimillion-euro), room-filling machines commonly found in evil overlord lairs.

Real HPC clusters are typically found in universities and national labs and are not what a typical start-up or small company would operate, mostly due to their cost. They are generally quite large systems with possibly hundreds of thousands of CPU cores spread over thousands of machines.

Often, the maximum size of the HPC cluster a center can afford is determined by the amount of electricity that can be delivered to the premises; HPC systems that draw a few megawatts are not at all uncommon. The system I work with, for instance, has over 160,000 cores across 7,000 nodes and draws about 4 megawatts of power!
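
To get a feel for the scale of those numbers, here is a minimal back-of-the-envelope sketch. It uses only the three figures quoted above, all of which are approximate, and the function name scale_estimates is made up for this illustration. The text does not say what the power figure covers, so the per-node and per-core values are plain averages over the total, not measurements of what a CPU actually draws.

```python
# Figures quoted in the text; both "over 160,000 cores" and
# "about 4 megawatts" are approximate, so the results are rough averages.
CORES = 160_000
NODES = 7_000
POWER_WATTS = 4_000_000  # 4 MW


def scale_estimates(cores, nodes, power_watts):
    """Return average cores per node, watts per node, and watts per core."""
    return {
        "cores_per_node": cores / nodes,
        "watts_per_node": power_watts / nodes,
        "watts_per_core": power_watts / cores,
    }


estimates = scale_estimates(CORES, NODES, POWER_WATTS)
for name, value in estimates.items():
    print(f"{name}: {value:.1f}")
```

On these figures, the machine averages roughly 23 cores per node, a little under 600 watts per node, and about 25 watts per core.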

Developers...