Book Image

Python High Performance, Second Edition - Second Edition

By : Dr. Gabriele Lanaro
Book Image

Python High Performance, Second Edition - Second Edition

By: Dr. Gabriele Lanaro

Overview of this book

Python is a versatile language that has found applications in many industries. The clean syntax, rich standard library, and vast selection of third-party libraries make Python a wildly popular language. Python High Performance is a practical guide that shows how to leverage the power of both native and third-party Python libraries to build robust applications. The book explains how to use various profilers to find performance bottlenecks and apply the correct algorithm to fix them. The reader will learn how to effectively use NumPy and Cython to speed up numerical code. The book explains concepts of concurrent programming and how to implement robust and responsive applications using Reactive programming. Readers will learn how to write code for parallel architectures using Tensorflow and Theano, and use a cluster of computers for large-scale computations using technologies such as Dask and PySpark. By the end of the book, readers will have learned to achieve performance and scale from their Python applications.
Table of Contents (10 chapters)

Summary

Distributed processing can be used to implement algorithms capable of handling massive datasets by distributing smaller tasks across a cluster of computers. Over the years, many software packages, such as Apache Hadoop, have been developed to implement performant and reliable execution of distributed software.

In this chapter, we learned about the architecture and usage of Python packages, such as Dask and PySpark, which provide powerful APIs to design and execute programs capable of scaling to hundreds of machines. We also briefly looked at MPI, a library that has been used for decades to distribute work on supercomputers designed for academic research.

Throughout this book, we explored several techniques to improve the performance of our program, and to increase the speed of our programs and the size of the datasets we are able to process. In the next chapter, we will describe the strategies and best practices...