Julia 1.0 High Performance - Second Edition

By: Avik Sengupta

Overview of this book

Julia is a high-level, high-performance dynamic programming language for numerical computing. If you want to understand how to avoid bottlenecks and design your programs for the highest possible performance, then this book is for you. The book starts with how Julia uses type information to achieve its performance goals, and how to use multiple dispatch to help the compiler emit high-performance machine code. After that, you will learn how to analyze Julia programs and identify issues with time and memory consumption. You will learn how to use Julia's typing facilities accurately to write high-performance code, and how the Julia compiler uses type information to create fast machine code. Moving ahead, you'll master design constraints, learn how to use the power of the GPU in your Julia code, and compile Julia code directly to the GPU. Then, you'll learn how tasks and asynchronous IO help you create responsive programs and how to use shared-memory multithreading in Julia. Toward the end, you will get a flavor of Julia's distributed computing capabilities and how to run Julia programs on a large distributed cluster. By the end of this book, you will be able to build large-scale, high-performance Julia applications, design systems with a focus on speed, and improve the performance of existing programs.

SIMD parallelization (AVX2, AVX512)

Single Instruction, Multiple Data (SIMD) is a method of parallelizing computation within the CPU, whereby a single operation is performed on many data elements simultaneously. Modern CPU architectures contain instruction sets that can do this, operating on many variables at once.
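As a minimal sketch of what this looks like from Julia, the loop below sums a vector using the `@simd` and `@inbounds` macros from Base. The function name `simd_sum` and the test data are illustrative assumptions, not taken from the book. Whether vector instructions are actually emitted depends on the CPU and the compiler's analysis, so treat this as an example of a loop that is eligible for SIMD rather than a guarantee.

```julia
# Sum the elements of a vector with an explicit loop.
# `simd_sum` is a hypothetical name chosen for this illustration.
# @inbounds removes bounds checks, and @simd allows the compiler to
# reorder the floating-point additions; both make vectorization easier.
function simd_sum(xs::Vector{Float64})
    total = 0.0
    @inbounds @simd for i in eachindex(xs)
        total += xs[i]
    end
    return total
end

xs = collect(1.0:1000.0)   # the values 1.0, 2.0, ..., 1000.0
println(simd_sum(xs))      # prints 500500.0
```

In a REPL session, inspecting the output of `@code_llvm simd_sum(xs)` for vector types such as `<4 x double>` is one way to check whether the loop was vectorized on a given machine.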

On Intel processors, these instructions have been introduced progressively under names such as SSE, AVX2, and AVX512. Each successive implementation adds functionality and also allows operations on wider data. SIMD was first implemented in older Intel processors under the name SSE, which went through multiple versions. Most Intel and AMD processors from the last decade implement the AVX2 instruction set, which provides 256 bits of parallelism. More recent processors have an upgraded instruction set called AVX512, which, as the name suggests, can operate...
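To make the 256-bit figure concrete, the following sketch computes how many values of a given element type fit into one 256-bit register. The helper `lanes` is a hypothetical name introduced for this illustration; it is plain arithmetic on `sizeof` and does not query the hardware.

```julia
# Number of elements of type T that fit in a register of `register_bits` bits.
# `lanes` is a hypothetical helper; sizeof(T) returns the size of T in bytes.
lanes(register_bits::Integer, ::Type{T}) where {T} = div(register_bits, 8 * sizeof(T))

for T in (Float64, Float32, Int16)
    println(T, ": ", lanes(256, T), " elements per 256-bit register")
end
# prints:
# Float64: 4 elements per 256-bit register
# Float32: 8 elements per 256-bit register
# Int16: 16 elements per 256-bit register
```

Under this simple model, a single 256-bit instruction can process four `Float64` values at once, which is why narrower element types can benefit more from the same register width.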