Learn CUDA Programming

By : Jaegeun Han, Bharatkumar Sharma

Overview of this book

Compute Unified Device Architecture (CUDA) is NVIDIA's GPU computing platform and application programming interface. It's designed to work with programming languages such as C, C++, and Python. With CUDA, you can leverage a GPU's parallel computing power for a range of high-performance computing applications in the fields of science, healthcare, and deep learning.

Learn CUDA Programming will help you learn GPU parallel programming and understand its modern applications. In this book, you'll discover CUDA programming approaches for modern GPU architectures. You'll not only be guided through GPU features, tools, and APIs, but you'll also learn how to analyze performance with sample parallel programming algorithms. This book will help you optimize the performance of your apps by giving insights into CUDA programming platforms with various libraries, compiler directives (OpenACC), and other languages. As you progress, you'll learn how additional computing power can be generated using multiple GPUs in a box or in multiple boxes. Finally, you'll explore how CUDA accelerates deep learning algorithms, including convolutional neural networks (CNNs) and recurrent neural networks (RNNs).

By the end of this CUDA book, you'll be equipped with the skills you need to integrate the power of GPU computing in your applications.

Read-only data/cache

As the name suggests, the read-only cache is suited to data that is read but never modified during kernel execution. The cache is optimized for this access pattern and, depending on the GPU architecture, it offloads the other caches and reduces the pressure on them, resulting in better performance. In this section, we will show how to make use of the read-only cache with the help of an image processing code sample that performs image resizing.

Read-only data is visible to all of the threads in a grid on the GPU. The data is marked as read-only for the GPU, which means any modification to it from within a kernel results in undefined behavior. The CPU, on the other hand, retains both read and write access to this data.
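As a minimal sketch of how data is marked as read-only for the GPU (this is a hypothetical scaling kernel, not the book's image-resizing sample): qualifying a kernel parameter as `const ... __restrict__` promises the compiler that the data is never written through any alias inside the kernel, which allows loads to be routed through the read-only cache; the `__ldg()` intrinsic requests that path explicitly.

```cuda
#include <cstdio>

// Hypothetical example: scale an input array by a constant factor.
// "const float* __restrict__" tells the compiler that "in" is read-only
// and unaliased for the lifetime of the kernel, so its loads may go
// through the read-only (texture) cache on architectures that support it.
__global__ void scale(float *out, const float* __restrict__ in,
                      float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        // __ldg() explicitly loads through the read-only cache path;
        // with the qualifiers above, the compiler usually does this anyway.
        out[i] = __ldg(&in[i]) * factor;
    }
}

int main()
{
    const int n = 1024;
    float *in, *out;
    cudaMallocManaged(&in,  n * sizeof(float));
    cudaMallocManaged(&out, n * sizeof(float));
    for (int i = 0; i < n; ++i) in[i] = float(i);

    // The host (CPU) freely writes "in" above; only the kernel must
    // treat it as read-only.
    scale<<<(n + 255) / 256, 256>>>(out, in, 2.0f, n);
    cudaDeviceSynchronize();

    printf("out[10] = %f\n", out[10]);

    cudaFree(in);
    cudaFree(out);
    return 0;
}
```

Note that writing through `in` inside the kernel after these qualifiers would be undefined behavior, which is exactly the contract described above.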

Traditionally, this cache is also referred to as the texture cache. While the user...