GPU Programming with C++ and CUDA

By: Paulo Motta
Overview of this book

Written by Paulo Motta, a senior researcher with decades of experience, this comprehensive GPU programming book is an essential guide for leveraging the power of parallelism to accelerate your computations. The first section introduces the concept of parallelism and provides practical advice on how to think about and utilize it effectively. Starting with a basic GPU program, you then gain hands-on experience in managing the device. This foundational knowledge is then expanded by parallelizing the program to illustrate how GPUs enhance performance. The second section explores GPU architecture and implementation strategies for parallel algorithms, and offers practical insights into optimizing resource usage for efficient execution. In the final section, you will explore advanced topics such as utilizing CUDA streams. You will also learn how to package and distribute GPU-accelerated libraries for the Python ecosystem, extending the reach and impact of your work. Combining expert insight with real-world problem solving, this book is a valuable resource for developers and researchers aiming to harness the full potential of GPU computing. The blend of theoretical foundations, practical programming techniques, and advanced optimization strategies it offers is sure to help you succeed in the fast-evolving field of GPU programming.
Table of Contents (17 chapters)
1. Understanding Where We Are Heading
6. Bring It On!
10. Moving Forward
15. Other Books You May Enjoy
16. Index

Analyzing performance

We’ve now seen two ways to make our GPU code available to Python. Using ctypes is very straightforward, despite the awkward way in which the functions to be called must be declared. Creating an extension, on the other hand, takes a little more work but offers a much cleaner interface to the end user.
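To make the ctypes declaration style concrete, here is a minimal sketch. Because the book's GPU library is not available here, it uses `ctypes.pythonapi` (the C API of the running interpreter) as a stand-in for a library that would normally be loaded with `ctypes.CDLL`; the choice of `Py_GetVersion` is purely illustrative and is not taken from the book.

```python
import ctypes
import sys

# Assumption: ctypes.pythonapi stands in for a GPU shared library that would
# normally be loaded with ctypes.CDLL("<path to the library>").
lib = ctypes.pythonapi

# The awkward part: each C function's signature is declared by hand.
# C prototype: const char *Py_GetVersion(void);
lib.Py_GetVersion.argtypes = []
lib.Py_GetVersion.restype = ctypes.c_char_p

# Once declared, the function is called like any Python callable.
version = lib.Py_GetVersion().decode()
print(version)
```

Without the `restype` line, ctypes would assume the function returns a C `int`, which is why every function exposed this way needs its signature spelled out before use.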

However, style is not the only thing that counts here: the extension implementation that did not use numpy arrays involved extensive data copying. The question is: how much does that copying affect the overall performance?
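One way to get a feel for the cost of copying is to time it in isolation. The sketch below is not the book's benchmark: it uses only the standard library, with an `array.array` buffer standing in for a numpy array, and compares building a C float array from a Python list (an element-by-element copy) with wrapping an existing contiguous buffer (no copy). The size `N` and the helper `time_it` are assumptions made for illustration.

```python
import array
import ctypes
import time

N = 200_000  # illustrative size, not taken from the book


def time_it(fn, repeats=5):
    """Return the best wall-clock time of fn over a few repeats, in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


values = [float(i) for i in range(N)]
FloatArray = ctypes.c_float * N


def copy_from_list():
    # Every Python float is converted and copied into a new C array.
    return FloatArray(*values)


# Contiguous C floats; a stand-in for the memory behind a numpy array.
buffer = array.array("f", values)


def wrap_buffer():
    # Zero-copy: the ctypes array shares memory with the existing buffer.
    return FloatArray.from_buffer(buffer)


t_copy = time_it(copy_from_list)
t_wrap = time_it(wrap_buffer)
print(f"copy from list: {t_copy * 1e3:.3f} ms")
print(f"wrap buffer:    {t_wrap * 1e3:.3f} ms")
```

The absolute numbers depend on the machine, so none are quoted here. The point of the comparison is that the copying path grows with the number of elements, while wrapping an existing buffer does not touch the data at all.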

Figure 9.2: Execution time for each type of Python integration
