GPU-Accelerated Computing with Python 3 and CUDA
This chapter introduced CuPy as the GPU equivalent of NumPy and SciPy, enabling high-level yet performant code for n-dimensional array operations. As with NumPy, using CuPy effectively demands vectorized thinking: expressing work as whole-array operations rather than element-by-element Python loops.
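As a reminder of what vectorized thinking looks like, here is a minimal sketch that normalizes every row of a matrix with whole-array operations. The function name `normalize_rows` and the `xp` alias are illustrative choices, not names from the chapter, and the fallback to NumPy is an assumption added so the snippet also runs on a machine without a usable GPU.

```python
import numpy as np

try:
    import cupy as cp
    # getDeviceCount raises (or returns 0) when no CUDA device is usable.
    if cp.cuda.runtime.getDeviceCount() < 1:
        raise RuntimeError("no CUDA device found")
    xp = cp
except Exception:
    xp = np  # assumption: fall back to NumPy so the sketch still runs


def normalize_rows(a):
    """Scale each row to unit L2 norm using array operations only."""
    # keepdims=True keeps shape (n_rows, 1) so the division broadcasts per row.
    norms = xp.sqrt((a * a).sum(axis=1, keepdims=True))
    return a / norms


a = xp.arange(1.0, 13.0).reshape(3, 4)
unit = normalize_rows(a)
row_norms = xp.sqrt((unit * unit).sum(axis=1))
```

The same function body works with either library because it only uses calls that NumPy and CuPy share, which is why the array module is passed around as `xp`.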
As a library rather than a framework, CuPy integrates seamlessly with NumPy and Numba-CUDA, offering flexibility in code design. For standard operations such as matmul, CuPy's optimized implementations make it the preferred choice. For custom operations that don't fit the ufunc model, however, Numba-CUDA kernels can deliver superior performance. For small-scale problems, NumPy can outperform CuPy, because host-to-device transfer overheads dominate the runtime and the workload is too small to keep the device occupied.
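One way to act on the small-problem caveat is to dispatch by problem size. The sketch below is an illustration under stated assumptions, not a recipe from the chapter: `matmul_auto` and `SIZE_THRESHOLD` are hypothetical names, and the threshold value is a placeholder that would have to be measured on the target hardware.

```python
import numpy as np

try:
    import cupy as cp
    HAVE_GPU = cp.cuda.runtime.getDeviceCount() > 0
except Exception:
    cp = None
    HAVE_GPU = False

# Hypothetical cut-over point; benchmark on your own hardware to choose it.
SIZE_THRESHOLD = 1_000_000


def matmul_auto(a, b, threshold=SIZE_THRESHOLD):
    """Multiply two 2-D host arrays, using the GPU only for large problems."""
    # For (m, k) @ (k, n), a.size * b.shape[-1] equals m * k * n,
    # a rough proxy for the amount of arithmetic involved.
    work = a.size * b.shape[-1]
    if HAVE_GPU and work >= threshold:
        d_a = cp.asarray(a)            # host -> device copy
        d_b = cp.asarray(b)
        return cp.asnumpy(d_a @ d_b)   # device -> host copy of the result
    return a @ b                       # small problem: stay on the CPU


small = np.eye(3)
result = matmul_auto(small, small)
```

The two `cp.asarray` calls and the final `cp.asnumpy` are the transfers that make the GPU path unattractive for small inputs; keeping data on the device across several operations amortizes that cost.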
CuPy's high-level abstraction does not negate the importance of GPU fundamentals such as memory layout and data access patterns. The chapter concluded with practical performance tips.
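Memory layout can be inspected directly on an array. The following sketch, which assumes a float32 matrix and falls back to NumPy when no GPU is available, shows that a transpose is a strided view rather than a copy, and that `ascontiguousarray` produces a row-major copy when contiguous access is wanted.

```python
import numpy as np

try:
    import cupy as cp
    if cp.cuda.runtime.getDeviceCount() < 1:
        raise RuntimeError("no CUDA device found")
    xp = cp
except Exception:
    xp = np  # assumption: fall back to NumPy so the sketch still runs

a = xp.zeros((1024, 512), dtype=xp.float32)  # row-major (C-contiguous)
t = a.T                                      # a view: no data is moved
t_c = xp.ascontiguousarray(t)                # explicit row-major copy

print(a.strides, a.flags.c_contiguous)
print(t.strides, t.flags.c_contiguous)
print(t_c.strides, t_c.flags.c_contiguous)
```

Strides are reported in bytes, so in `t` consecutive elements along the last axis sit 2048 bytes apart, which is the kind of access pattern that tends to hurt memory throughput on a GPU.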
In the next chapter, the focus shifts to GPU-accelerated dataframes and machine learning using RAPIDS.