GPU-Accelerated Computing with Python 3 and CUDA
Numba-CUDA supports multi-GPU programming through low-level APIs. Unlike with JAX or CuPy, we must explicitly handle the device context, memory allocation, data transfer, and kernel launches for each GPU. This offers fine-grained control over GPU processing and makes it easier to implement non-standard parallelism patterns.
Some generic code for targeting a specific GPU with Numba-CUDA looks like this:
from numba import cuda

with cuda.gpus[gpu_id]:
    data_device = cuda.to_device(data)
Here, numba.cuda.gpus is a list-like object holding all the CUDA-capable GPUs that Numba detects on the system. cuda.gpus[gpu_id] selects a particular GPU by its ID (e.g., 0 for the first GPU, 1 for the second, etc.). The with statement creates a context in which the selected GPU is the active device for all CUDA operations inside the block. Inside this context, cuda.to_device(data) transfers the data from the CPU to the memory of the currently active GPU. This ensures that...