Dedicated error checking is one of the basic practices behind high-quality software. CUDA functions report errors by returning a status code from each call. Not only the CUDA runtime APIs, but also kernel launches and CUDA library calls follow this rule. Therefore, checking each returned status is the starting point for identifying errors in CUDA execution. For example, let's assume that we have allocated global memory using the cudaMalloc() function, as follows:
cudaMalloc((void**)&ptr, byte_size);
What if the device has insufficient free global memory to satisfy the allocation? In this case, the cudaMalloc() function returns an error code reporting an out-of-memory condition. Errors that are triggered by kernel launches can be captured using cudaGetLastError(), which returns the last error recorded by the CUDA runtime and resets the internal error flag back to cudaSuccess.
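The two checks described above can be sketched together as follows. This is a minimal illustration, not code from the text: the CUDA_CHECK macro name is our own convention (any wrapper that inspects the returned cudaError_t works the same way), and dummy_kernel is a hypothetical placeholder kernel.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Illustrative helper macro (the name CUDA_CHECK is our own convention):
// evaluates a CUDA call and, on failure, prints the human-readable
// error string with the source location, then aborts.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err = (call);                                     \
        if (err != cudaSuccess) {                                     \
            fprintf(stderr, "CUDA error: %s at %s:%d\n",              \
                    cudaGetErrorString(err), __FILE__, __LINE__);     \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Hypothetical placeholder kernel for the launch-error example.
__global__ void dummy_kernel() {}

int main() {
    float *ptr = nullptr;
    size_t byte_size = 1024 * sizeof(float);

    // cudaMalloc() returns cudaErrorMemoryAllocation when the device
    // runs out of memory; the macro surfaces that immediately.
    CUDA_CHECK(cudaMalloc((void**)&ptr, byte_size));

    // A kernel launch itself returns no status, so we query the
    // runtime's error flag right after launching.
    dummy_kernel<<<1, 1>>>();
    CUDA_CHECK(cudaGetLastError());       // catches launch-time errors
    CUDA_CHECK(cudaDeviceSynchronize());  // catches errors raised during execution

    CUDA_CHECK(cudaFree(ptr));
    return 0;
}
```

Note that cudaGetLastError() only reports errors that have occurred up to the point it is called; because kernels run asynchronously, an explicit synchronization is needed to observe errors raised while the kernel executes.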