GPU-Accelerated Computing with Python 3 and CUDA
To illustrate the runtime performance of our GPU implementation of an MD simulator, we ran it for three system sizes (100, 1,000, and 10,000 atoms) and compared it to two CPU implementations: a serial version and a multithreaded version that uses prange.
The GPU version was run on an A100, a data center GPU, and additionally on an RTX 2080 Ti, a consumer GPU. The following figure shows the elapsed runtime. Note that the y axis is logarithmically scaled and that all runs use float64 precision.

Figure 13.4: MD simulation benchmark runs on different systems for 1,000 time steps
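A benchmark of this kind can be sketched with a small timing harness. The following is a minimal illustration, not the code used to produce the figure: the function names (pair_sum, make_positions, benchmark, speedups) are hypothetical, and pair_sum is only a stand-in O(N^2) workload in pure Python rather than a real MD step. It records the best wall-clock time per system size and computes the speedup of one implementation relative to another.

```python
import random
import time


def pair_sum(positions):
    """Stand-in O(N^2) workload (assumption): sum of squared pair distances."""
    total = 0.0
    n = len(positions)
    for i in range(n - 1):
        xi, yi, zi = positions[i]
        for j in range(i + 1, n):
            xj, yj, zj = positions[j]
            total += (xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2
    return total


def make_positions(n_atoms, seed=0):
    """Random 3D coordinates in a unit box (Python floats are float64)."""
    rng = random.Random(seed)
    return [(rng.random(), rng.random(), rng.random()) for _ in range(n_atoms)]


def benchmark(fn, sizes, repeats=3):
    """Return {n_atoms: best wall-clock seconds over `repeats` runs}."""
    results = {}
    for n in sizes:
        positions = make_positions(n)
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            fn(positions)
            best = min(best, time.perf_counter() - start)
        results[n] = best
    return results


def speedups(baseline, candidate):
    """Per-size speedup of `candidate` relative to `baseline` timings."""
    return {n: baseline[n] / candidate[n] for n in baseline if n in candidate}


# Small sizes keep the pure-Python stand-in quick to run;
# the benchmark in the text uses 100, 1,000, and 10,000 atoms.
timings = benchmark(pair_sum, sizes=(50, 100, 200))
for n, seconds in timings.items():
    print(f"{n:>6} atoms: {seconds:.6f} s")
```

To compare two implementations, run benchmark once per implementation and pass both result dictionaries to speedups. When timing GPU code this way, the device generally has to be synchronized before the clock is stopped, because kernel launches return before the work has finished.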
For small systems, GPU computing offers few benefits due to its overhead and GPU underutilization. For large systems, the performance gains with the GPU are significant. The RTX GPU was up to 150x faster than the serial code, and the A100 GPU was nearly 400x faster. The significant...