As discussed in Chapter 3, CUDA Thread Programming, CUDA provides cooperative groups. Cooperative groups can be categorized by the scope of the threads they group: warp-level, block-level, and grid-level. This recipe covers grid-level cooperative groups and looks at how they handle the CUDA grid.
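To make the three levels concrete, the following minimal sketch obtains one group handle per level inside a kernel. The kernel name `group_levels_sketch` is hypothetical and used only for illustration; the group types and factory functions come from the `cooperative_groups` header.

```cuda
#include <cooperative_groups.h>

namespace cg = cooperative_groups;

// Hypothetical kernel, for illustration only: one handle per grouping level.
__global__ void group_levels_sketch()
{
    // Block-level group: all threads of the current thread block.
    cg::thread_block block = cg::this_thread_block();

    // Warp-level group: a 32-thread tile partitioned out of the block.
    cg::thread_block_tile<32> warp = cg::tiled_partition<32>(block);

    // Grid-level group: all threads of the whole grid.
    cg::grid_group grid = cg::this_grid();

    warp.sync();   // synchronizes the 32 threads of this tile
    block.sync();  // synchronizes the thread block, like __syncthreads()

    // grid.sync() is deliberately not called here: it is only valid when
    // the kernel was started with a cooperative launch.
    (void)grid;
}
```

Each handle names its own synchronization scope, so the call site shows which threads take part in the barrier.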
The most prominent benefit of cooperative groups is explicit synchronization of the target parallel object. With cooperative groups, the programmer can design an application to explicitly synchronize CUDA parallel objects such as thread blocks or grids. As with the block-level cooperative group covered in Chapter 3, CUDA Thread Programming, this makes the code more readable, because it states which CUDA threads or blocks need to synchronize.
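As a rough illustration of explicit grid synchronization, the sketch below runs two phases in one kernel and places `grid.sync()` between them, so the second phase can safely read values written by other blocks. The kernel name `two_phase_sketch`, the buffer names, and the problem size are assumptions made for this example. It assumes a device that supports cooperative launch and sizes the grid so that all blocks can be resident at the same time; depending on the toolkit version, compiling with relocatable device code (`-rdc=true`) may also be required.

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <cooperative_groups.h>

namespace cg = cooperative_groups;

// Hypothetical two-phase kernel: phase 2 reads what other blocks wrote in phase 1.
__global__ void two_phase_sketch(int *data, int *out, int n)
{
    cg::grid_group grid = cg::this_grid();
    int idx    = blockIdx.x * blockDim.x + threadIdx.x;
    int stride = gridDim.x * blockDim.x;

    // Phase 1: every thread fills part of the buffer (grid-stride loop).
    for (int i = idx; i < n; i += stride)
        data[i] = i;

    // Grid-wide barrier: all blocks finish phase 1 before any starts phase 2.
    grid.sync();

    // Phase 2: read the mirrored element, usually written by another block.
    for (int i = idx; i < n; i += stride)
        out[i] = data[n - 1 - i];
}

int main()
{
    int device = 0, supports_coop = 0;
    cudaGetDevice(&device);
    cudaDeviceGetAttribute(&supports_coop, cudaDevAttrCooperativeLaunch, device);
    if (!supports_coop) {
        printf("Cooperative launch is not supported on this device.\n");
        return 0;
    }

    int n = 1 << 20;
    int threads = 256;

    // All blocks must be co-resident, so derive the grid size from occupancy.
    int blocks_per_sm = 0;
    cudaOccupancyMaxActiveBlocksPerMultiprocessor(&blocks_per_sm,
                                                  two_phase_sketch, threads, 0);
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, device);
    int blocks = blocks_per_sm * prop.multiProcessorCount;

    int *data = nullptr, *out = nullptr;
    cudaMalloc(&data, n * sizeof(int));
    cudaMalloc(&out, n * sizeof(int));

    // A cooperative launch takes the kernel arguments as an array of pointers.
    void *args[] = { &data, &out, &n };
    cudaError_t err = cudaLaunchCooperativeKernel((void *)two_phase_sketch,
                                                  dim3(blocks), dim3(threads),
                                                  args, 0, 0);
    if (err != cudaSuccess) {
        printf("Launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    cudaDeviceSynchronize();

    int first = -1;
    cudaMemcpy(&first, out, sizeof(int), cudaMemcpyDeviceToHost);
    printf("out[0] = %d (expected %d)\n", first, n - 1);

    cudaFree(data);
    cudaFree(out);
    return 0;
}
```

Without the grid-level barrier, the same logic would typically be split into two kernel launches, with the launch boundary acting as the synchronization point.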