Hi. There is only one CUDA sample that illustrates grid-level synchronisation across multiple GPUs using cooperative_groups::grid_group: conjugateGradientMultiDeviceCG. As far as I understand, it targets multiple GPUs on a single node. On the other hand, the CUDA documentation mentions multi_grid_group. Does that work for multiple GPUs across multiple nodes, or are we still restricted to a single node?
Cooperative groups are restricted to a single node (at this time).