I'm a CUDA beginner, and I don't yet understand the relationship between a device and its grid(s). If each grid uses its own shared memory, and grids communicate with each other through global memory, is it necessary to partition a device into many grids? (Does anyone know of an example?)
In other words, is the relationship between a device and a grid always one-to-one?