“Grid” is that delicious stuff they eat for breakfast in Alabama, US … Oh no, wait, that is Grits …
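But seriously, from the little I know about CUDA so far, the Grid size (number of Blocks) together with the Block size (threads per Block) determines the total number of threads your kernel runs with: total threads = number of Blocks × threads per Block. The Grid concept is a way to have more threads in a kernel than the limited number that fit inside one Block (512 or 1024, depending on the GPU generation).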
From my current understanding of the architecture, the Block size is determined by how much data sharing and synchronization your algorithm needs. The shared memory built into each SM is what the threads of one Block use to exchange data, and threads can only synchronize with each other (via __syncthreads()) inside their own Block. After you select an appropriate size for the Block, the size of the Grid (number of Blocks) is determined by how much data your kernel needs to process.
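To make that concrete, here is a minimal sketch of how I would pick the Grid size once the Block size is fixed. The kernel scale, the helpers blocksFor and launchScale, and the choice of 256 threads per Block are all made up for illustration, so treat them as assumptions and not as anything from the course or the Programming Guide.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: each thread scales one element of the array.
__global__ void scale(float *data, int n, float factor)
{
    // Global index = this Block's offset in the Grid + this thread's offset in its Block.
    int i = blockIdx.x * blockDim.x + threadIdx.x;

    // The last Block may contain threads that fall past the end of the data.
    if (i < n)
        data[i] *= factor;
}

// Hypothetical helper: number of Blocks needed to cover n elements (rounds up).
int blocksFor(int n, int threadsPerBlock)
{
    return (n + threadsPerBlock - 1) / threadsPerBlock;
}

// Hypothetical helper: launches the kernel on a device array of n floats.
cudaError_t launchScale(float *d_data, int n, float factor)
{
    const int threadsPerBlock = 256;                    // assumed Block size
    const int blocks = blocksFor(n, threadsPerBlock);   // Grid size follows from the data size

    scale<<<blocks, threadsPerBlock>>>(d_data, n, factor);
    return cudaDeviceSynchronize();                     // wait for the kernel and report errors
}
```

So the Block size stays fixed and only the number of Blocks grows with the data. The i < n check is there because the Grid is rounded up to whole Blocks, which usually leaves a few extra threads in the last one.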
Just have a look at the course you referenced earlier.
It’s a really good introduction (and beyond) to CUDA.
The Programming Guide from the SDK is also a good starting point.