Basic concepts - Kernel, grids, blocks and threads

Hi community,

I’m new to CUDA programming, and I would like to know about some very basic concepts.

In the stream processing model of programming, we define kernels, which are functions that run in parallel over the elements of a stream (as described in Wikipedia: Stream processing).

OK, the same idea applies to shaders in Cg: when we declare a pixel shader, it creates N kernel instances, one for each of the N pixels of the image being processed.

In CUDA, I see that kernels have grids, grids have blocks, and blocks have threads. From what I understood from the CUDA Programming Guide, when we call a kernel from CUDA, it won’t execute N kernels; it will have N blocks (I don’t know if these should be threads, please correct me if I’m wrong) executing in parallel.
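To make my current picture concrete, here is a minimal sketch of how I think a launch works. The kernel name `fillIndex`, the array, and the sizes are made up for illustration; the `<<<blocks, threads>>>` launch syntax and the built-in variables `blockIdx`, `blockDim` and `threadIdx` are as I read them in the Programming Guide. Please correct me if this is the wrong way to look at it.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: every thread computes its own global index
// from its block and thread coordinates and writes it to the output.
__global__ void fillIndex(int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {            // the last block may contain surplus threads
        out[i] = i;
    }
}

int main()
{
    const int n = 10;                 // number of elements (illustrative)
    const int threadsPerBlock = 4;    // threads in each block (illustrative)
    // Round up so that every element is covered: 3 blocks here.
    const int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;

    int *dOut = nullptr;
    if (cudaMalloc((void **)&dOut, n * sizeof(int)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    // One launch creates one grid of `blocks` blocks,
    // each holding `threadsPerBlock` threads (12 threads in total).
    fillIndex<<<blocks, threadsPerBlock>>>(dOut, n);
    if (cudaGetLastError() != cudaSuccess) {
        fprintf(stderr, "kernel launch failed\n");
        cudaFree(dOut);
        return 1;
    }

    int hOut[n];
    // cudaMemcpy waits for the kernel to finish before copying.
    if (cudaMemcpy(hOut, dOut, n * sizeof(int),
                   cudaMemcpyDeviceToHost) != cudaSuccess) {
        fprintf(stderr, "cudaMemcpy failed\n");
        cudaFree(dOut);
        return 1;
    }

    for (int i = 0; i < n; ++i) {
        printf("out[%d] = %d\n", i, hOut[i]);
    }

    cudaFree(dOut);
    return 0;
}
```

If this sketch is right, the N parallel "things" are the threads, and the blocks are only a way of grouping them.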

Also, threads aren’t necessarily all going to execute the same code (or will they?). If they are supposed to execute the same code (I know they can branch with ifs), then the threads are supposed to be the kernels from stream processing, right? If that’s true, then blocks are just an optimization/feature for devices that are less capable of parallelism.
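By "branch with ifs" I mean something like the sketch below: all threads run the same kernel function, but each one can take a different path depending on its index. The kernel name `evenOdd` and the values it writes are hypothetical and only there to illustrate the question.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: same code for every thread, but the path taken
// depends on whether the thread's global index is even or odd.
__global__ void evenOdd(int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) {
        return;             // surplus threads do nothing
    }
    if (i % 2 == 0) {
        out[i] = i * 10;    // even-indexed threads take this branch
    } else {
        out[i] = -i;        // odd-indexed threads take this one
    }
}

int main()
{
    const int n = 8;                  // illustrative size
    const int threadsPerBlock = 4;    // illustrative block size
    const int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;

    int *dOut = nullptr;
    if (cudaMalloc((void **)&dOut, n * sizeof(int)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    evenOdd<<<blocks, threadsPerBlock>>>(dOut, n);
    if (cudaGetLastError() != cudaSuccess) {
        fprintf(stderr, "kernel launch failed\n");
        cudaFree(dOut);
        return 1;
    }

    int hOut[n];
    if (cudaMemcpy(hOut, dOut, n * sizeof(int),
                   cudaMemcpyDeviceToHost) != cudaSuccess) {
        fprintf(stderr, "cudaMemcpy failed\n");
        cudaFree(dOut);
        return 1;
    }

    for (int i = 0; i < n; ++i) {
        printf("out[%d] = %d\n", i, hOut[i]);
    }

    cudaFree(dOut);
    return 0;
}
```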

And one last question: can a kernel (CUDA kernel) have multiple grids?

Thanks in advance.
Edison Gustavo Muenz