I am new to CUDA and not really an expert in computing either. I was reading the programming guide and couldn't figure something out. Could someone help me with this?
From the programming point of view, threads in CUDA are organised into blocks, and blocks into grids, while from a hardware point of view, threads are managed in groups of 32 called warps. When a block is given to a multiprocessor to execute, the multiprocessor splits the block into warps in order to manage them.
So my question is: what happens when the number of threads in a block is not a multiple of 32? Say a certain block has 50 threads. Would it be split into two warps, one with 32 threads and the other with only 18 threads? Or does the second warp take another 14 threads from the next block…?
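To make the first possibility concrete, here is the arithmetic I have in mind, as a rough sketch in plain host-side C++. It assumes a warp never spans two blocks, which is exactly the part I am unsure about. The names `kWarpSize`, `warps_per_block` and `threads_in_last_warp` are my own, not part of the CUDA API.

```cpp
// Warp size of 32, as described in the programming guide.
// All names here are hypothetical helpers for illustration only.
const int kWarpSize = 32;

// Number of warps one block would need, ASSUMING warps never span blocks:
// this is the block size divided by 32, rounded up.
inline int warps_per_block(int threads_per_block) {
    return (threads_per_block + kWarpSize - 1) / kWarpSize;
}

// Number of threads that would sit in the last warp of the block
// under the same assumption (a full 32 when the size divides evenly).
inline int threads_in_last_warp(int threads_per_block) {
    int remainder = threads_per_block % kWarpSize;
    return remainder == 0 ? kWarpSize : remainder;
}
```

Under that assumption, a block of 50 threads would give `warps_per_block(50) == 2` and `threads_in_last_warp(50) == 18`, leaving 14 lanes of the second warp without a thread.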
Sorry if it's a silly question. Please cut a newbie some slack…