Hi,
Is it possible to ask NVCC to utilise unused shared memory as registers?
Thanks,
Paul
No. You have to place the variable in shared memory explicitly in your code.
nvcc always handles register spills by putting the variable in local memory.
Peter
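To illustrate Peter's point, here is a minimal sketch of placing a per-thread temporary in shared memory by hand. The kernel name scaleAndOffset, the host helper runScaleAndOffset and BLOCK_SIZE are made up for this example and are not from the thread. Each thread gets its own slot in a __shared__ array indexed by threadIdx.x. The compiler may still keep a copy of the value in a register, so treat this as a way to request shared storage, not a guarantee about register usage.

```cuda
#include <cuda_runtime.h>

#define BLOCK_SIZE 128  // hypothetical block size, for illustration only

// Hypothetical kernel: each thread stores its temporary in its own
// shared-memory slot. No __syncthreads() is needed because a thread
// only ever touches scratch[threadIdx.x].
__global__ void scaleAndOffset(const float *in, float *out, int n)
{
    __shared__ float scratch[BLOCK_SIZE];  // one slot per thread in the block

    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        scratch[threadIdx.x] = in[i] * 2.0f;    // temporary placed in shared memory
        out[i] = scratch[threadIdx.x] + 1.0f;
    }
}

// Hypothetical host helper: copies n > 0 floats to the device, runs the
// kernel, copies the result back, and returns the first error seen.
cudaError_t runScaleAndOffset(const float *hIn, float *hOut, int n)
{
    float *dIn = 0, *dOut = 0;
    size_t bytes = n * sizeof(float);

    cudaError_t err = cudaMalloc((void **)&dIn, bytes);
    if (err != cudaSuccess) return err;
    err = cudaMalloc((void **)&dOut, bytes);
    if (err != cudaSuccess) { cudaFree(dIn); return err; }

    err = cudaMemcpy(dIn, hIn, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        int blocks = (n + BLOCK_SIZE - 1) / BLOCK_SIZE;
        scaleAndOffset<<<blocks, BLOCK_SIZE>>>(dIn, dOut, n);
        err = cudaGetLastError();  // reports launch-configuration errors
    }
    if (err == cudaSuccess)
        err = cudaMemcpy(hOut, dOut, bytes, cudaMemcpyDeviceToHost);

    cudaFree(dIn);
    cudaFree(dOut);
    return err;
}
```

Calling runScaleAndOffset(in, out, n) from host code computes out[i] = in[i] * 2 + 1 for every element. The static array costs BLOCK_SIZE * sizeof(float) bytes of shared memory per block, which counts against the per-block limit.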
Thanks Peter.
Is it the same when too much shared memory is requested? If I increase the sizes of some shared variables, the kernel still executes. But if I make Ns, the third execution configuration parameter of the kernel launch, too large, it doesn't: cudaGetLastError returns cudaErrorLaunchOutOfResources.
Paul
Shared memory is limited to 16 KB per block. That applies to static as well as dynamic shared memory, so static + dynamic must stay within 16 KB. If you increase the shared memory requirements of your code from, say, a few bytes to several KB, the runtime will schedule fewer and fewer blocks on each multiprocessor, until a single block no longer fits in the 16 KB. Then you get a launch failure. Check the occupancy calculator to see how the shared memory size influences block scheduling on the multiprocessor.
Peter
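A rough sketch of the budget Peter describes, for anyone who wants to try it. The names useShared, launchWithDynamicShared, blocksThatFit and STATIC_FLOATS are made up for this example. The kernel uses a static shared array plus a dynamic one sized by Ns, and the host helper returns the launch status so an oversized request can be seen failing. blocksThatFit is a deliberately simplified estimate that counts only shared memory and ignores registers, thread limits and allocation granularity, which the occupancy calculator does take into account.

```cuda
#include <cuda_runtime.h>

#define STATIC_FLOATS 1024  // hypothetical: 4 KB of static shared memory
#define THREADS 64          // hypothetical block size

// Hypothetical kernel using both kinds of shared memory.
__global__ void useShared(float *out)
{
    __shared__ float fixedBuf[STATIC_FLOATS];  // static, sized at compile time
    extern __shared__ float dynBuf[];          // dynamic, sized by Ns at launch

    fixedBuf[threadIdx.x] = (float)threadIdx.x;
    dynBuf[threadIdx.x] = fixedBuf[threadIdx.x] + 1.0f;
    out[threadIdx.x] = dynBuf[threadIdx.x];
}

// Launches one block with dynBytes of dynamic shared memory and returns
// the status. dynBytes must be at least THREADS * sizeof(float).
cudaError_t launchWithDynamicShared(size_t dynBytes)
{
    float *dOut = 0;
    cudaError_t err = cudaMalloc((void **)&dOut, THREADS * sizeof(float));
    if (err != cudaSuccess) return err;

    useShared<<<1, THREADS, dynBytes>>>(dOut);  // third parameter is Ns
    err = cudaGetLastError();                   // a request that is too large fails here
    if (err == cudaSuccess) err = cudaDeviceSynchronize();

    cudaFree(dOut);
    return err;
}

// Simplified estimate: how many blocks fit on one multiprocessor if
// shared memory were the only limit (an assumption, see the note above).
int blocksThatFit(size_t sharedPerMultiprocessor, size_t sharedPerBlock)
{
    if (sharedPerBlock == 0) return 0;  // shared memory is not the limit here
    return (int)(sharedPerMultiprocessor / sharedPerBlock);
}
```

The per-block limit of the installed device can be read from the sharedMemPerBlock field of cudaDeviceProp after cudaGetDeviceProperties. Passing that value as dynBytes pushes static + dynamic over the limit, so the launch should report an error, while a small value such as 256 bytes should succeed. With Peter's 16 KB figure, blocksThatFit(16384, 6000) gives 2 and blocksThatFit(16384, 20000) gives 0.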