When launching my CUDA kernel, I get an "invalid configuration argument" error. I'm using 256 threads per block and 10 blocks per grid, so nowhere near the limits, but I request 143360 bytes of shared memory as the third argument in the kernel invocation. Is there some limit on this value that I haven't found that could be causing this error message?
Shared memory is limited to 16 KB, so a request of 143360 bytes is far over the limit.
OK, so I'm guessing the limit is about 16000 bytes, because there is 16 KB of shared memory per multiprocessor?
Yes, though it is actually 2**14 (= 16384) bytes of shared memory, and up to 256 bytes of that can be reserved to hold kernel parameters.
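If it helps, here is a rough sketch of how you could check the limit at runtime instead of hard-coding it. It queries `sharedMemPerBlock` with `cudaGetDeviceProperties` and compares the request against it before launching. The kernel `scale`, the helper `fits_in_shared`, and the 256-byte margin (taken from the figure quoted above) are my own assumptions for illustration, not anything from your code, and the actual limit depends on your device.

```cuda
#include <cstdio>
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical kernel, used only to illustrate a dynamic shared memory launch.
__global__ void scale(float *data, int n)
{
    extern __shared__ float tile[];   // sized by the third launch argument
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        tile[threadIdx.x] = data[i];          // each thread uses only its own slot
        data[i] = 2.0f * tile[threadIdx.x];
    }
}

// Hypothetical helper: does the request fit once `reserved` bytes are set aside?
bool fits_in_shared(size_t requested, size_t limit, size_t reserved)
{
    return reserved <= limit && requested <= limit - reserved;
}

int main()
{
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
        std::printf("no usable CUDA device\n");
        return 1;
    }

    const int threads = 256, blocks = 10, n = threads * blocks;
    const size_t reserved = 256;      // assumption: margin for kernel parameters
    size_t request = 143360;          // the size from the original question

    std::printf("shared memory per block: %zu bytes\n", prop.sharedMemPerBlock);
    if (!fits_in_shared(request, prop.sharedMemPerBlock, reserved)) {
        std::printf("%zu bytes does not fit, shrinking the request\n", request);
        request = threads * sizeof(float);    // one float per thread is enough here
    }

    float *d_data = nullptr;
    if (cudaMalloc(&d_data, n * sizeof(float)) != cudaSuccess) return 1;
    cudaMemset(d_data, 0, n * sizeof(float));

    scale<<<blocks, threads, request>>>(d_data, n);

    // A bad launch configuration is reported here, not by the launch itself.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    std::printf("launch status: %s\n", cudaGetErrorString(err));

    cudaFree(d_data);
    return err == cudaSuccess ? 0 : 1;
}
```

Calling `cudaGetLastError()` right after every launch is worth doing in any case, since that is where the invalid configuration error shows up.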