My program works perfectly in emulation mode, but as soon as I run it on the GPU I get “unspecified launch error”. Unfortunately I am not in a position to use CUDA-GDB. I have tried different thread and block counts, but have had no joy. Can anyone explain how best to go about finding out what is wrong?
Here is the relevant part of the .cubin file:
lmem = 1200 smem = 84 reg = 13