Change of device causes "Unspecified launch failure"


I am developing a CUDA program on a workstation equipped with a Tesla C2070.
Everything works out fine here on the development machine.

However, when I move the exe file to another machine with 2 x Tesla M2090, one of the kernels starts to fail with “Unspecified launch failure”. The other kernels that run before this one do not report any error. What should I look into?

Thanks for your advice.

Fermi cards have much improved memory protection compared with older cards. The most likely culprit is an out-of-bounds shared memory access. Code which seems to run “fine” on a GT200 can often cause a segfault (which is what the launch failure really is) on a Fermi card. Try running the program under the cuda-memcheck utility, or turn on cuda-memcheck inside cuda-gdb, and see what it reports.
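
To narrow down which kernel is faulting before reaching for cuda-memcheck, it can help to check the error state right after each launch. The following is only a minimal sketch, not your actual code: the `CUDA_CHECK` macro, the `scale` kernel, and the `TILE` size are hypothetical names made up for illustration. It shows a shared-memory index being guarded against the array size, and the launch being followed by `cudaGetLastError()` and `cudaDeviceSynchronize()` so that a failure is reported at the kernel that caused it rather than at some later call.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: print file/line and abort if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                            \
    do {                                                            \
        cudaError_t err_ = (call);                                  \
        if (err_ != cudaSuccess) {                                  \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,      \
                    cudaGetErrorString(err_));                      \
            exit(EXIT_FAILURE);                                     \
        }                                                           \
    } while (0)

#define TILE 256  // assumed shared-memory tile size, for illustration only

// Hypothetical kernel: stages one element per thread in shared memory.
__global__ void scale(const float *in, float *out, int n)
{
    __shared__ float tile[TILE];
    int gid = blockIdx.x * blockDim.x + threadIdx.x;

    // Guard both the shared-memory index and the global index.
    // Launching with blockDim.x > TILE without the first check would
    // write past the end of tile[].
    if (threadIdx.x < TILE && gid < n) {
        tile[threadIdx.x] = in[gid];
        out[gid] = 2.0f * tile[threadIdx.x];
    }
}

int main()
{
    const int n = 1000;
    float *d_in = NULL, *d_out = NULL;
    CUDA_CHECK(cudaMalloc(&d_in, n * sizeof(float)));
    CUDA_CHECK(cudaMalloc(&d_out, n * sizeof(float)));
    CUDA_CHECK(cudaMemset(d_in, 0, n * sizeof(float)));

    scale<<<(n + TILE - 1) / TILE, TILE>>>(d_in, d_out, n);
    CUDA_CHECK(cudaGetLastError());       // invalid launch configuration, etc.
    CUDA_CHECK(cudaDeviceSynchronize());  // faults raised while the kernel ran

    CUDA_CHECK(cudaFree(d_in));
    CUDA_CHECK(cudaFree(d_out));
    printf("done\n");
    return 0;
}
```

Synchronizing after every launch slows the program down, so this is meant for debugging only. Once the failing kernel is known, running the executable as `cuda-memcheck ./your_app` (the executable name is a placeholder), or issuing `set cuda memcheck on` inside cuda-gdb before `run`, should report the offending access. Building with `nvcc -g -G` lets the report point at source lines.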