Kernel runs fine on OSX/Linux, crashes on Windows.

Hello all,

I have a program that requires repeatedly launching the same kernel and accumulating the results over time (a fairly typical Monte-Carlo type simulation). On OSX and Linux, everything runs fine and the kernel will successfully execute as many times as specified. On Windows, however, the kernel will successfully launch anywhere between once and a few dozen times before the program will crash with an access violation. The number of successful launches before a crash seems to be random.

Does anyone know of some possible reasons why this may happen?


Is there any hardware difference between the machines running Linux and Windows? Is the GPU on the Windows machine also being used to drive the display?

Yes, the hardware being used in all three cases is identical. I have a machine triple booting OSX/Fedora/Windows 8.

“access violation” seems to hint at a problem in the host code rather than a failing kernel launch. Possible reasons could be failing memory allocation, out of bounds access, uninitialized data, race condition. Make sure the return status of all CUDA API calls is checked. Run the app with valgrind (or an equivalent tool on Windows).
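As a rough illustration of checking the return status of every CUDA API call, here is a minimal sketch. The CUDA_CHECK macro name, the buffer size, and the particular sequence of calls are my own assumptions, not taken from the original program. Only the pattern matters: wrap each runtime call so that a failed allocation or copy is reported at the point it happens, rather than surfacing later as an access violation on the host.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper macro (not from the original post): report the failing
// call's file and line and exit as soon as a CUDA runtime call returns an error.
#define CUDA_CHECK(call)                                                  \
    do {                                                                  \
        cudaError_t err_ = (call);                                        \
        if (err_ != cudaSuccess) {                                        \
            std::fprintf(stderr, "CUDA error %s at %s:%d: %s\n",          \
                         cudaGetErrorName(err_), __FILE__, __LINE__,      \
                         cudaGetErrorString(err_));                       \
            std::exit(EXIT_FAILURE);                                      \
        }                                                                 \
    } while (0)

int main() {
    const size_t n = 1 << 20;           // illustrative buffer size (assumption)
    std::vector<float> host(n, 0.0f);   // host-side results buffer
    float *dev = nullptr;

    // Every runtime call is wrapped, so a failed allocation cannot go unnoticed
    // and leave 'dev' as a null or stale pointer for later calls.
    CUDA_CHECK(cudaMalloc(&dev, n * sizeof(float)));
    CUDA_CHECK(cudaMemset(dev, 0, n * sizeof(float)));
    CUDA_CHECK(cudaMemcpy(host.data(), dev, n * sizeof(float),
                          cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(dev));

    std::printf("all CUDA calls succeeded\n");
    return 0;
}
```

If one of these calls fails intermittently on Windows only, the printed error name should narrow things down much faster than the bare access violation does.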

If I have misinterpreted the description and the problem really is a failing device kernel, run the app with cuda-memcheck and check the kernel execution status carefully. There might be a timeout that occurs only on Windows, due to the different time limits applied by each operating system’s watchdog timer.
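To make "check kernel execution status" concrete, here is a hedged sketch of a repeated-launch loop. The kernel name accumulate, the launch count, and the sizes are placeholders of mine, since the original code was not posted. It checks cudaGetLastError() right after the launch (this catches launch configuration errors) and then the return value of cudaDeviceSynchronize() (this catches errors during execution, including a watchdog timeout, reported as cudaErrorLaunchTimeout). It also prints whether the device has a run time limit at all, via the kernelExecTimeoutEnabled field of cudaDeviceProp.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Placeholder kernel standing in for the real Monte Carlo kernel (assumption).
__global__ void accumulate(float *sum, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) sum[i] += 1.0f;
}

int main() {
    // Report whether the OS watchdog can kill long-running kernels on device 0.
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) return 1;
    std::printf("kernel run time limit on %s: %s\n", prop.name,
                prop.kernelExecTimeoutEnabled ? "enabled" : "disabled");

    const int n = 1 << 20;      // illustrative problem size (assumption)
    const int launches = 100;   // illustrative number of repeated launches
    float *sum = nullptr;
    if (cudaMalloc(&sum, n * sizeof(float)) != cudaSuccess) return 1;
    if (cudaMemset(sum, 0, n * sizeof(float)) != cudaSuccess) return 1;

    int status = 0;
    for (int iter = 0; iter < launches; ++iter) {
        accumulate<<<(n + 255) / 256, 256>>>(sum, n);

        // Errors from the launch itself (bad configuration, etc.).
        cudaError_t err = cudaGetLastError();
        // Errors raised while the kernel was running show up at the next sync.
        if (err == cudaSuccess) err = cudaDeviceSynchronize();

        if (err != cudaSuccess) {
            std::fprintf(stderr, "launch %d failed: %s%s\n", iter,
                         cudaGetErrorString(err),
                         err == cudaErrorLaunchTimeout ? " (watchdog timeout)" : "");
            status = 1;
            break;
        }
    }

    cudaFree(sum);
    return status;
}
```

Synchronizing after every launch costs some performance, so this is meant as a debugging aid. Once the failing launch is identified, the per-iteration sync can be removed again.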

Sounds like it could be fixed by disabling WDDM TDR in Windows. If you have Nsight installed, there is an option in Nsight Monitor to disable it. Otherwise, just disable it from the registry by either running this registry file, or navigating manually and adding/changing the TdrLevel value: