CUDA on Vista: "Display driver has stopped responding" (CUDA execution time on Vista)

I’m using an NVIDIA Quadro FX 1700 on a Dell machine running 32-bit Windows Vista. The driver version is 177.84 and the CUDA version is 2.0. The machine has a Xeon processor and 3 GB of RAM.

When I run my application in EmuDebug mode, it runs fine and completes successfully. When I run it in release mode, it crashes. Sometimes I get a blue screen, sometimes the screen starts to pixelate, and sometimes only the error message “Display driver has stopped responding and has recovered” appears.

The kernel basically performs numerical integration and eigenvalue computations for every voxel in a volume with dimensions 256x160x107. Every block has 160 threads, so there are 256x107 blocks in the grid. When I reduce the grid to a small size such as 5x5, the application completes successfully, even though the block size is still 160. The application also completes successfully if I change a condition inside the code so that each thread takes less time to execute. In both tests I do not change the amount of allocated memory, so I don’t suspect that memory requirements are the reason for the crash.

I noticed that the computation crashes whenever the execution time exceeds roughly 5 seconds. I used a timer in the code to measure the computation time. I suspect the crash happens because the device takes so long to execute the kernel that Vista decides it is not responding. When the system detects that the device is not responding, it automatically restarts the device, and hence the device memory is cleared.
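The post does not say which timer was used. Kernel launches return to the host before the kernel finishes, so a host-side timer only measures the kernel if the host waits for completion first. A minimal sketch of one way to measure kernel time with CUDA events follows; `busyKernel`, its iteration count, and the 64-block grid are hypothetical stand-ins, not the attached code.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the real per-voxel kernel: it just burns some time.
__global__ void busyKernel(float *out, int iters)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    float acc = 0.0f;
    for (int k = 0; k < iters; ++k)
        acc += sinf(acc + (float)k);
    out[i] = acc;
}

int main()
{
    const int blocks = 64, threads = 160;   // small grid, assumed for the sketch
    float *dOut = NULL;
    if (cudaMalloc((void **)&dOut, blocks * threads * sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start, 0);               // timestamp before the launch
    busyKernel<<<blocks, threads>>>(dOut, 1000);
    cudaEventRecord(stop, 0);                // timestamp after the kernel
    cudaError_t err = cudaEventSynchronize(stop);  // wait until the kernel is done

    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);  // elapsed time in milliseconds
    printf("kernel time: %.3f ms\n", ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(dOut);
    return 0;
}
```

Timing the small 5x5 grid this way and scaling by the block count gives a rough estimate of how long the full 256x107 grid would run, without triggering the crash.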

My question: Is there a limit on how long the device may take to complete the execution of a kernel on Windows Vista?

I have attached the code.
template_kernel.txt (25 KB)
template.txt (9.48 KB)

Have you heard of the watchdog timer? The OS automatically checks whether your card is still responding and resets the card if a kernel does not return within 5 seconds, which sometimes leaves it in an undefined state. Try searching the forum for it; it is mentioned often. AFAIK, under Windows you can do nothing about it except split your kernel into smaller parts, or use another device as the primary graphics device and the Quadro only for computation.
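Splitting the work could look roughly like the sketch below: the 256x107 grid is launched a few z-slices at a time, and the host waits for each launch to finish so that no single kernel runs long. Everything here is an assumption for illustration: `processVoxels` is a hypothetical placeholder for the real kernel, `slicesPerLaunch = 8` is an arbitrary value to tune, and the index mapping (x from `blockIdx.x`, y from `threadIdx.x`, z from `blockIdx.y`) is a guess at the layout. The sketch uses the current runtime API; `cudaDeviceGetAttribute` is not available in CUDA 2.0, where `cudaThreadSynchronize` plays the role of `cudaDeviceSynchronize`.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical per-voxel kernel. zOffset shifts this launch to its slices of
// the full volume; the real integration/eigenvalue work replaces the body.
__global__ void processVoxels(float *out, int dimX, int dimY, int zOffset)
{
    int x = blockIdx.x;             // 0 .. dimX-1
    int y = threadIdx.x;            // 0 .. dimY-1
    int z = blockIdx.y + zOffset;   // slice index within the full volume
    out[(z * dimY + y) * dimX + x] = 1.0f;   // placeholder result
}

int main()
{
    const int dimX = 256, dimY = 160, dimZ = 107;
    const int slicesPerLaunch = 8;  // assumed; pick it so one launch stays short

    // Report whether a run-time limit applies to kernels on device 0.
    int timeoutEnabled = 0;
    cudaDeviceGetAttribute(&timeoutEnabled, cudaDevAttrKernelExecTimeout, 0);
    printf("kernel run-time limit on device 0: %s\n", timeoutEnabled ? "yes" : "no");

    const size_t count = (size_t)dimX * dimY * dimZ;
    float *dOut = NULL;
    if (cudaMalloc((void **)&dOut, count * sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }
    cudaMemset(dOut, 0, count * sizeof(float));

    for (int z0 = 0; z0 < dimZ; z0 += slicesPerLaunch) {
        int remaining = dimZ - z0;
        int slices = remaining < slicesPerLaunch ? remaining : slicesPerLaunch;
        dim3 grid(dimX, slices);    // only this chunk's blocks
        processVoxels<<<grid, dimY>>>(dOut, dimX, dimY, z0);

        // Wait for this chunk so each launch is submitted and finished on its own.
        cudaError_t err = cudaDeviceSynchronize();
        if (err != cudaSuccess) {
            fprintf(stderr, "chunk at z=%d failed: %s\n", z0, cudaGetErrorString(err));
            return 1;
        }
    }

    // Check that every voxel was written exactly as the placeholder kernel does.
    float *hOut = (float *)malloc(count * sizeof(float));
    cudaMemcpy(hOut, dOut, count * sizeof(float), cudaMemcpyDeviceToHost);
    size_t written = 0;
    for (size_t i = 0; i < count; ++i)
        if (hOut[i] == 1.0f) ++written;
    printf("voxels written: %lu of %lu\n", (unsigned long)written, (unsigned long)count);

    free(hOut);
    cudaFree(dOut);
    return 0;
}
```

Device memory stays allocated across the launches, so results accumulate in the same output buffer and only the launch loop changes. A smaller `slicesPerLaunch` gives shorter kernels at the cost of more launch overhead.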