Problem while cracking MD5: a GPU thread cannot seem to run for very long

I have a problem using CUDA to crack MD5. After I modified Juric's cudamd5 program for my own needs, it seems that a GPU thread cannot run continuously for very long. Instead of using a dictionary, I enumerate every candidate password dynamically. Since the number of candidates is huge, I want each GPU thread to compute millions of MD5 hashes in a loop. But I find that the program crashes if any GPU thread performs too many iterations or runs for too long, even when only one GPU thread is running. On my computer, about 300 thousand iterations or 3 seconds is enough to trigger the crash. The problem can be reproduced by calling the RSA_KERNEL function millions of times in a loop. I also notice that while the GPU is busy, the CPU usage of the calling thread stays high. I cannot explain these phenomena. Should I upload the source code? My graphics card is an NVIDIA GeForce 9600 GT, and my OS is Windows 7.
Thanks to all.

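Reading the CUDA documentation helps. When the CUDA card also drives the display, a watchdog timer aborts any kernel that runs for more than a few seconds. With a second graphics card running the display, you can use the CUDA card for computation without running into the watchdog timer.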
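To check which card is subject to the watchdog, something like the sketch below may help. It uses the `kernelExecTimeoutEnabled` field of `cudaDeviceProp` from the CUDA runtime API; the helper name `pick_device_without_watchdog` is made up for this illustration, and on a single-card machine it will most likely report that no such device exists.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: returns the index of the first CUDA device that has
// no kernel run-time limit, or -1 if every device has one (or none is found).
int pick_device_without_watchdog()
{
    int device_count = 0;
    if (cudaGetDeviceCount(&device_count) != cudaSuccess)
        return -1;

    for (int dev = 0; dev < device_count; ++dev) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess)
            continue;

        // kernelExecTimeoutEnabled is non-zero when the watchdog applies,
        // which is typically the case for the card that drives the display.
        printf("device %d (%s): run-time limit %s\n", dev, prop.name,
               prop.kernelExecTimeoutEnabled ? "enabled" : "disabled");

        if (!prop.kernelExecTimeoutEnabled)
            return dev;
    }
    return -1;
}
```

If the function returns a valid index, passing it to `cudaSetDevice` before any other CUDA call should direct the long-running kernels to that card.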

As mentioned, there is a watchdog timer that prevents the GPU from locking up the display. On Windows it can be adjusted via the registry, but that is only a good solution if your kernel runs just slightly too long. Another option is to break your computation into smaller chunks and launch the kernel multiple times.
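A minimal sketch of the chunking idea, under some assumptions: `candidate_matches` is only a placeholder for "compute the MD5 of candidate number idx and compare it with the target hash" (it is not part of cudamd5), and the names `search_chunk` and `search_in_chunks` are invented for this example. The point is that each launch covers a bounded range of candidates, so no single kernel runs long enough to trip the watchdog.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

const unsigned long long NOT_FOUND = ~0ULL;

// Placeholder (assumption): a real cracker would build password number idx,
// hash it with MD5 and compare the digest with the target here.
__device__ bool candidate_matches(unsigned long long idx, unsigned long long secret)
{
    return idx == secret;
}

// One launch tests only the candidates in [start, start + count).
__global__ void search_chunk(unsigned long long start, unsigned long long count,
                             unsigned long long secret, unsigned long long *found)
{
    unsigned long long i = blockIdx.x * (unsigned long long)blockDim.x + threadIdx.x;
    if (i < count && candidate_matches(start + i, secret))
        *found = start + i;   // at most one candidate matches in this sketch
}

// Host loop: walks the range [0, total) in pieces of at most `chunk` candidates.
// `chunk` should stay small enough that the block count fits the grid limit
// of the device (65535 blocks in x on older cards).
unsigned long long search_in_chunks(unsigned long long total, unsigned long long chunk,
                                    unsigned long long secret)
{
    unsigned long long h_found = NOT_FOUND;
    unsigned long long *d_found = nullptr;
    if (cudaMalloc(&d_found, sizeof(h_found)) != cudaSuccess)
        return NOT_FOUND;
    cudaMemcpy(d_found, &h_found, sizeof(h_found), cudaMemcpyHostToDevice);

    const unsigned int threads = 256;
    for (unsigned long long start = 0; start < total && h_found == NOT_FOUND; start += chunk) {
        unsigned long long count = (total - start < chunk) ? (total - start) : chunk;
        unsigned int blocks = (unsigned int)((count + threads - 1) / threads);

        search_chunk<<<blocks, threads>>>(start, count, secret, d_found);

        // The blocking copy waits for the kernel, so every launch finishes
        // (and the display gets a turn) before the next one starts.
        if (cudaMemcpy(&h_found, d_found, sizeof(h_found),
                       cudaMemcpyDeviceToHost) != cudaSuccess)
            break;   // a failed launch or copy ends the search
    }

    cudaFree(d_found);
    return h_found;
}
```

The chunk size is a tuning knob: larger chunks mean fewer launches, smaller chunks keep each launch further below the time limit. A per-thread loop over a few candidates could be added inside `search_chunk` as long as one launch still finishes well within the limit.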

Thank you, cbuchner1
Thank you, jgoffeney