cudaErrorLaunchTimeout error - how to recover after it happens?

Hi,

I have a workstation with two GTX 480 cards, and my software uses both GPUs from separate threads. One of the cards is connected to the monitor, so the watchdog is active for it (under XP64, which I use, the limit is about 5-10 seconds).

Sometimes my kernel triggers cudaErrorLaunchTimeout on the card with the monitor attached. What I want to do is skip the data set on which the kernel runs too long and carry on with the next data set. However, once the kernel has timed out, every subsequent call to any CUDA function returns cudaErrorLaunchTimeout, and I have to restart the whole application to reset the state of CUDA.
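A minimal sketch of the symptom, for illustration only. It assumes device 0 is the card driving the monitor and a toolkit recent enough to provide cudaDeviceSynchronize and clock64; slowKernel, isLaunchTimeout and the cycle count are hypothetical stand-ins, not my real code.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the real kernel: busy-waits for `cycles` GPU
// clock ticks so that a large value can outlast the display watchdog.
__global__ void slowKernel(long long cycles, int *out)
{
    long long start = clock64();
    while (clock64() - start < cycles) { }
    *out = 1;
}

// True when the error is the watchdog timeout described above.
static bool isLaunchTimeout(cudaError_t err)
{
    return err == cudaErrorLaunchTimeout;
}

int main()
{
    cudaSetDevice(0);  // assumption: device 0 has the monitor attached

    int *d_out = nullptr;
    cudaError_t err = cudaMalloc(&d_out, sizeof(int));
    if (err != cudaSuccess) {
        printf("cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Cycle count chosen arbitrarily to be long; tune it for the card.
    slowKernel<<<1, 1>>>(1LL << 35, d_out);
    err = cudaDeviceSynchronize();
    printf("after kernel: %s\n", cudaGetErrorString(err));

    if (isLaunchTimeout(err)) {
        // An unrelated call made after the timeout reports an error too.
        int *d_other = nullptr;
        err = cudaMalloc(&d_other, sizeof(int));
        printf("next call:    %s\n", cudaGetErrorString(err));
    }

    cudaFree(d_out);
    return 0;
}
```

On a card without an active watchdog the kernel simply runs to completion, so the second message is only printed when the timeout actually fires.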

How can I reset CUDA without terminating the whole process?