CUDA function call fails when compiled in release mode

Hello everyone,

I'm writing CUDA code that runs perfectly when compiled in debug mode (Visual Studio 2008). Now I need to benchmark its performance, so I want to compile it in release mode.

However, in the release build the CUDA function cuMemcpyDtoH fails with the error code CUDA_ERROR_UNKNOWN.

My first guess was that this might be caused by the Windows watchdog timeout detection: the release build runs faster and may keep the display driver busier, so the operating system (Windows Vista) may kill the driver. But after I set the registry key to turn off the timeout detection, the problem still occurred.

So can someone give me a hint as to what the problem might be?

What does CUDA_ERROR_UNKNOWN mean?