cuMemcpyDtoH fails with CUDA_ERROR_UNKNOWN and CUDA_ERROR_LAUNCH_TIMEOUT

Hello guys,

I have a kernel written in CUDA, and I run it in a loop many times. It runs OK for the first few hundred iterations, and then the call to cuMemcpyDtoH fails. Most of the time the error code is CUDA_ERROR_UNKNOWN, but sometimes it is CUDA_ERROR_LAUNCH_TIMEOUT. At the same time, the task bar of my operating system (Vista) pops up a note saying that the display driver stopped responding and has recovered.

The problem is not predictable. For example, in one run my program may fail at loop step 500, but in another run it gets past 500 steps.

The data I'm trying to read back from the GPU is 23 MB in size. Is this problem caused by oversized data? How can I fix it?

Or is it caused by the operating system, because the system thinks that the display driver isn't responding?


I kind of solved the problem by splitting my data into chunks and executing several times. The program is slower now, but it runs fine.
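
A rough sketch of what such a chunked loop could look like on the host side, in plain C. This is not the actual code from the program above, which isn't shown. The names for_each_chunk, chunk_fn and sum_bytes are made up for illustration, and the 4 MB chunk size in the usage note is an arbitrary assumption. In a real program the callback would launch the kernel on one slice and copy that slice back with cuMemcpyDtoH; here it is a stand-in that only counts bytes, so the sketch compiles without CUDA.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: handles one slice of the buffer.
   In a real program it would launch the kernel for this slice and
   copy the slice back to the host. Returns 0 on success. */
typedef int (*chunk_fn)(size_t offset, size_t bytes, void *user);

/* Walk the range [0, total_bytes) in pieces of at most chunk_bytes.
   Returns the number of chunks processed, or -1 on the first failure. */
static int for_each_chunk(size_t total_bytes, size_t chunk_bytes,
                          chunk_fn fn, void *user)
{
    int count = 0;
    size_t offset;

    assert(chunk_bytes > 0);
    for (offset = 0; offset < total_bytes; offset += chunk_bytes) {
        size_t remaining = total_bytes - offset;
        /* The last chunk may be shorter than chunk_bytes. */
        size_t bytes = remaining < chunk_bytes ? remaining : chunk_bytes;

        if (fn(offset, bytes, user) != 0)
            return -1;
        ++count;
    }
    return count;
}

/* Stand-in callback for illustration only: adds up the bytes it is given. */
static int sum_bytes(size_t offset, size_t bytes, void *user)
{
    (void)offset;
    *(size_t *)user += bytes;
    return 0;
}
```

For a 23 MB buffer and 4 MB chunks this makes 6 calls, the last one covering the remaining 3 MB. Smaller pieces mean each launch and copy does less work, which is presumably why the driver no longer gets reset.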

23 MB is not oversized. Do you mean 23 MB * the number of loop steps? What card are you using?