cudaMemcpy error: cudaErrorLaunchTimeout


I am doing some matrix computations using CUDA. I follow the usual sequence of

  • Copy data to device

  • Do computations on device

  • Copy data from device to host

The third step gives me an error, so I cannot copy data from the device back to the host.

This happens only for larger matrices, e.g. 5000x5000. For 500x500 and 1000x1000 there is no problem.

The error code returned is 6, which is cudaErrorLaunchTimeout. The description in my driver_types.h file is as follows:

I placed a cudaGetLastError() statement right after the kernel launch and it returned cudaSuccess.

Immediately after this, there is the cudaMemcpy() call, which returns the above error.

Any idea what is happening? There is some discussion on this thread, but I don't have a monitor connected to the device.

Any help will be appreciated.

You need to disable the display driver timeout.
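As for why cudaGetLastError() returned cudaSuccess right after the launch: kernel launches are asynchronous, so that call only reports launch-time problems such as a bad configuration. An error raised while the kernel runs, including a watchdog timeout, shows up at the next synchronizing call, which here was the cudaMemcpy(). Here is a minimal sketch of how to make the failure appear at the kernel instead. scaleKernel, the CHECK macro, and the sizes are made up for illustration and are not your code:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the matrix kernel in the question.
__global__ void scaleKernel(float *data, size_t n, float factor)
{
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Hypothetical helper: report the failing call and leave main() with status 1.
#define CHECK(call)                                               \
    do {                                                          \
        cudaError_t err_ = (call);                                \
        if (err_ != cudaSuccess) {                                \
            fprintf(stderr, "%s failed: %s\n", #call,             \
                    cudaGetErrorString(err_));                    \
            return 1;                                             \
        }                                                         \
    } while (0)

int main()
{
    const size_t n = 1000 * 1000;  // illustrative size only
    float *host = new float[n];
    for (size_t i = 0; i < n; ++i) host[i] = 1.0f;

    // Step 1: copy data to the device.
    float *dev = nullptr;
    CHECK(cudaMalloc(&dev, n * sizeof(float)));
    CHECK(cudaMemcpy(dev, host, n * sizeof(float), cudaMemcpyHostToDevice));

    // Step 2: compute on the device.
    const int threads = 256;
    const int blocks = (int)((n + threads - 1) / threads);
    scaleKernel<<<blocks, threads>>>(dev, n, 2.0f);

    CHECK(cudaGetLastError());       // launch-time errors only
    CHECK(cudaDeviceSynchronize());  // execution errors, e.g. a launch timeout

    // Step 3: copy the result back to the host.
    CHECK(cudaMemcpy(host, dev, n * sizeof(float), cudaMemcpyDeviceToHost));
    printf("host[0] = %f\n", host[0]);

    CHECK(cudaFree(dev));
    delete[] host;
    return 0;
}
```

With the explicit cudaDeviceSynchronize() in place, a timeout should be reported against the kernel rather than against the copy that follows it. Older toolkits spell this call cudaThreadSynchronize().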

I think I got it. The device was actually running an X11 session, and the kernel's run time was exceeding 5 seconds. That was probably the problem, since the same code ran fine on a device without X11.
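For anyone who wants to check this without comparing machines by hand, the runtime can report whether a device has a kernel run time limit. The sketch below queries the cudaDevAttrKernelExecTimeout attribute for each device; I am assuming that a nonzero value corresponds to the display watchdog being active on that device:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount failed: %s\n",
                cudaGetErrorString(err));
        return 1;
    }

    for (int dev = 0; dev < count; ++dev) {
        // Nonzero means kernels on this device are subject to a run time limit.
        int limited = 0;
        err = cudaDeviceGetAttribute(&limited, cudaDevAttrKernelExecTimeout, dev);
        if (err != cudaSuccess) {
            fprintf(stderr, "cudaDeviceGetAttribute failed for device %d: %s\n",
                    dev, cudaGetErrorString(err));
            return 1;
        }
        printf("Device %d: kernel run time limit %s\n",
               dev, limited ? "enabled" : "disabled");
    }
    return 0;
}
```

If the limit shows as enabled on the device you compute on, the options are to run on a device that is not driving a display, or to split the work into several shorter kernel launches so that each one finishes within the limit.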