As I just read, kernel launches are asynchronous.
Imagine the following:
    float *A, *A_dev;
    // allocation and so on
    // ...
    kernel<<<1, 1>>>(A_dev);
    HANDLE_ERROR(cudaMemcpy(A, A_dev, byteInA, cudaMemcpyDeviceToHost));
The kernel execution time is quite high, about 2 seconds. What would you expect if I call cudaMemcpy() immediately after the kernel launch, while the kernel is still running?
Could this be causing the graphics driver to crash?
What I get in the HANDLE_ERROR function is a cudaErrorUnknown when I do this. Or could the cudaErrorUnknown have a different cause?
I am using CUDA SDK 3.2.
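To narrow down where the error comes from, here is a minimal sketch of what I could try. As far as I understand, cudaMemcpy() blocks until previously launched work has finished, so an error it returns may actually originate from the kernel. Checking cudaGetLastError() right after the launch and then synchronizing explicitly should show which step fails. The HANDLE_ERROR macro and the kernel body below are hypothetical stand-ins, not my real code, and byteInA is assumed to be the size of a single float.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical stand-in for the HANDLE_ERROR macro used above:
// prints the error string with file and line, then exits.
#define HANDLE_ERROR(call)                                              \
    do {                                                                \
        cudaError_t err = (call);                                       \
        if (err != cudaSuccess) {                                       \
            fprintf(stderr, "%s at %s:%d\n", cudaGetErrorString(err),   \
                    __FILE__, __LINE__);                                \
            exit(EXIT_FAILURE);                                         \
        }                                                               \
    } while (0)

// Placeholder kernel (assumption); the real one runs for about 2 seconds.
__global__ void kernel(float *data) { data[0] = 1.0f; }

int main() {
    const size_t byteInA = sizeof(float);  // assumed size for this sketch
    float *A = (float *)malloc(byteInA);
    float *A_dev = NULL;
    if (A == NULL) return EXIT_FAILURE;
    HANDLE_ERROR(cudaMalloc((void **)&A_dev, byteInA));

    kernel<<<1, 1>>>(A_dev);

    // Reports launch problems such as an invalid configuration.
    HANDLE_ERROR(cudaGetLastError());

    // Waits for the kernel to finish and reports errors raised while it ran.
    // cudaThreadSynchronize() is the CUDA 3.2 call; CUDA 4.0 and later
    // provide cudaDeviceSynchronize() as its replacement.
    HANDLE_ERROR(cudaThreadSynchronize());

    // If both checks pass, an error here belongs to the copy itself.
    HANDLE_ERROR(cudaMemcpy(A, A_dev, byteInA, cudaMemcpyDeviceToHost));
    printf("A[0] = %f\n", A[0]);

    HANDLE_ERROR(cudaFree(A_dev));
    free(A);
    return 0;
}
```

If the error already shows up at the synchronize call, the kernel itself is the likely culprit rather than the copy. On a GPU that also drives the display, a long-running kernel can be terminated by the driver's watchdog timer, which might explain both the error and the driver reset.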
Best regards and thanks, tdhd