cuMemcpyDtoH failing with CUDA_ERROR_INVALID_VALUE

Things were going too well :)

I now have a call to cuMemcpyDtoH that fails with CUDA_ERROR_INVALID_VALUE.

CUdeviceptr gpu_data = (CUdeviceptr)NULL;
CU_SAFE_CALL( cuMemAlloc( &gpu_data, mem_size));

//Run the kernel here without any problem.

CU_SAFE_CALL( cuCtxSynchronize() );

VxrUByte *cuda_output=NULL;
cuMemAllocHost((void **)&cuda_output,mem_size);
err = cuMemcpyDtoH((void *)cuda_output, gpu_data, mem_size ); 

// The cuMemcpyDtoH call above is where it fails.

mem_size is around 25 MB.
I have also tried a destination buffer from a plain malloc, and copying just 64 bytes; both fail in the same way.

I think we could do with more detailed error messages. It would be good to know which argument is being rejected here.
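To illustrate the kind of reporting I mean, here is a minimal sketch of a checking wrapper that prints the exact call, file and line along with the result code. It cannot tell which argument the driver rejected, but it does show which call returned the error. The names CU_CHECK, cu_report and cu_result_name are hypothetical helpers made up for this sketch, not part of CUDA. The CUresult enum below is only a stand-in so the sketch compiles without cuda.h; its three values match the real header, and with the real header you would include <cuda.h> and delete the stand-in.

```c
#include <stdio.h>

/* Stand-in for the driver API result type so this sketch builds without
   cuda.h. These three values match cuda.h; with the real header, remove
   this enum and include <cuda.h> instead. */
typedef enum {
    CUDA_SUCCESS = 0,
    CUDA_ERROR_INVALID_VALUE = 1,
    CUDA_ERROR_OUT_OF_MEMORY = 2
} CUresult;

/* Hypothetical helper: map a result code to a readable name.
   Only the codes listed above are covered. */
static const char *cu_result_name(CUresult r)
{
    switch (r) {
    case CUDA_SUCCESS:             return "CUDA_SUCCESS";
    case CUDA_ERROR_INVALID_VALUE: return "CUDA_ERROR_INVALID_VALUE";
    case CUDA_ERROR_OUT_OF_MEMORY: return "CUDA_ERROR_OUT_OF_MEMORY";
    default:                       return "unrecognized CUresult";
    }
}

/* Hypothetical helper: print the failing expression and its location to
   stderr, then return the code unchanged so the caller can still act on it. */
static CUresult cu_report(CUresult r, const char *expr,
                          const char *file, int line)
{
    if (r != CUDA_SUCCESS) {
        fprintf(stderr, "%s:%d: %s returned %s (%d)\n",
                file, line, expr, cu_result_name(r), (int)r);
    }
    return r;
}

/* Wrap any call that returns CUresult; #call captures its source text. */
#define CU_CHECK(call) cu_report((call), #call, __FILE__, __LINE__)
```

With something like this, every driver call in the snippet above could be wrapped the same way, including cuMemAllocHost, whose return value is currently not checked, e.g. err = CU_CHECK( cuMemAllocHost((void **)&cuda_output, mem_size) );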

Any ideas?

XP64-GF8800X-Cuda 2.0b