Things were going too well :)
I now have a failure with cuMemcpyDtoH.
It returns CUDA_ERROR_INVALID_VALUE.
CUdeviceptr gpu_data = (CUdeviceptr)NULL;
CU_SAFE_CALL( cuMemAlloc( &gpu_data, mem_size));
// The kernel runs here without any problem.
CU_SAFE_CALL( cuCtxSynchronize() );
VxrUByte *cuda_output=NULL;
cuMemAllocHost((void **)&cuda_output,mem_size);
err = cuMemcpyDtoH((void *)cuda_output, gpu_data, mem_size );
mem_size is around 25 MB.
I have also tried a host buffer from a normal malloc, and copying just 64 bytes; both fail in the same way.
I think we could do with more error messages. It would be good to know which value is wrong here…
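Until the driver reports more detail, a rough caller-side check can at least rule out the obvious arguments before the copy. The sketch below is only an illustration: `first_suspect_dtoh_arg` is a hypothetical helper, not part of the CUDA driver API, and it takes the device pointer as a plain integer so that it compiles without cuda.h. The returned strings follow the parameter names of cuMemcpyDtoH (dstHost, srcDevice, ByteCount). It cannot detect a wrong context or a stale device pointer, so a clean result does not mean the call will succeed.

```c
#include <stddef.h>

/* Hypothetical helper (an assumption for illustration, not a driver call).
 * Returns the name of the first argument that looks wrong for a
 * device-to-host copy, or NULL if nothing obvious stands out. */
const char *first_suspect_dtoh_arg(const void *dst_host,
                                   unsigned long long src_device,
                                   size_t byte_count)
{
    if (dst_host == NULL) {
        return "dstHost";    /* host buffer was never allocated, or allocation failed */
    }
    if (src_device == 0) {
        return "srcDevice";  /* device pointer still has its initial null value */
    }
    if (byte_count == 0) {
        return "ByteCount";  /* not necessarily an error, but rarely intended */
    }
    return NULL;             /* no obviously bad argument found */
}
```

In the snippet above it could be called as `first_suspect_dtoh_arg(cuda_output, (unsigned long long)gpu_data, mem_size)` just before cuMemcpyDtoH, printing the returned name if it is not NULL.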