cudaHostRegister returns cudaErrorNotSupported on Windows 10 with Quadro P2200

Hi all,

I wrote a CUDA program that worked fine on Windows 10 with a GTX 1650 SUPER, but after I replaced the 1650 SUPER with a Quadro P2200, cudaHostRegister started returning cudaErrorNotSupported. I checked the device properties with cudaGetDeviceProperties, and canMapHostMemory is 1.

I also ran the program on Linux with the P2200, and it worked fine there too.
Is this a limitation of the P2200 on Windows? Can anyone help, please?

Thanks,
Henry

I don’t think there is any such limitation as you are asking about. I just ran the following code:

int* d = (int *)malloc(1048576);
cudaError_t err = cudaHostRegister(d, 1048576, cudaHostRegisterMapped);
printf("%s\n", cudaGetErrorString(err));

on a Quadro P4000 in WDDM mode on Windows with CUDA 11.1, driver 456.81, and the output was “no error”. This was a VS2019 x64 Debug project.

Running Robert Crovella’s code on Windows 10, MSVC 2019 Release, CUDA 11.1, driver 462.31, Quadro P2000 also prints no error for me.

Thanks, Robert and njuffa. Robert’s code works for me, too… However, my program still fails with the same error code.

Anyway, thanks for your help. Let me know if you can think of any possible causes of this odd behavior.

Henry

Most CUDA runtime API calls can report errors from previous asynchronous activity. It’s not clear that this would fit that description, however.
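
To rule that possibility in or out, one option is to drain and inspect any pending error immediately before the registration call. The following is only a minimal sketch, not the original program: the buffer size is borrowed from the test snippet above, and the placeholder comment stands in for whatever kernels and copies the real program issues first.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

int main() {
    // ... the program's earlier kernel launches and async copies would go here ...

    // Wait for all previously issued work. An error returned here belongs to
    // that earlier work, not to cudaHostRegister.
    cudaError_t prior = cudaDeviceSynchronize();
    if (prior != cudaSuccess) {
        printf("error from earlier work: %s\n", cudaGetErrorString(prior));
    }

    // cudaGetLastError returns the last error recorded by runtime calls on
    // this host thread and resets it to cudaSuccess.
    prior = cudaGetLastError();
    if (prior != cudaSuccess) {
        printf("pending error before register: %s\n", cudaGetErrorString(prior));
    }

    const size_t bytes = 1048576;  // same size as the test snippet above
    void* buf = malloc(bytes);
    if (buf == NULL) {
        return 1;
    }

    // If the two checks above printed nothing, an error reported here was
    // produced by cudaHostRegister itself.
    cudaError_t err = cudaHostRegister(buf, bytes, cudaHostRegisterMapped);
    printf("cudaHostRegister: %s\n", cudaGetErrorString(err));

    if (err == cudaSuccess) {
        cudaHostUnregister(buf);
    }
    free(buf);
    return 0;
}
```

If both checks stay silent and cudaHostRegister still reports cudaErrorNotSupported, the error is not left over from earlier asynchronous work.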

After repeatedly bisecting and testing, it turned out that the atomicAdd_system() function caused the issue… I don’t understand why, though… :(
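
For anyone comparing setups, a small diagnostic like the one below prints what the runtime reports about host registration and system-scope atomics on a given device. It is only a sketch and does not explain the failure. printAttr is a hypothetical helper, device index 0 is an assumption, and the attribute enums are assumed to be defined by the installed toolkit (they should be in the CUDA 11.1 toolkit mentioned in this thread).

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: print one integer device attribute, or the error
// string if the query itself fails.
static void printAttr(const char* name, cudaDeviceAttr attr, int device) {
    int value = 0;
    cudaError_t err = cudaDeviceGetAttribute(&value, attr, device);
    if (err == cudaSuccess) {
        printf("%-28s %d\n", name, value);
    } else {
        printf("%-28s query failed: %s\n", name, cudaGetErrorString(err));
    }
}

int main() {
    const int device = 0;  // assumption: the GPU under test is device 0

    printAttr("computeCapabilityMajor", cudaDevAttrComputeCapabilityMajor, device);
    printAttr("computeCapabilityMinor", cudaDevAttrComputeCapabilityMinor, device);
    printAttr("canMapHostMemory", cudaDevAttrCanMapHostMemory, device);
    printAttr("hostRegisterSupported", cudaDevAttrHostRegisterSupported, device);
    printAttr("hostNativeAtomicSupported", cudaDevAttrHostNativeAtomicSupported, device);
    printAttr("concurrentManagedAccess", cudaDevAttrConcurrentManagedAccess, device);
    return 0;
}
```

Running it on the P2200 under both Windows and Linux, and on the GTX 1650 SUPER, would show whether any of these reported capabilities differ between the working and failing configurations.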