CUDA Unified Memory concurrent access

I’m developing a multithreaded application utilizing CUDA unified memory. The Xavier apparently supports Compute Capability 7.2, which is the minimum requirement for concurrent access to unified memory according to the CUDA docs. However, I’m getting segfaults exactly as described by this section in the docs:

“It is not permitted for the CPU to access any managed allocations or variables while the GPU is active for devices with concurrentManagedAccess property set to 0. On these systems concurrent CPU/GPU accesses, even to different managed memory allocations, will cause a segmentation fault because the page is considered inaccessible to the CPU”

cudaGetDeviceProperties reports concurrentManagedAccess = 0, which means there’s no support on the Xavier. Is that expected behavior?
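For reference, this is roughly how I query the property (a minimal sketch; the device index 0 is assumed):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, /*device=*/0);

    printf("Device: %s (compute %d.%d)\n", prop.name, prop.major, prop.minor);
    // 1 = CPU and GPU may access managed memory concurrently; 0 = they may not.
    printf("concurrentManagedAccess: %d\n", prop.concurrentManagedAccess);
    return 0;
}
```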

# R32 (release), REVISION: 3.1, GCID: 18186506, BOARD: t186ref, EABI: aarch64, DATE: Tue Dec 10 07:03:07 UTC 2019

Hi @mosshammer.a, yes that is expected behavior on Xavier; it does not support concurrentManagedAccess. Here is a similar topic regarding this: Unified memory concurrent access - #3 by AastaLLL

You could try using cudaHostAlloc()/cudaHostGetDevicePointer() instead of cudaMallocManaged().
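A minimal sketch of that zero-copy approach, assuming a trivial kernel just for illustration (the `scale` kernel and buffer size are made up; on Xavier the device pointer aliases the pinned host buffer, so no explicit copies are needed, but the CPU must still synchronize before reading results):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void scale(float *data, int n, float f) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= f;
}

int main() {
    const int n = 1024;
    float *h_buf = nullptr, *d_buf = nullptr;

    // cudaHostAllocMapped makes the pinned host allocation visible to the GPU.
    cudaHostAlloc(&h_buf, n * sizeof(float), cudaHostAllocMapped);
    // Obtain the device-side alias of the same memory.
    cudaHostGetDevicePointer(&d_buf, h_buf, 0);

    for (int i = 0; i < n; ++i) h_buf[i] = 1.0f;

    scale<<<(n + 255) / 256, 256>>>(d_buf, n, 2.0f);
    cudaDeviceSynchronize();  // required before the CPU reads the results

    printf("h_buf[0] = %f\n", h_buf[0]);
    cudaFreeHost(h_buf);
    return 0;
}
```

Unlike managed memory on a device with concurrentManagedAccess = 0, the CPU can touch this buffer while other GPU work is in flight; the trade-off is that all GPU accesses go through the (uncached on some paths) pinned mapping, which can cost some latency, as noted below.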


Thanks for the info. Although the latency in my application increased slightly, it is working fine now.
Do you happen to know whether concurrent access is restricted only in software, which might be resolved in future versions, or whether this is a hardware restriction?

Could you try to use cudaGetDeviceProperties?
https://docs.nvidia.com/cuda/cuda-runtime-api/structcudaDeviceProp.html#structcudaDeviceProp_116f9619ccc85e93bc456b8c69c80e78b