I encountered a problem with nvidia_p2p_get_pages when the input VA comes from managed memory (returned by cudaMallocManaged): nvidia_p2p_get_pages keeps returning -EINVAL. The same code works fine when the VA comes from cudaMalloc. Is this behavior intentional? If not, is there a way to make memory from cudaMallocManaged work with nvidia_p2p_get_pages?
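To make the question concrete, here is a sketch of the kind of kernel-module call I mean. It is not my exact code; the function name pin_gpu_buffer and the way va/len arrive are placeholders, but nvidia_p2p_get_pages and its signature are from the nv-p2p.h header shipped with the NVIDIA driver:

```c
/* Kernel-module side (sketch). Requires the nv-p2p.h header from the
 * NVIDIA driver source package. */
#include <nv-p2p.h>

static void free_callback(void *data)
{
    /* Invoked by the driver if the pinned buffer is revoked. */
}

static int pin_gpu_buffer(uint64_t va, uint64_t len)
{
    struct nvidia_p2p_page_table *page_table = NULL;

    /* p2p_token and va_space are 0 because they are deprecated. */
    int ret = nvidia_p2p_get_pages(0, 0, va, len,
                                   &page_table, free_callback, NULL);

    /* ret == 0 when va comes from cudaMalloc;
     * ret == -EINVAL when va comes from cudaMallocManaged. */
    return ret;
}
```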
- I am using CUDA V8.0.61 with the latest driver on CentOS 7.
- The GPU I am using is a Tesla P100 (Pascal).
- I pass 0 for both the p2p_token and va_space parameters, since they are deprecated.
- I have read the GPUDirect RDMA documentation (http://docs.nvidia.com/cuda/gpudirect-rdma/index.html), and it suggests that nvidia_p2p_get_pages should work with managed memory, although this is not recommended. I quote the relevant part of the document below.
CUDA Unified Memory is not explicitly supported in combination with GPUDirect RDMA. While the page table returned by nvidia_p2p_get_pages() is valid for managed memory buffers and provides a mapping of GPU memory at any given moment in time, the GPU device copy of that memory may be incoherent with the writable copy of the page which is not on the GPU. Using the page table in this circumstance may result in accessing stale data, or data loss, because of a DMA write access to device memory that is subsequently overwritten by the Unified Memory run-time. cuPointerGetAttribute() may be used to determine if an address is being managed by the Unified Memory runtime.
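Following the last sentence of that quote, here is a small user-space check I would expect to confirm that the pointer is indeed managed. This is a sketch for a CUDA-capable machine (link with -lcuda and -lcudart); CU_POINTER_ATTRIBUTE_IS_MANAGED and cuPointerGetAttribute are real driver-API names, while the allocation size is arbitrary:

```c
#include <stdio.h>
#include <stdint.h>
#include <cuda.h>
#include <cuda_runtime.h>

int main(void)
{
    void *p = NULL;

    /* Allocate managed memory, as in the failing case. */
    if (cudaMallocManaged(&p, 1 << 20, cudaMemAttachGlobal) != cudaSuccess) {
        fprintf(stderr, "cudaMallocManaged failed\n");
        return 1;
    }

    /* Ask the driver whether the Unified Memory runtime manages p. */
    unsigned int is_managed = 0;
    CUresult r = cuPointerGetAttribute(&is_managed,
                                       CU_POINTER_ATTRIBUTE_IS_MANAGED,
                                       (CUdeviceptr)(uintptr_t)p);
    if (r == CUDA_SUCCESS)
        printf("IS_MANAGED = %u\n", is_managed);
    else
        fprintf(stderr, "cuPointerGetAttribute failed: %d\n", (int)r);

    cudaFree(p);
    return 0;
}
```

In my case this kind of check reports the buffer as managed, which is why I expected nvidia_p2p_get_pages to return a valid page table rather than -EINVAL.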