Hello, I used the CUDA low-level virtual memory management API to allocate device memory. Does the allocated memory support GPUDirect operations?
The underlying device memory should be equivalent to memory allocated with e.g. cudaMalloc. Therefore, such memory should be usable with e.g. P2P, GPUDirect RDMA, and CUDA IPC.
I used the virtual memory management API (cuMemCreate/cuMemMap/cuMemSetAccess) to allocate memory on both GPU0 and GPU1, then set GPU1 as GPU0's peer, and then called cudaMemset on the memory on GPU1. The call returns the error cudaErrorInvalidValue. If I set GPU0 as GPU1's peer instead, the same error occurs. But if I allocate the memory with cudaMalloc and keep the other operations the same, it works well.
Did you use *enablePeerAccess? For low-level virtual memory allocation, peer access rights need to be set with
Yes, cudaDeviceEnablePeerAccess is called before I use cudaMemset on the peer device. And the access flags for cuMemSetAccess are CU_MEM_ACCESS_FLAGS_PROT_READ or CU_MEM_ACCESS_FLAGS_PROT_READWRITE; the other flags are invalid.
The governing blog is here and there is a provided sample code. You might want to study those and verify the sample code behavior in your setup. Then compare the differences between what you are doing and what the sample code is doing.
Thanks a lot. I will learn from the blog and sample.
Thanks a lot. Your suggestion was useful. I changed the parameters of cuMemSetAccess to add access rights for the other GPUs, and now P2P access works.