I’ve been working on some P2P DMA code and when everything is running in a single process it works fine. However, I need to have another process coordinate the DMAs. I can successfully pass my device pointer to another process using cudaIpcGetMemHandle/cudaIpcOpenMemHandle, but when I pass that to my kernel module to map the memory, nvidia_p2p_get_pages fails with -EINVAL. The documentation seems to hint that you need to set the (deprecated) va_space and p2ptoken parameters when dealing with another process’s memory, but the p2ptoken is always zero in the source process when I read it. Any suggestions? I don’t want to call get_pages in the source process because I don’t want to trust raw page addresses that come from the client process.
Maybe I am misunderstanding the desired functionality, but it seems to me that if implemented as described it would create a security hole that would delight computer virus creators looking for another vector to spread their malware.
The intent is that passing a virtual address is less risky, because we can check with CUDA and the NVIDIA kernel module that the virtual address represents an actual allocation. This should be much less risky than accepting physical page addresses from the client process.
Obviously there will also be some authentication between the client process and the orchestrator process to help secure it.
At any rate, the question stands: how do we call nvidia_p2p_get_pages with a pointer obtained by another process’s cudaMalloc()?
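For reference, here is a minimal sketch of the 64 KB alignment arithmetic that a kernel module would typically do on the address and length before handing them to nvidia_p2p_get_pages. It is modeled on the GPU_BOUND_* macros shown in NVIDIA's GPUDirect RDMA documentation. It does not answer the cross-process question, but a misaligned range is one thing worth ruling out when chasing -EINVAL. The struct pin_range type and the make_pin_range helper are hypothetical names used only for illustration, and the code is plain user-space C so the arithmetic can be checked outside the kernel.

```c
#include <stdint.h>

/* 64 KB GPU page granularity, following the GPU_BOUND_* macros in
 * NVIDIA's GPUDirect RDMA documentation. */
#define GPU_BOUND_SHIFT  16
#define GPU_BOUND_SIZE   ((uint64_t)1 << GPU_BOUND_SHIFT)
#define GPU_BOUND_OFFSET (GPU_BOUND_SIZE - 1)
#define GPU_BOUND_MASK   (~GPU_BOUND_OFFSET)

/* Hypothetical helper type: the aligned range a module would ask to pin. */
struct pin_range {
    uint64_t start;   /* device virtual address rounded down to 64 KB */
    uint64_t length;  /* bytes to pin, always a multiple of 64 KB */
    uint64_t offset;  /* offset of the original address inside the first GPU page */
};

/* Hypothetical helper: expand [addr, addr + size) to 64 KB boundaries.
 * Assumes addr + size does not overflow a 64-bit value. */
static struct pin_range make_pin_range(uint64_t addr, uint64_t size)
{
    struct pin_range r;
    /* Round the end of the buffer up to the next 64 KB boundary. */
    uint64_t end = (addr + size + GPU_BOUND_OFFSET) & GPU_BOUND_MASK;

    r.start  = addr & GPU_BOUND_MASK;
    r.offset = addr & GPU_BOUND_OFFSET;
    r.length = end - r.start;
    return r;
}
```

In a module, r.start and r.length would be the address and length arguments to nvidia_p2p_get_pages, and r.offset would be added back when programming the DMA engine.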
If my initial assessment of a security risk is correct, one could reasonably conclude that the behavior you observe is the rational consequence of NVIDIA consciously eliminating exposure to that risk.
Looking forward to an authoritative response by an NVIDIA engineer.