According to the documentation of GPUDirect on the Jetson AGX Xavier (GPUDirect RDMA on NVIDIA Jetson AGX Xavier | NVIDIA Developer Blog), the GPU memory should be allocated with cudaHostAlloc().
This API allocates host memory that is page-locked and accessible to the device, but it is still host memory, not device memory.
Now I am a little confused about this mechanism.
If the pointer provided to the communication library is a host-memory pointer, what role does the NVIDIA kernel driver play (if any)?
In fact, I observed data transfer to the buffer allocated with cuMemAllocHost even when I disabled the nv_peer_mem driver. This makes me wonder about the difference between GPUDirect RDMA and regular DMA on the Xavier.
If the buffer used for the DMA transfer is not a GPU buffer, isn’t it just regular DMA?
To transfer data from a third-party device to the Xavier for processing, I used the GPUDirect mechanism to transfer the data to a buffer allocated with cuMemAllocHost(). Is this equivalent to allocating regular host memory (using malloc), performing a regular, non-GPUDirect DMA transfer into it, and then calling cudaHostRegister to make the data accessible to the device?
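To make the comparison concrete, here is a minimal sketch of the two paths I am asking about, written against the CUDA runtime API. The function names (path_host_alloc, path_host_register) and the buffer size kBytes are hypothetical and only for illustration, the DMA step itself is left as a comment because it depends on the third-party device, and I am not claiming that the two paths behave identically on the Xavier; that is exactly my question.

```cuda
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical buffer size, chosen only for illustration.
static const size_t kBytes = 1 << 20;

// Path A: page-locked, device-mapped host memory obtained from CUDA.
// Returns 0 on success, -1 on any CUDA error.
int path_host_alloc(void) {
    void *host = NULL;
    void *dev = NULL;
    if (cudaHostAlloc(&host, kBytes, cudaHostAllocMapped) != cudaSuccess)
        return -1;
    // ... the third-party device would DMA into `host` here ...
    if (cudaHostGetDevicePointer(&dev, host, 0) != cudaSuccess) {
        cudaFreeHost(host);
        return -1;
    }
    // `dev` could now be passed to a kernel.
    cudaFreeHost(host);
    return 0;
}

// Path B: ordinary malloc'd memory, pinned and mapped after the fact.
// Returns 0 on success, -1 on allocation failure or any CUDA error.
int path_host_register(void) {
    void *host = malloc(kBytes);
    void *dev = NULL;
    if (host == NULL)
        return -1;
    // ... a regular (non-GPUDirect) DMA transfer would fill `host` here ...
    if (cudaHostRegister(host, kBytes, cudaHostRegisterMapped) != cudaSuccess) {
        free(host);
        return -1;
    }
    int status = 0;
    if (cudaHostGetDevicePointer(&dev, host, 0) != cudaSuccess)
        status = -1;
    // `dev` could now be passed to a kernel (when status == 0).
    cudaHostUnregister(host);
    free(host);
    return status;
}
```

Both functions can be called from a small main() and compiled with nvcc; each returns 0 if the allocation and mapping succeeded and -1 otherwise.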
Appreciate your time,