I am trying to use GPUDirect RDMA to send data that resides in GPU memory to a remote host, bypassing the GPU server's CPU.
When I register the GPU memory with the RDMA protection domain using an empty access flag, the send/recv operations all complete without reporting any error, but the data received on the remote host is all zeros. When I change the access flag of the GPU-side MR to IBV_ACCESS_LOCAL_WRITE, the remote host receives the correct data.
As I understand it, IBV_ACCESS_LOCAL_WRITE should not be required on the GPU side, because the RDMA HCA only reads the data in that region. What is the problem here?
System environment: I am using an NVIDIA A100 with CUDA 12.1 on the GPU side and ConnectX-6 InfiniBand cards on both sides.