GPUDirect RDMA with NVIDIA A100 and A40


I am building a GPU cluster using A100 and A40 GPUs.

I am setting it up with the nv_peer_mem, ucx, and gdr_copy modules.

Through this link, I found out that the A100 series supports GPUDirect RDMA.

So, I have the following questions:

  1. Can the A40 be used with GPUDirect RDMA?
  2. If so, will GPUDirect RDMA work between the A100 and the A40 when Server1 has the A100 and Server2 has the A40?
  3. The NVIDIA GPUDirect RDMA whitepaper* lists only the NVIDIA Tesla series as supported GPUs. Is that because the whitepaper has not been updated yet? (*