Hello,
I am building a GPU cluster using A100 and A40 GPUs.
I am setting up the stack with the nv_peer_mem, UCX, and GDRCopy modules.
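For context, here is a minimal sketch of how I am sanity-checking the stack on each node. It assumes a Linux host with the NVIDIA driver and UCX installed; the module name varies by driver generation (nv_peer_mem from the standalone nvidia-peer-memory package, nvidia_peermem when bundled with recent drivers), so I match both:

```shell
# Check whether a GPUDirect RDMA peer-memory module is loaded.
# Older stacks use nv_peer_mem; newer drivers ship nvidia_peermem.
lsmod | grep -E 'nv_peer_mem|nvidia_peermem' \
  || echo "no peer-memory module loaded"

# Check whether UCX detects CUDA / GDRCopy transports
# (requires ucx_info from the UCX installation on PATH).
ucx_info -d | grep -iE 'cuda|gdr' \
  || echo "no CUDA/GDRCopy transports reported by UCX"
```

Both checks pass on the A100 node; I would like to know whether the same is expected to work on the A40 node.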
https://forums.developer.nvidia.com/t/gpudirect-rdma-with-nvidia-a100-for-pcie/215032
From this thread, I learned that the A100 series supports GPUDirect RDMA.
So, I have the following questions:
- Does the A40 support GPUDirect RDMA?
- If so, will GPUDirect RDMA work between an A100 and an A40 in a two-server environment (Server1 with the A100, Server2 with the A40)?
- The NVIDIA GPUDirect RDMA system requirements page* lists only NVIDIA Tesla-series GPUs as supported. Is it simply that this page has not been updated yet? (*https://docs.nvidia.com/networking/display/GPUDirectRDMAv18/System+Requirements+and+Recommendations)
