How to use GPU Direct RDMA with InfiniBand ConnectX-4?

gehuang38 · January 14, 2024, 4:52am

I am having trouble setting up GPU Direct on my local machines. Here’s the local software and hardware:

GPU Tesla P100-SXM2
Adapter (MLNX)
5e:00.0 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4]
5e:00.1 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4]
Cuda compilation tools, release 10.1, V10.1.243
Ubuntu 20.04.3 LTS (GNU/Linux 5.4.0-167-generic x86_64)

I tested the RDMA connection by using ibping, and it works fine.

--- anton-j0.(none) (Lid 2) ibping statistics ---
10000 packets transmitted, 10000 received, 0% packet loss, time 1030 ms
rtt min/avg/max = 0.005/0.103/900.020 ms

However, when I tried to get GPU Direct RDMA running, nv_peer_mem wouldn’t install. Also, the GitHub demo indicates that it requires ConnectX-5 or later to work.

I tried to find other approaches that are compatible with ConnectX-4 but haven’t found anything useful yet. I checked the forum, and someone got a ConnectX-3 Pro to work with GPU Direct RDMA. Could someone give me some guidance on getting GPU Direct RDMA working on ConnectX-4?

ipavis · February 14, 2024, 7:14am

Hello, and thank you for writing to us.
This issue can have several causes.
GPU Direct RDMA is supported with any NVIDIA ConnectX-4 (or later) InfiniBand adapter card. This means that your ConnectX-4 adapters should be compatible with GPU Direct RDMA.
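A quick first check is whether a peer-memory kernel module is loaded at all. The following is only a minimal sketch in Python, not an official diagnostic: the helper name `loaded_peer_mem_modules` is hypothetical, and it assumes the module shows up in /proc/modules either as nv_peer_mem (the module mentioned in the question) or as nvidia_peermem (the name I would expect with newer NVIDIA drivers; treat that as an assumption for this setup).

```python
# Module names as they would appear in /proc/modules, where dashes are
# written as underscores. nv_peer_mem comes from the question; nvidia_peermem
# is an assumed alternative name used by newer NVIDIA driver packages.
PEER_MEM_MODULES = ("nv_peer_mem", "nvidia_peermem")


def loaded_peer_mem_modules(proc_modules_text):
    """Return the peer-memory modules found in /proc/modules-style text.

    Hypothetical helper: each non-empty line of /proc/modules starts with
    the module name, so only the first whitespace-separated field is used.
    """
    loaded = {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }
    return [name for name in PEER_MEM_MODULES if name in loaded]
```

On the machine itself this could be called as `loaded_peer_mem_modules(open("/proc/modules").read())`. An empty list would suggest that no peer-memory module is loaded, which matches the failed nv_peer_mem install described above.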
I would advise opening a case in our Enterprise service portal.
This will allow us to dig deeper into the issue and help.

Thanks and have a great day!
Ilan.
