ConnectX-7 NDR200 card cannot get userspace level library for RDMA

Hi,

I just installed the MLNX_OFED_LINUX-4.9-7.1.0.0 driver for an NVIDIA ConnectX-7 NDR200/HDR QSFP112 2-port PCIe Gen5 x16 InfiniBand adapter.

But ibv_devices fails to list any devices:

ibv_devices

libibverbs: Warning: no userspace device-specific driver found for /sys/class/infiniband_verbs/uverbs3
libibverbs: Warning: no userspace device-specific driver found for /sys/class/infiniband_verbs/uverbs2
libibverbs: Warning: no userspace device-specific driver found for /sys/class/infiniband_verbs/uverbs1
libibverbs: Warning: no userspace device-specific driver found for /sys/class/infiniband_verbs/uverbs0
device node GUID
------ ----------------
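As a rough way to quantify the symptom above, the sketch below counts how many uverbs nodes libibverbs reported as lacking a userspace provider. The function name `count_missing_providers` is hypothetical, and it only matches the warning text shown in the output above.

```shell
#!/bin/sh
# Hypothetical helper: count libibverbs "no userspace driver" warnings
# in text read from stdin (for example, captured ibv_devices output).
count_missing_providers() {
    # grep -c prints the number of matching lines (0 if none);
    # "|| true" keeps the exit status zero when nothing matches.
    grep -c 'no userspace device-specific driver found' || true
}

# Usage (assumes ibv_devices is installed and that the warnings are
# written to stderr, hence the 2>&1):
#   ibv_devices 2>&1 | count_missing_providers
```

A non-zero count alongside an empty device table suggests that the kernel side is loaded but no matching userspace provider was found for those devices.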

I installed the driver with:

./mlnxofedinstall --without-32bit --enable-affinity --hpc --without-iser --without-srp --without-opensm --with-infiniband-diags --without-fw-update --with-nfsrdma --force

The ibstat output looks OK for mlx5_0/1/2/3:

ibstat

CA 'mlx5_0'
CA type: MT4129
Number of ports: 1
Firmware version: 28.38.1002
Hardware version: 0
Node GUID: 0x946dae030060fd58
System image GUID: 0x946dae030060fd58
Port 1:
State: Active
Physical state: LinkUp
Rate: 200
Base lid: 712
LMC: 0
SM lid: 4
Capability mask: 0xa751e848
Port GUID: 0x946dae030060fd58
Link layer: InfiniBand
CA 'mlx5_1'
CA type: MT4129
Number of ports: 1
Firmware version: 28.38.1002
Hardware version: 0
Node GUID: 0x946dae030060fd59
System image GUID: 0x946dae030060fd58
Port 1:
State: Active
Physical state: LinkUp
Rate: 200
Base lid: 713
LMC: 0
SM lid: 4
Capability mask: 0xa751e848
Port GUID: 0x946dae030060fd59
Link layer: InfiniBand

The kernel modules are loaded:
lsmod | egrep -i "ib|rdma|verbs"
rdma_ucm 26934 0
ib_ucm 22566 0
rdma_cm 61162 1 rdma_ucm
iw_cm 43918 1 rdma_cm
ib_ipoib 176977 0
ib_cm 53064 3 rdma_cm,ib_ucm,ib_ipoib
ib_umad 27587 0
mlx5_ib 398193 0
ib_uverbs 134646 3 mlx5_ib,ib_ucm,rdma_ucm
mlx5_core 1175358 2 mlx5_ib,mlx5_fpga_tools
mlx4_ib 220791 0
ib_core 379768 10 rdma_cm,ib_cm,iw_cm,mlx4_ib,mlx5_ib,ib_ucm,ib_umad,ib_uverbs,rdma_ucm,ib_ipoib
mlx4_core 361102 2 mlx4_en,mlx4_ib
mlx_compat 47141 15 rdma_cm,ib_cm,iw_cm,mlx4_en,mlx4_ib,mlx5_ib,mlx5_fpga_tools,ib_ucm,ib_core,ib_umad,ib_uverbs,mlx4_core,mlx5_core,rdma_ucm,ib_ipoib
devlink 60067 4 mlx4_en,mlx4_ib,mlx4_core,mlx5_core

IPoIB seems to work: I can ping other working nodes from the ib0 interface/address.

The library is there.

ls /lib64/libibverbs.so* -alht

lrwxrwxrwx 1 root root 19 Nov 29 10:36 /lib64/libibverbs.so -> libibverbs.so.1.0.0

lrwxrwxrwx 1 root root 19 Nov 29 10:35 /lib64/libibverbs.so.1 -> libibverbs.so.1.0.0

-rwxr-xr-x 1 root root 103K Jun 6 11:05 /lib64/libibverbs.so.1.0.0

A signing-key error appears in dmesg/syslog, but I think it is harmless:
[Wed Nov 29 10:44:57 2023] Request for unknown module key 'Mellanox Technologies signing key: 61feb074fc7292f958419386ffdd9d5ca999e403' err -11

I have reinstalled the driver, but it still does not work.

What could be wrong?

Thanks,

Wei

I tried the MLNX_OFED_LINUX-5.8-2.0.3.0 driver and it seems to work. But its ib_send_bw commands are not compatible with versions older than 5.7, and I have 4.9 drivers on other nodes.

So MLNX_OFED 4.9 does not support ConnectX-7?

Hello Wei,

Thank you for reaching out to the NVIDIA Networking Community.

Please note that version 4.9-7.1.0.0 does not support ConnectX-7. For ConnectX-7 support, you will need MLNX_OFED version 5.4-3.7.5.0 or later. For details on supported NIC speeds, kindly refer to the ‘Supported NICs Speeds’ section in the release notes for both versions:

4.9-7.1.0.0 Release Notes: Release Notes - NVIDIA Docs
5.4-3.7.5.0 Release Notes: Release Notes - NVIDIA Docs
5.8-3.0.7.0 Release Notes: Release Notes - NVIDIA Docs
23.10-0.5.5 Release Notes: Release Notes - NVIDIA Docs
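The version threshold above can be checked on a node with a small script. This is a minimal sketch, assuming GNU `sort -V` is available and that `ofed_info -s` prints a string of the form `MLNX_OFED_LINUX-<version>:`. The function names are hypothetical, and the 5.4-3.7.5.0 minimum is taken from the reply above, not from an independent check of the release notes.

```shell
#!/bin/sh
# Minimum MLNX_OFED release for ConnectX-7, as stated in the reply above.
MIN_CX7_OFED="5.4-3.7.5.0"

# Strip the assumed "MLNX_OFED_LINUX-" prefix and trailing ":" so that
# only the version remains; a bare version string passes through unchanged.
normalize_ofed_version() {
    printf '%s\n' "$1" | sed -e 's/^MLNX_OFED_LINUX-//' -e 's/:$//'
}

# Succeed (exit 0) if version $1 >= version $2.
# Hyphens are turned into dots so every field is compared numerically.
version_ge() {
    a=$(printf '%s\n' "$1" | sed 's/-/./g')
    b=$(printf '%s\n' "$2" | sed 's/-/./g')
    lowest=$(printf '%s\n%s\n' "$a" "$b" | sort -V | head -n 1)
    [ "$lowest" = "$b" ]
}

# Print "yes" if the given MLNX_OFED version meets the ConnectX-7 minimum.
supports_cx7() {
    v=$(normalize_ofed_version "$1")
    if version_ge "$v" "$MIN_CX7_OFED"; then
        echo yes
    else
        echo no
    fi
}

# Usage on an installed node (assumes ofed_info is in PATH):
#   supports_cx7 "$(ofed_info -s)"
```

Running the same check across all nodes before choosing a driver may help in a mixed cluster, since the versions installed on each node determine which tools interoperate.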

Julia Jin

Thank you and regards,
NVIDIA Networking Technical Support

Thank you so much for the solution.

