ConnectX-5 25GbE missing RDMA devices

On an Ubuntu 20.04 system we have a ConnectX-5 dual-port 25GbE adapter (MCX512F-ACAT). TCP communication over a port of this card works, but we see no RDMA device.

# ibv_devinfo
No IB devices found
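
A cross-check that does not depend on ibv_devinfo is to look at sysfs, where the kernel lists the RDMA devices it has registered under /sys/class/infiniband. The following is a minimal sketch; the function name list_rdma_devices is made up for illustration, and an empty result only confirms that no device was registered, not why.

```python
import os


def list_rdma_devices(sysfs_root="/sys/class/infiniband"):
    """Return the sorted names of RDMA devices the kernel has registered.

    An empty list means the directory is missing or empty, i.e. no RDMA
    device was registered (for example because the mlx5_ib probe failed).
    """
    try:
        return sorted(os.listdir(sysfs_root))
    except FileNotFoundError:
        # The directory does not exist when the RDMA core has nothing to show.
        return []


if __name__ == "__main__":
    devices = list_rdma_devices()
    if devices:
        print("RDMA devices:", ", ".join(devices))
    else:
        print("no RDMA devices registered")
```

On a node where the card works, this would be expected to list names such as mlx5_0, matching the device name in the kernel messages.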

system: Ubuntu 20.04.4
kernel 5.4.0-100-generic (5.4.0-126-generic also tested)
MOFED 5.4-3.5.8.0-ubuntu20.04
adapter FW: 16.34.1002

On boot we see some messages that do not appear on other nodes:

[ 2794.490435] infiniband mlx5_0: mlx5_ib_test_wc:392:(pid 1000939): Error -110 while trying to test write-combining support
[ 2794.700217] mlx5_ib.rdma: probe of mlx5_core.rdma.0 failed with error -12
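
The numeric codes in these two messages are negated Linux errno values. As a small sketch, the snippet below extracts them from the log lines and looks them up in a hand-written table that covers only the two codes seen here (12 and 110, as defined in the kernel's generic errno headers). The helper name decode_kernel_errors is hypothetical.

```python
import re

# Linux errno values; only the two codes that appear in the messages above.
LINUX_ERRNO = {
    12: ("ENOMEM", "Out of memory"),
    110: ("ETIMEDOUT", "Connection timed out"),
}

# Matches "Error -110" and "error -12" as printed by the kernel.
ERROR_RE = re.compile(r"[Ee]rror (-\d+)")

SAMPLE = (
    "[ 2794.490435] infiniband mlx5_0: mlx5_ib_test_wc:392:(pid 1000939): "
    "Error -110 while trying to test write-combining support\n"
    "[ 2794.700217] mlx5_ib.rdma: probe of mlx5_core.rdma.0 failed with error -12\n"
)


def decode_kernel_errors(log_text):
    """Return (code, name, description) for every 'error -N' in log_text."""
    results = []
    for match in ERROR_RE.finditer(log_text):
        code = -int(match.group(1))  # the kernel prints the negated errno
        name, text = LINUX_ERRNO.get(code, ("UNKNOWN", "not in table"))
        results.append((code, name, text))
    return results


if __name__ == "__main__":
    for code, name, text in decode_kernel_errors(SAMPLE):
        print(f"-{code}: {name} ({text})")
```

Read this way, the write-combining test timed out (ETIMEDOUT) and the mlx5_ib probe then failed with ENOMEM, which would explain why no RDMA device was registered even though the Ethernet side works. The codes alone do not say what caused the timeout.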

Thanks for any ideas, Ralf

Hi Ralf,

Thank you for contacting us.
After looking at the details you added in the forum, I can see that your MOFED version is not supported by the FW you are using.
Please review this link:
https://docs.nvidia.com/networking/display/ConnectX5Firmwarev16341002/Firmware+Compatible+Products#FirmwareCompatibleProducts-DriverSoftware,ToolsandSwitchFirmware

Installing the supported version of MOFED should fix your issue.

Thanks and have a great day!
Ilan.

Hi Ilan,

Thanks for your answer. I looked up the docs at the provided link and installed MOFED 5.7-1.0.2.0, which corresponds to firmware version 16.34.1002. I still get the ibv_devinfo output “no devices” …
I also tried the downgraded firmware 16.31.2006 with MOFED 5.4-3.5.8.0, which did not show RDMA devices either.
What else can I check? Besides ordering a Dell OEM mlx adapter for this machine…
Kind regards, Ralf

We have now installed a Dell OEM ConnectX-5 card (P/N 04TRD3), and with it we see the RDMA devices. This card shows PSID DEL0000000016, while the other card shows MT_0000000183.

Best regards, Ralf

The MCX512F-ACAT (MT_0000000183) is now running in an older Primergy RX200 S7 server, and ibv_devices shows the devices there.
Best regards, Ralf