ConnectX-6 Dx missing RDMA

Hello,
I have two identical servers, one of which is having problems with RDMA. The problem has already been described well in the topic "ConnectX-5 25GbE missing RDMA devices" (Adapters and Cables / Ethernet Adapter Cards, NVIDIA Developer Forums).

However, the problem was not solved in that topic. I have a Mellanox card with part number MCX623106PE-CDAT, and I am using firmware 22.37.1014 (MT_0000000606) and driver 5.8-1.1.2. This should match the recommendation, but it still does not work.

I even replaced it with a Mellanox ConnectX-6 with part number 0F6FXM (Dell), and the problem is the same.

The strange thing is that it works on the other server, and it previously worked on the “faulty” server with the same configuration. I have a fresh Ubuntu 22.04 (5.15.0-75-generic) installation, so there is nothing installed that could prevent it from working.
Any suggestions on how to fix this?

Please download and install the OFED driver.

https://docs.nvidia.com/networking/display/MLNXOFEDv590560/Installation

Hi @xiaofengl, as you can see in my description of the problem, I already have MLNX_OFED installed. In fact, I believe I have the correct version installed.

One thing that has mitigated the problem for now is disabling SR-IOV for that particular Mellanox card. It is really strange, though, since it worked before. I have no idea what the underlying problem is, but at least it works with SR-IOV disabled.
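
For anyone comparing a working and a "faulty" server, here is a minimal Python sketch that lists the RDMA devices the kernel has registered and shows how many SR-IOV virtual functions are currently enabled per interface. It only reads the standard Linux sysfs locations /sys/class/infiniband and /sys/class/net/<iface>/device/sriov_numvfs. The function names and the sys_root parameter are my own, for illustration only. It does not inspect firmware or BIOS SR-IOV settings, and the thread does not say at which level SR-IOV was disabled, so treat the output as a hint rather than a diagnosis.

```python
from pathlib import Path


def rdma_devices(sys_root="/sys"):
    """Return the names of RDMA devices registered with the kernel."""
    ib_dir = Path(sys_root) / "class" / "infiniband"
    if not ib_dir.is_dir():
        # The directory is absent when no RDMA driver has registered a device.
        return []
    return sorted(p.name for p in ib_dir.iterdir())


def sriov_vf_counts(sys_root="/sys"):
    """Map each SR-IOV capable interface to its current number of VFs."""
    counts = {}
    net_dir = Path(sys_root) / "class" / "net"
    if not net_dir.is_dir():
        return counts
    for iface in sorted(net_dir.iterdir()):
        # Only SR-IOV capable physical functions expose this attribute.
        numvfs = iface / "device" / "sriov_numvfs"
        if numvfs.is_file():
            counts[iface.name] = int(numvfs.read_text().strip())
    return counts


if __name__ == "__main__":
    print("RDMA devices:", rdma_devices() or "none found")
    for name, vfs in sriov_vf_counts().items():
        print(f"{name}: {vfs} VF(s) enabled")
```

Running it on both servers and comparing the output should show whether the RDMA devices are missing only on the host where virtual functions are enabled.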