nvidia-fabricmanager fails to start unless the ib_umad module is loaded

Hello,

I’m currently facing an issue with nvidia-fabricmanager and would like to ask for some guidance.

I’m using a system equipped with an HGX-B200 GPU board and ConnectX-7, with the following software versions:

  • OS: Rocky Linux 9.6
  • Kernel: 5.14.0
  • NVIDIA Driver: 580.105.08
  • NVIDIA Fabric Manager: 580.105.08
  • NVIDIA NSCQ: 580.105.08
  • DOCA: DOCA Host (doca-all) 3.2.0 LTS

The problem is that nvidia-fabricmanager fails to start because the ib_umad module is not loaded automatically.
If I load the ib_umad module manually, Fabric Manager starts and works normally.
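
For anyone trying to reproduce this, here is a minimal sketch of how the module state could be checked before starting the service. It assumes the standard /proc/modules layout (module name as the first field on each line); the helper names is_module_loaded and read_proc_modules are hypothetical and not part of any NVIDIA tooling.

```python
from pathlib import Path


def is_module_loaded(name: str, proc_modules_text: str) -> bool:
    """Return True if `name` is listed as a loaded module.

    Assumption: `proc_modules_text` has the /proc/modules layout, where each
    line starts with the module name followed by whitespace. Modules built
    into the kernel do not appear there, so this only detects loadable ones.
    """
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == name:
            return True
    return False


def read_proc_modules(path: str = "/proc/modules") -> str:
    """Read the module list, returning an empty string if the file is absent."""
    p = Path(path)
    return p.read_text() if p.exists() else ""


if __name__ == "__main__":
    if is_module_loaded("ib_umad", read_proc_modules()):
        print("ib_umad is loaded")
    else:
        # Manual workaround described above: load the module by hand.
        print("ib_umad is not loaded; try: sudo modprobe ib_umad")
```

Running this right after boot, before starting nvidia-fabricmanager, should show whether the module is really missing at that point or only gets loaded later.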

I would like to know what might be causing this, and whether there are any known configuration requirements for loading the module automatically.

In addition, I’d like to confirm whether the nvlsm package is mandatory when using Fabric Manager on HGX-B200 systems.
On our previous system, equipped with HGX-H200, we did not experience this issue.

Any insight or recommendations would be greatly appreciated.
Thank you!