mlx5_ib and mlx5_core cannot load with MLNX_OFED 5.9 on kernel 5.15

Hi,

I successfully installed MLNX_OFED 5.9 (I also tried 5.8 and 5.4), but when I run sudo /etc/init.d/openibd restart (or force-restart), I get the following error:

Unloading HCA driver: [ OK ]
Loading Mellanox MLX5_IB HCA driver: [FAILED]
Loading Mellanox MLX5 HCA driver: [FAILED]
Loading HCA driver and Access Layer: [FAILED]

Please run /usr/sbin/sysinfo-snapshot.py to collect the debug information
and open an issue in the http://support.mellanox.com/SupportWeb/service_center/SelfService

Notes on my system: Ubuntu 22.04, kernel 5.15.60 generic, NIC: Mellanox ConnectX-4.

Thanks a lot in advance.

Hi nnaza008,

Thank you for contacting Nvidia Support. When compiling MLNX OFED drivers for a different kernel version (non-default), please make sure to compile the drivers with the --add-kernel-support flag.

The default kernel versions with which the drivers have been compiled and tested are listed in the Release notes - https://docs.nvidia.com/networking/display/MLNXOFEDv590560/General+Support

For the list of installation options, please run: ./mlnxofedinstall --help

  1. Please uninstall the unsuccessful MLNX OFED installation with → # ./uninstall.sh
  2. Compile and install OFED with kernel support → # ./mlnxofedinstall --add-kernel-support
  3. Restart the drivers to make sure that there are no errors → # /etc/init.d/openibd restart
  4. Check the modinfo for Mellanox kernel modules → # modinfo mlx5_core | grep filename
    filename: /lib/modules/5.3.18-150300.59.106-default/kernel/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.ko.xz

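Make sure that the reported filename is located under the “extra” directory. A path under kernel/drivers/..., like the one in the sample output above, means that the in-tree (inbox) module is being picked up instead of the MLNX OFED one.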
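
The check in step 4 can be scripted for both modules from the original error. This is only a rough sketch: the helper name classify_module_path is made up for illustration, and treating an "updates" directory as out-of-tree is an assumption about DKMS-based installs, not something stated in the reply above.

```shell
#!/bin/sh
# Hypothetical helper: classify a module path by where it lives on disk.
# Assumption: out-of-tree builds land under extra/ (or updates/ for DKMS),
# while the distribution's own modules live under kernel/drivers/.
classify_module_path() {
    case "$1" in
        */extra/*|*/updates/*) echo "out-of-tree" ;;
        */kernel/drivers/*)    echo "in-tree" ;;
        *)                     echo "unknown" ;;
    esac
}

# Report the location of the two modules that failed to load.
for mod in mlx5_core mlx5_ib; do
    # modinfo -n prints only the module's filename.
    path=$(modinfo -n "$mod" 2>/dev/null) || path=""
    if [ -z "$path" ]; then
        echo "$mod: not found by modinfo"
    else
        echo "$mod: $(classify_module_path "$path") ($path)"
    fi
done
```

If either module is reported as in-tree after the reinstall, the drivers built with --add-kernel-support are probably not the ones being loaded.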

Thanks,
Nvidia Support

I had the same problem with MLNX_OFED 5.9 on Ubuntu 22.04.
It turned out that Secure Boot was preventing the kernel modules from loading.
I followed a guide to disable Secure Boot, and now it works.
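
For anyone who wants to confirm this before changing firmware settings, here is a minimal sketch that reads the Secure Boot state with mokutil and looks for related kernel log lines. The helper name secure_boot_state and the grep patterns are my own assumptions; the exact log wording varies between kernels, and dmesg may need root.

```shell
#!/bin/sh
# Hypothetical helper: map the text printed by `mokutil --sb-state`
# to a simple enabled/disabled/unknown answer.
secure_boot_state() {
    case "$1" in
        *"SecureBoot enabled"*)  echo "enabled" ;;
        *"SecureBoot disabled"*) echo "disabled" ;;
        *)                       echo "unknown" ;;
    esac
}

# mokutil may be missing or unsupported (e.g. legacy BIOS boot);
# in that case the state is reported as unknown.
state_output=$(mokutil --sb-state 2>/dev/null) || state_output=""
echo "Secure Boot: $(secure_boot_state "$state_output")"

# Assumed patterns: lines that often accompany a rejected unsigned module.
dmesg 2>/dev/null | grep -i -E 'mlx5|module verification|lockdown' | tail -n 20
```

If Secure Boot is enabled and the log shows the mlx5 modules being rejected, the failure is likely a signing issue rather than a build problem.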