How to solve "FATAL: Module mst_pci not found in directory"?

Hello,

Since we installed InfiniBand on our servers with ConnectX-5 cards, I have noticed that each time I upgrade Linux, the mst driver returns an error like “FATAL: Module mst_pci not found in directory”, which means that I can no longer use the InfiniBand system until I reinstall it.

Here is a fresh example to illustrate that problem on one server:

uname -a

→ Linux kauai 5.4.0-42-generic #46~18.04.1-Ubuntu SMP Fri Jul 10 07:21:24 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

sudo apt-get update && sudo apt-get upgrade

sudo reboot

uname -a

→ Linux kauai 5.4.0-47-generic #51~18.04.1-Ubuntu SMP Sat Sep 5 14:35:50 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

sudo -s

(root)# opensm &

(root)# /etc/init.d/opensmd start

(root)# mst start

→ “Loading MST PCI modulemodprobe: FATAL: Module mst_pci not found in directory /lib/modules/5.4.0-47-generic”

I should add that mlnxofed (MLNX_OFED_LINUX-5.1-0.6.6.0) was installed with this command:

sudo ./mlnxofedinstall --enable-opensm --add-kernel-support

Is there a way to make the mlnxofed installation “robust” to Ubuntu upgrades, so that I don’t have to reinstall it each time?

Thank you in advance for any advice.

Regards,

Clément

Hi Clément,

This is expected behavior. Every time you install MLNX_OFED, it is compiled against the OS and kernel modules that are current at that moment. If you then upgrade the OS/kernel modules, the OFED/OpenSM won’t work with the new ones.

Therefore, each time you upgrade the OS/kernel modules, you will need to reinstall MLNX_OFED.
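As a rough illustration only (not an official Mellanox procedure), a small shell check like the one below could be run after a kernel upgrade to see whether mst_pci exists for the running kernel before trying "mst start". The function name module_present and its optional arguments are made up for this sketch, and it assumes that modules live under /lib/modules/<kernel release>, as the error message above suggests.

```shell
#!/bin/sh
# Hypothetical helper (not part of MLNX_OFED or MFT).
# Succeeds if a file for module $1 exists under the module tree of a kernel release.
#   $1 = module name
#   $2 = kernel release (default: the running kernel, from uname -r)
#   $3 = modules root   (default: /lib/modules)
module_present() {
    mod="$1"
    rel="${2:-$(uname -r)}"
    root="${3:-/lib/modules}"
    # Module files may be compressed (.ko, .ko.xz, ...), hence the trailing wildcard.
    [ -n "$(find "$root/$rel" -name "$mod.ko*" 2>/dev/null | head -n 1)" ]
}

if module_present mst_pci; then
    echo "mst_pci is available for kernel $(uname -r)"
else
    echo "mst_pci is missing for kernel $(uname -r): MLNX_OFED needs to be reinstalled"
fi
```

If the check reports the module as missing, the fix is the reinstall described above, for example by rerunning the mlnxofedinstall command from the original post on the upgraded kernel.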

Thanks,

Samer

Hi Samer,

Thank you for your answer. From now on, I won’t upgrade my Ubuntu so regularly!

Have a nice week.

Clément