Hello,
Since we installed InfiniBand on our servers with ConnectX-5 cards, I have noticed that each time I upgrade Linux, the mst driver returns an error like “FATAL: Module mst_pci not found in directory”, which means that I can no longer use the InfiniBand system until I reinstall it.
Here is a fresh example to illustrate that problem on one server:
uname -a
→ Linux kauai 5.4.0-42-generic #46~18.04.1-Ubuntu SMP Fri Jul 10 07:21:24 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
sudo apt-get update && sudo apt-get upgrade
sudo reboot
uname -a
→ Linux kauai 5.4.0-47-generic #51~18.04.1-Ubuntu SMP Sat Sep 5 14:35:50 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
sudo -s
(root)# opensm &
(root)# /etc/init.d/opensmd start
(root)# mst start
→ “Loading MST PCI modulemodprobe: FATAL: Module mst_pci not found in directory /lib/modules/5.4.0-47-generic”
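The same thing can be confirmed without starting mst by checking whether a module file exists for the running kernel. This is only a minimal sketch: it assumes the module is installed as a file named mst_pci.ko (possibly compressed) somewhere under /lib/modules/<kernel release>, and the helper name module_present is made up for illustration.

```shell
#!/bin/sh
# Sketch: report whether a kernel module file exists for a given kernel.
# Assumption: the module is a file named <name>.ko (optionally compressed,
# e.g. .ko.xz) somewhere under <modules_root>/<kernel_release>.
# module_present is a hypothetical helper, not part of MFT or MLNX_OFED.
module_present() {
    modules_root=$1   # normally /lib/modules
    kernel=$2         # kernel release, e.g. the output of `uname -r`
    name=$3           # module name, e.g. mst_pci

    # No module tree for this kernel at all: nothing can be present.
    [ -d "$modules_root/$kernel" ] || return 1

    # Succeeds only if find printed at least one matching path.
    [ -n "$(find "$modules_root/$kernel" -type f -name "$name.ko*" 2>/dev/null)" ]
}

# Check the kernel that is currently running.
if module_present /lib/modules "$(uname -r)" mst_pci; then
    echo "mst_pci is available for $(uname -r)"
else
    echo "mst_pci is missing for $(uname -r)"
fi
```

On the server above, I would expect this to report the module as missing for 5.4.0-47-generic, since that is what modprobe complains about.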
I should add that mlnxofed (MLNX_OFED_LINUX-5.1-0.6.6.0) was installed with this command:
sudo ./mlnxofedinstall --enable-opensm --add-kernel-support
Is there a way to make the mlnxofed installation “robust” to Ubuntu upgrades, so that I don’t have to reinstall it each time?
Thank you in advance for any advice.
Regards,
Clément