Why can't I load the MLX5 module in kernel 4.19?

Hi,

When I installed the driver in my VM, everything looked good until I restarted openibd; at that point, loading the MLX5 module failed.

[root@a31070219959 MLNX_OFED_LINUX-4.19.36]# uname -a

Linux a31070219959 4.19.36 #1 SMP Mon Jul 22 00:00:00 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

[root@a31070219959 MLNX_OFED_LINUX-4.19.36]# ./mlnxofedinstall --distro rhel7.3

Device (00:06.0):

00:06.0 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4]

Link Width: x16

PCI Link Speed: 8GT/s

Installation finished successfully.

Preparing… ################################# [100%]

Updating / installing…

1:mlnx-fw-updater-4.6-1.0.1.1 ################################# [100%]

Added 'RUN_FW_UPDATER_ONBOOT=no to /etc/infiniband/openib.conf

Attempting to perform Firmware update…

Querying Mellanox devices firmware …

Device #1:


Device Type: ConnectX4

Part Number: MCX456A-ECA_Ax

Description: ConnectX-4 VPI adapter card; EDR IB (100Gb/s) and 100GbE; dual-port QSFP28; PCIe3.0 x16; ROHS R6

PSID: MT_2190110032

PCI Device Name: 00:06.0

Base GUID: 248a070300398b86

Versions:   Current        Available
FW          12.25.1020     12.25.1020
PXE         3.5.0701       3.5.0701
UEFI        14.18.0019     14.18.0019

Status: Up to date

Log File: /tmp/MLNX_OFED_LINUX.165.logs/fw_update.log

To load the new driver, run:

/etc/init.d/openibd restart

[root@a31070219959 MLNX_OFED_LINUX-4.19.36]# /etc/init.d/openibd restart

Unloading HCA driver: [ OK ]

install: unrecognized option ‘--ignore-install’

Try ‘install --help’ for more information.

Loading Mellanox MLX5_IB HCA driver: [FAILED]

Loading HCA driver and Access Layer: [FAILED]

Please run /usr/sbin/sysinfo-snapshot.py to collect the debug information

and open an issue in the http://support.mellanox.com/SupportWeb/service_center/SelfService

If I load mlx5_core.ko manually, dmesg shows the following:

[ 107.825031] mlx5_core 0000:00:06.0: firmware version: 12.25.1020

[ 107.825214] mlx5_core 0000:00:06.0: 126.016 Gb/s available PCIe bandwidth (8 GT/s x16 link)

[ 109.987278] mlx5_core 0000:00:06.0: Port module event: module 0, Cable unplugged

[ 110.125490] mlx5_core 0000:00:06.0: mlx5_fw_tracer_start:775:(pid 193): FWTracer: Ownership granted and active

[ 167.363884] mlx5_ib: Mellanox Connect-IB Infiniband driver v4.6-1.0.1

[ 167.380930] infiniband mlx5_0: mlx5_ib_gsi_create_qp:178:(pid 214): unable to create send CQ for GSI QP. error -22

[ 167.381125] infiniband mlx5_0: Couldn't create ib_mad QP1

[ 167.383943] infiniband mlx5_0: Couldn't open port 1

[ 167.385875] mlx5_core 0000:00:06.0: MLX5E: StrdRq(0) RqSz(1024) StrdSz(64) RxCqeCmprss(0)

[ 167.386036] mlx5_core 0000:00:06.0: MLX5E: StrdRq(0) RqSz(1024) StrdSz(64) RxCqeCmprss(0)

[ 167.747534] infiniband mlx5_0: Port 1 not found

[ 167.747628] infiniband mlx5_0: Couldn't close port 1 for agents

[ 167.747720] infiniband mlx5_0: Port 1 not found

[ 167.747808] infiniband mlx5_0: Couldn't close port 1

[ 192.465344] knem 1.1.3.90mlnx1: initialized
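To pick out which lines in a log like the one above indicate failures, a filter along these lines may help. It is only a sketch: the function name `mlx5_failures` and the keyword list are assumptions chosen for illustration and are not part of MLNX_OFED.

```shell
#!/bin/sh
# Hypothetical helper: keep only mlx5/infiniband kernel messages that
# look like failures. The keyword list is an assumption, not exhaustive.
mlx5_failures() {
    grep -E 'mlx5|infiniband' | grep -Ei 'unable|couldn|failed|error|not found'
}

# Usage on a live system (reads the kernel ring buffer):
#   dmesg | mlx5_failures
```

Applied to the log above, this would keep the GSI QP, port open/close, and "Port 1 not found" lines and drop the informational ones.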

I need some help here, as I am new to Mellanox. Thanks a lot.

Hi Danni,

Please uninstall the current driver and try installing it again with the --add-kernel-support flag:

./mlnxofedinstall --add-kernel-support --distro rhel7.3
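The --add-kernel-support flag is intended to rebuild the driver packages against the running kernel. One way to check afterwards whether the installed mlx5_core module was really built for that kernel is to compare the module's vermagic string with the output of `uname -r`. A minimal sketch, assuming `modinfo -F vermagic` is available in the guest; the function name `vermagic_matches` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a module's vermagic string was built
# for the given kernel release. The first whitespace-separated field of
# vermagic is the kernel release the module was compiled against.
vermagic_matches() {
    vermagic=$1
    release=$2
    [ "${vermagic%% *}" = "$release" ]
}

# Usage on a live system:
#   if vermagic_matches "$(modinfo -F vermagic mlx5_core)" "$(uname -r)"; then
#       echo "mlx5_core matches the running kernel"
#   else
#       echo "mlx5_core was built for a different kernel"
#   fi
```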

Thanks,

Samer

Hi Samer,

Thanks for your reply, but it doesn't work; I still get the same error.

Danni

Hi Danni,

Could you please try the following command?

/etc/init.d/openibd force-restart

Thanks

Samer

Hi Samer,

Still the same error…

[root@4d4b60700913 MLNX_OFED_LINUX-4.19.36]# /etc/init.d/openibd force-restart

Unloading HCA driver: [ OK ]

install: unrecognized option ‘--ignore-install’

Try ‘install --help’ for more information.

Loading Mellanox MLX5_IB HCA driver: [FAILED]

Loading HCA driver and Access Layer: [FAILED]

Thanks,

Danni

Hi Danni,

Could you please provide more information about your setup?

What are the hypervisor and the VM? Which OS is running on each?

Is it Linux KVM or ESXi?
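Details like these have to be gathered on both the hypervisor and the guest. A minimal sketch of a script that could be run on each side follows. The function name `setup_report` is hypothetical, and the script assumes a POSIX shell plus, optionally, the `ofed_info` utility that MLNX_OFED installs.

```shell
#!/bin/sh
# Hypothetical helper: print basic facts about the current system.
# Each probe is guarded because a minimal image (e.g. busybox) may lack it.
setup_report() {
    echo "kernel: $(uname -r)"
    if [ -r /etc/os-release ]; then
        echo "os: $(. /etc/os-release && echo "$PRETTY_NAME")"
    fi
    if command -v modprobe >/dev/null 2>&1; then
        echo "modprobe: $(command -v modprobe)"
    fi
    if command -v ofed_info >/dev/null 2>&1; then
        echo "ofed: $(ofed_info -s)"
    fi
}

setup_report
```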

Thanks,

Samer

Hi Samer,

I compiled the MLNX driver on the host against the 4.19.36 kernel source.

I then start a qemu-kvm VM whose kernel version is 4.19.36; the OS running in the VM is just a busybox initrd image.

After the VM has started, I enter the VM shell to install the driver.

I tried 4.1 before and it worked well.

Thanks,

Danni

Hi Danni,

Did you try installing MLNX_OFED 4.5? Does it work?

https://www.mellanox.com/page/products_dyn?product_family=26&mtag=linux_sw_drivers

Thanks,

Samer