On Ubuntu 14.04, Loading Mellanox MLX4 HCA driver: [FAILED]

Hi,

I have an AMD machine with ConnectX-4 cards. The cards show up in lspci, and the OFED installation (MLNX_OFED_LINUX-4.6-1.0.1.1-ubuntu14.04-x86_64; I tried 4.2 as well) completes successfully, but the drivers fail to load.

/etc/init.d/openibd restart shows the following output:

Unloading HCA driver: [ OK ]

Loading Mellanox MLX4 HCA driver: [FAILED]

Loading Mellanox MLX4_IB HCA driver: [FAILED]

Loading Mellanox MLX4_EN HCA driver: [FAILED]

Loading Mellanox MLX5 HCA driver: [FAILED]

Loading Mellanox MLX5_IB HCA driver: [FAILED]

Loading Mellanox MLX5 FPGA Tools driver: [FAILED]

Loading HCA driver and Access Layer: [FAILED].

Kernel version: 3.13.0-153-generic, Ubuntu 14.04.5 LTS

I need some help here, as I am new to Mellanox.

I was able to solve it by using a hack in /usr/src/linux-headers-3.13.0-144/include/linux/vermagic.h:

commenting out the line that contains #ifdef RETPOLINE.
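For anyone trying to confirm whether a vermagic mismatch is really what blocks the module load, here is a minimal sketch that compares two vermagic strings and reports whether the retpoline token is missing from either. The function names are hypothetical helpers, not part of MLNX_OFED, and REF_MODULE stands for any loadable module shipped with the running kernel; only `modinfo -F vermagic` is a real command here.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers for comparing module vermagic strings.

# Succeeds if the vermagic string in $1 contains the "retpoline" token.
has_retpoline() {
    case " $1 " in
        *" retpoline "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Prints "match" or "mismatch" for two vermagic strings, and notes
# which of the two lacks the retpoline token when they differ.
compare_vermagic() {
    if [ "$1" = "$2" ]; then
        echo "match"
    else
        echo "mismatch"
        has_retpoline "$1" || echo "first string lacks retpoline"
        has_retpoline "$2" || echo "second string lacks retpoline"
    fi
}

# Usage on a live system (REF_MODULE is an assumption: pick any module
# that ships with the running kernel):
#   compare_vermagic "$(modinfo -F vermagic mlx5_core)" \
#                    "$(modinfo -F vermagic "$REF_MODULE")"
```

If the two strings differ only by the retpoline token, rebuilding the driver against the matching headers with a retpoline-capable compiler is probably a cleaner fix than editing vermagic.h.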

However, dmesg still shows these errors:

mlx5_core 0000:0a:00.0: Missing registers BAR, aborting

mlx5_core 0000:0a:00.0: error requesting BARs, aborting

mlx5_core 0000:0a:00.0: mlx5_pci_init failed with error code -19.

mstflint is working, though.

mstflint -d 0a:00.0 q gives the following output:

Image type: FS3

FW Version: 12.25.1020

FW Release Date: 30.4.2019

Product Version: 12.25.1020

Rom Info: type=UEFI version=14.18.19 cpu=AMD64,AARCH64

type=PXE version=3.5.701 cpu=AMD64

Description: UID GuidsNumber

Base GUID: 98039b030091bf42 4

Base MAC: 98039b91bf42 4

Image VSD: N/A

Device VSD: N/A

PSID: MT_2130110027

Security Attributes: N/A

If anyone has any idea, please let me know what to do in this case.

Hello Avinash,

Many thanks for posting your issue on the Mellanox Community.

Based on the information provided, I did a quick test in our lab, because the kernel you are using is not the default kernel that ships with Ubuntu 14.04.5 (which is 4.4.0-31-generic).

Based on your kernel version, I was able to successfully install the driver and had no issues when starting the driver.

Output:

ibv_devinfo && lsb_release -ra && uname -r

hca_id: mlx5_0

transport: InfiniBand (0)

fw_ver: 12.25.4062

node_guid: 248a:0703:003e:2f5a

sys_image_guid: 248a:0703:003e:2f5a

vendor_id: 0x02c9

vendor_part_id: 4115

hw_ver: 0x0

board_id: MT_2190110032

phys_port_cnt: 1

port: 1

state: PORT_DOWN (1)

max_mtu: 4096 (5)

active_mtu: 1024 (3)

sm_lid: 0

port_lid: 0

port_lmc: 0x00

link_layer: Ethernet

No LSB modules are available.

Distributor ID: Ubuntu

Description: Ubuntu 14.04.5 LTS

Release: 14.04

Codename: trusty

3.13.0-153-generic

We recommend doing a full uninstall of the driver and then reinstalling it. Also make sure that the header DEB for this kernel is installed.
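As a rough way to check the header requirement before reinstalling, the sketch below derives the Ubuntu header package name from a kernel release string and asks dpkg whether it is installed. The helper names are hypothetical. Note that the vermagic.h path quoted earlier in the thread refers to 3.13.0-144 while the running kernel is 3.13.0-153, so it may be worth confirming that the headers actually match `uname -r`.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers for checking kernel header packages.

# Prints the Ubuntu header package name for a kernel release string.
headers_pkg_for() {
    echo "linux-headers-$1"
}

# Succeeds if the header package for release $1 is installed.
# Relies on dpkg-query, so it only makes sense on Debian/Ubuntu hosts.
headers_installed() {
    status=$(dpkg-query -W -f='${Status}' "$(headers_pkg_for "$1")" 2>/dev/null) || return 1
    [ "$status" = "install ok installed" ]
}

# Usage on the affected host:
#   rel=$(uname -r)
#   headers_installed "$rel" || echo "missing: $(headers_pkg_for "$rel")"
#   ls -d "/lib/modules/$rel/build"   # should resolve to the matching header tree
```

If the package is reported missing, installing it with apt-get before rerunning the OFED installer should let the modules build against the running kernel.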

If you still experience the same issue after the reinstallation, we recommend looking into your BIOS settings, since the error involves BAR assignment on the PCIe bus.

Many thanks,
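To see what the firmware actually assigned, one option is to read the device's sysfs resource file, where each line holds the start, end and flags of one region in hex. The sketch below inspects the first line (BAR 0). The function name is a hypothetical helper, and the assumption that the "Missing registers BAR" message corresponds to BAR 0 not being a memory resource (IORESOURCE_MEM, 0x200) is an interpretation of the driver behaviour, not something confirmed in this thread.

```shell
#!/bin/sh
# Sketch only: hypothetical helper that classifies BAR 0 from a sysfs
# "resource" file supplied on stdin. Each line is "start end flags" in hex.

bar0_status() {
    # Read the first line, which describes BAR 0.
    read -r start end flags || { echo "unreadable"; return 2; }
    [ -n "$flags" ] || { echo "unreadable"; return 2; }

    # An all-zero range means no address window was assigned.
    if [ $(( $start )) -eq 0 ] && [ $(( $end )) -eq 0 ]; then
        echo "unassigned"
        return 1
    fi

    # 0x200 is IORESOURCE_MEM in the kernel's resource flags.
    if [ $(( $flags & 0x200 )) -eq 0 ]; then
        echo "not-memory"
        return 1
    fi

    echo "ok"
}

# Usage for the device from the dmesg lines in this thread:
#   bar0_status < /sys/bus/pci/devices/0000:0a:00.0/resource
```

A result of "unassigned" would point toward the BIOS or PCIe resource allocation rather than the driver installation; `lspci -vv -s 0a:00.0` shows the same regions in a more readable form.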

~Mellanox Technical Support