Host Driver Initialization error (FAIL)

Hello Mellanox Support,

I installed a fresh Ubuntu 16.04.02 LTS and the Mellanox OFED 16.04 Linux driver from the ISO file.

The installation went smoothly and completed successfully.

However, when I run hca_self_test.ofed, it reports:

“Host Driver Initialization FAILed”

Running sudo /etc/init.d/openibd restart also reports a driver loading error.

Please see the enclosed error screenshots and the syslog, as requested.

Please advise.

sysinfo-snapshot-v3.1.8-dynamicc4-20170513-1153.tgz (3.37 MB)

Hi Sophie,

Thank you for your kind support!

This happened when I installed the driver from the driver source code (install.pl). Although the installation procedure went smoothly, it probably did not load the drivers successfully, which is why the error was reported.

I did a fresh installation using the mlnxofedinstall script; it works and I can use the card now.

Please consider this case closed.

BTW, one more question: are the Linux drivers for InfiniBand provided by Mellanox in the package, or do they come from the Linux distribution?

I checked /lib/modules/kernels/drivers/infiniband/, and the dates on these kernel drivers are old (not the compilation date), so I assume they were not rebuilt from source?

Thank you

Mei Guodong

Hi Guodong,

Indeed, our Community website is here to assist our customers. If you would like to inquire further about our support contract options, please send an email to contracts@mellanox.com.

Regards,

Sophie.

Hello Sophie,

We recently purchased some ConnectX-3 boards through your distributor, but there is no service contract.

Could you elaborate on how your service contract works?

It was my understanding that the Support Community is here to support customers?

Thank you.

Mei Guodong

Hi Guodong,

Looking at the sysinfo-snapshot provided, none of our modules are loaded.

For example, on a system where they are loaded:

lsmod | egrep -i "ib|mlx*"

ib_ucm 22642 0

ib_ipoib 159750 0

ib_cm 52470 3 ib_ucm,rdma_cm,ib_ipoib

ib_uverbs 71505 2 rdma_ucm,ib_ucm

ib_umad 22283 6

mlx4_en 134317 0

mlx4_ib 193439 0

mlx4_core 353345 2 mlx4_en,mlx4_ib

mlx5_ib 188935 0

ib_core 250100 10 rdma_ucm,ib_ucm,rdma_cm,iw_cm,ib_ipoib,ib_cm,ib_uverbs,ib_umad,mlx4_ib,mlx5_ib

ipv6 361510 69 bridge,ip6t_REJECT,rdma_cm,ib_ipoib,ib_core

mlx5_core 547647 1 mlx5_ib

mlx_compat 17075 14 rdma_ucm,ib_ucm,rdma_cm,iw_cm,ib_ipoib,ib_cm,ib_uverbs,ib_umad,mlx4_en,mlx4_ib,mlx4_core,mlx5_ib,ib_core,mlx5_core

ptp 18580 3 mlx4_en,mlx5_core,igb

libahci 32073 1 ahci

libsas 84132 1 isci

scsi_transport_sas 40863 2 isci,libsas
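A quick way to see which of the expected modules are absent is to compare the first column of the lsmod output against a list of names. The sketch below is only an illustration: the function name missing_modules is hypothetical, and the module list passed to it is taken from the example output above.

```shell
#!/bin/sh
# Hypothetical helper: read lsmod-style output on stdin and print every
# module name given as an argument that is NOT present in that output.
missing_modules() {
    # The module name is the first column; drop the "Module ..." header line.
    loaded=$(awk 'NR > 1 || $1 != "Module" { print $1 }')
    for mod in "$@"; do
        # -x: whole-line match, -F: fixed string, -q: quiet
        printf '%s\n' "$loaded" | grep -qxF "$mod" || printf '%s\n' "$mod"
    done
}
```

Usage, assuming the module names from the listing above: lsmod | missing_modules mlx4_core mlx4_en mlx4_ib mlx5_core mlx5_ib ib_core. An empty result means all of them are loaded.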

The syslog file reports odd messages about the driver versions on loading, and the drivers then fail to load:

May 13 11:30:56 dynamicc4 openibd[908]: Loading Mellanox MLX4 HCA driver:#033[60G[#033[1;31mFAILED#033[0;39m]

May 13 11:30:56 dynamicc4 openibd[908]: Loading Mellanox MLX4_IB HCA driver:#033[60G[#033[1;31mFAILED#033[0;39m]

May 13 11:30:56 dynamicc4 openibd[908]: Loading Mellanox MLX4_EN HCA driver:#033[60G[#033[1;31mFAILED#033[0;39m]

May 13 11:30:56 dynamicc4 openibd[908]: Loading Mellanox MLX5 HCA driver:#033[60G[#033[1;31mFAILED#033[0;39m]

May 13 11:30:56 dynamicc4 openibd[908]: Loading Mellanox MLX5_IB HCA driver:#033[60G[#033[1;31mFAILED#033[0;39m]
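The #033[...] fragments in these lines look like terminal color codes from the openibd init script, with the escape character written out as #033 by the syslog daemon (an assumption based on the format, not something confirmed in the snapshot). A minimal sketch for making such lines easier to read follows; the function name strip_syslog_ansi is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: remove escaped ANSI sequences of the form
# "#033[<digits and semicolons><letter>" from syslog lines read on stdin.
strip_syslog_ansi() {
    sed -E 's/#033\[[0-9;]*[A-Za-z]//g'
}
```

Usage: grep openibd /var/log/syslog | strip_syslog_ansi. The first line above would then read "Loading Mellanox MLX4 HCA driver:[FAILED]" after the timestamp and host prefix.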

Can you validate the module versions?

For example:

modinfo mlx4_core | grep -i version

version: 4.0-2.0.0

srcversion: 8D664781D9FEAD80E98F82E

vermagic: 3.10.105-1.el6.elrepo.x86_64 SMP mod_unload modversions

Also, can you compare the srcversion reported by modinfo with that of the loaded module? They should be the same.

For example:

cat /sys/module/mlx4_core/srcversion

8D664781D9FEAD80E98F82E
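The comparison can be scripted per module. This is a sketch under assumptions: the function names are hypothetical, and it relies on modinfo -F srcversion printing only the srcversion field of the on-disk module. If the module is not loaded, /sys/module/<name>/srcversion does not exist and the result is "unknown".

```shell
#!/bin/sh
# Hypothetical helper: compare two srcversion strings.
# Prints "match", "mismatch", or "unknown" (when either value is empty).
compare_srcversion() {
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "unknown"
    elif [ "$1" = "$2" ]; then
        echo "match"
    else
        echo "mismatch"
    fi
}

# Hypothetical helper: compare the on-disk module (modinfo) with the
# loaded module (/sys/module) for the module name given as $1.
check_module_srcversion() {
    mod=$1
    on_disk=$(modinfo -F srcversion "$mod" 2>/dev/null)
    loaded=$(cat "/sys/module/$mod/srcversion" 2>/dev/null)
    printf '%s: %s\n' "$mod" "$(compare_srcversion "$on_disk" "$loaded")"
}
```

Usage: for m in mlx4_core mlx4_ib mlx4_en mlx5_core mlx5_ib; do check_module_srcversion "$m"; done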

I would also suggest verifying the content of your initramfs image (lsinitrd) and checking the modules and versions of our MLX drivers.
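To narrow an initramfs listing down to the relevant modules, the output can be filtered by file name. The sketch below is an illustration only: the function name list_mlx_modules and the file name pattern are assumptions. Note that lsinitrd comes with dracut; on Ubuntu the equivalent listing tool from initramfs-tools is lsinitramfs.

```shell
#!/bin/sh
# Hypothetical helper: from an initramfs file listing on stdin, keep only
# lines ending in an mlx*/ib_* kernel module (optionally compressed).
list_mlx_modules() {
    grep -E '/(mlx[0-9]+_[a-z]+|mlx_compat|ib_[a-z]+)\.ko(\.[a-z0-9]+)?$'
}
```

Usage on Ubuntu: lsinitramfs /boot/initrd.img-$(uname -r) | list_mlx_modules. The directory in each resulting path shows whether the image carries the inbox module or the OFED one.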

Make sure all Inbox drivers have been removed.

Sophie.

Hi Guodong,

Would you happen to have a service contract with Mellanox?

Thank you,

Sophie.

Hi Guodong,

You are very welcome.

The modules under /lib/modules//extra/mlnx-ofa_kernel/drivers/infiniband/core are provided by the Mellanox OFED driver and are the ones in use when you install our drivers.

The modules under /lib/modules/kernel/drivers/infiniband/core come from the Inbox driver (shipped with the OS). They are no longer used by the kernel, although they date from the initial installation of the OS/kernel.
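To see which of the two locations a given module resolves to, the file name reported by modinfo -n can be checked against the two directory patterns above. This is a sketch: the function name classify_module_path is hypothetical, and the patterns follow the paths quoted in this post, so other install layouts would need different patterns.

```shell
#!/bin/sh
# Hypothetical helper: classify a module file path as coming from the
# Mellanox OFED tree ("ofed"), the distribution kernel tree ("inbox"),
# or neither ("unknown").
classify_module_path() {
    case "$1" in
        */extra/mlnx-ofa_kernel/*) echo "ofed" ;;
        */kernel/drivers/*)        echo "inbox" ;;
        *)                         echo "unknown" ;;
    esac
}
```

Usage: classify_module_path "$(modinfo -n ib_core)". A result of "inbox" after installing the Mellanox package would suggest the distribution module is still the one being picked up.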

Regards,

Sophie.