No ethernet devices showing up under Ubuntu with inbox drivers

I’m having trouble with a ConnectX-3 card in a machine that was running Ubuntu 16.10 and was upgraded today to 17.04. The machine uses a Supermicro X8SIA-F motherboard. I have another machine with a more modern motherboard and a ConnectX-3 card that was also upgraded from 16.10 to 17.04 with the inbox drivers, and it works. On the older machine, no Ethernet devices show up for the ConnectX-3 card: neither ifconfig nor ip addr lists any interface for it.

The card seems to be detected OK (output from lspci):

05:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]

Output from lshw:

*-pci:1
     description: PCI bridge
     product: HIO524G2 PCI Express Gen2 Switch
     vendor: Integrated Device Technology, Inc. [IDT]
     physical id: 4
     bus info: pci@0000:03:04.0
     version: 02
     width: 32 bits
     clock: 33MHz
     capabilities: pci pciexpress pm msi normal_decode bus_master cap_list
     configuration: driver=pcieport
     resources: irq:35 memory:fb000000-fb1fffff
   *-network UNCLAIMED
        description: Network controller
        product: MT27500 Family [ConnectX-3]
        vendor: Mellanox Technologies
        physical id: 0
        bus info: pci@0000:05:00.0
        version: 00
        width: 64 bits
        clock: 33MHz
        capabilities: pm vpd msix pciexpress cap_list
        configuration: latency=0
        resources: memory:fb000000-fb0fffff memory:fb100000-fb1fffff

The notable thing in the lshw output is the “UNCLAIMED” flag after *-network.
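lshw marks a device UNCLAIMED when no kernel driver is bound to it. As a rough cross-check, the sketch below reads the driver symlink that sysfs exposes for a bound PCI device. The function name pci_driver_of and the optional second argument (an alternate sysfs root, handy for trying it out) are illustrative, not part of any tool.

```shell
#!/bin/sh
# Print the name of the kernel driver bound to a PCI device, or "none".
#   $1: full PCI address, e.g. 0000:05:00.0
#   $2: sysfs root (optional, defaults to /sys)
pci_driver_of() {
    dev_dir="${2:-/sys}/bus/pci/devices/$1"
    if [ -L "$dev_dir/driver" ]; then
        # The symlink target ends in the driver name, e.g. .../drivers/mlx4_core
        basename "$(readlink "$dev_dir/driver")"
    else
        echo "none"
    fi
}

# Usage:
#   pci_driver_of 0000:05:00.0
```

On a machine where the card is working, this would be expected to print mlx4_core; "none" matches the UNCLAIMED state shown by lshw.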

Necessary modules seem to be loaded:

$ sudo lsmod | grep mlx
mlx4_en               114688  0
ptp                    20480  2 e1000e,mlx4_en
mlx4_core             294912  1 mlx4_en
devlink                32768  2 mlx4_en,mlx4_core

I thought maybe I needed to set the port type by writing into /sys/bus/pci/devices/0000:05:00.0/, but I couldn’t create the mlx4_port1 or mlx4_port2 files even when logged in as root. I even tried runlevel 1, but no matter what I did, I got permission denied errors. On my other machine, those files exist by default. I’m not sure what to do at this point. I know it’s not a card issue because I swapped in a card from the working machine and the symptoms were the same.
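Files under /sys are attributes created by the kernel, so they cannot be created by hand, even as root. The mlx4_port1 and mlx4_port2 attributes are provided by mlx4_core once it has set up the device, which would explain why they exist on the working machine only. A minimal sketch for listing them follows; the function name show_port_types and the optional sysfs-root argument are illustrative.

```shell
#!/bin/sh
# List the mlx4 port-type attributes of a PCI device, if the driver exposes any.
#   $1: full PCI address, e.g. 0000:05:00.0
#   $2: sysfs root (optional, defaults to /sys)
show_port_types() {
    dev_dir="${2:-/sys}/bus/pci/devices/$1"
    found=0
    for f in "$dev_dir"/mlx4_port*; do
        [ -e "$f" ] || continue          # glob did not match anything
        found=1
        echo "$(basename "$f"): $(cat "$f")"
    done
    [ "$found" -eq 1 ] || echo "no mlx4_port attributes found"
}

# Usage:
#   show_port_types 0000:05:00.0
# When the attribute exists, the port type can be changed as root, e.g.:
#   echo eth > /sys/bus/pci/devices/0000:05:00.0/mlx4_port1
```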

If anyone knows what the possible issue might be, I’d appreciate any insight.

Yes, I’m using the inbox drivers. The following dmesg output shows the errors that occur while the kernel tries to configure the card. From searching online, Red Hat suggests enabling SR-IOV, which my board doesn’t support.

[0.314907] pci 0000:05:00.0: [15b3:1003] type 00 class 0x028000
[0.315202] pci 0000:05:00.0: reg 0x10: [mem 0xfb200000-0xfb2fffff 64bit]
[0.315415] pci 0000:05:00.0: reg 0x18: [mem 0xf4000000-0xf5ffffff 64bit pref]
[0.315811] pci 0000:05:00.0: reg 0x30: [mem 0xfb100000-0xfb1fffff pref]
[0.317451] pci 0000:05:00.0: reg 0x134: [mem 0x00000000-0x01ffffff 64bit pref]
[0.317453] pci 0000:05:00.0: VF(n) BAR2 space: [mem 0x00000000-0x1fffffff 64bit pref] (contains BAR2 for 16 VFs)
[0.386115] system 00:05: [io 0x164e-0x164f] has been reserved
[0.386121] system 00:05: Plug and Play ACPI device, IDs PNP0c02 (active)
[0.386768] pnp 00:0a: disabling [mem 0x000c0000-0x000cffff] because it overlaps 0000:05:00.0 BAR 9 [mem 0x00000000-0x1fffffff 64bit pref]
[0.386776] pnp 00:0a: disabling [mem 0x000e0000-0x000fffff] because it overlaps 0000:05:00.0 BAR 9 [mem 0x00000000-0x1fffffff 64bit pref]
[0.393375] pci 0000:05:00.0: BAR 9: no space for [mem size 0x20000000 64bit pref]
[0.393381] pci 0000:05:00.0: BAR 9: failed to assign [mem size 0x20000000 64bit pref]
[0.393668] pci 0000:05:00.0: BAR 2: no space for [mem size 0x02000000 64bit pref]
[0.393673] pci 0000:05:00.0: BAR 2: failed to assign [mem size 0x02000000 64bit pref]
[0.393679] pci 0000:05:00.0: BAR 9: no space for [mem size 0x20000000 64bit pref]
[0.393685] pci 0000:05:00.0: BAR 9: failed to assign [mem size 0x20000000 64bit pref]
[0.394022] pci 0000:05:00.0: BAR 2: no space for [mem size 0x02000000 64bit pref]
[0.394027] pci 0000:05:00.0: BAR 2: failed to assign [mem size 0x02000000 64bit pref]
[0.394033] pci 0000:05:00.0: BAR 9: no space for [mem size 0x20000000 64bit pref]
[0.394039] pci 0000:05:00.0: BAR 9: failed to assign [mem size 0x20000000 64bit pref]
[0.394045] pci 0000:05:00.0: BAR 0: assigned [mem 0xfb000000-0xfb0fffff 64bit]
[0.394184] pci 0000:05:00.0: BAR 6: assigned [mem 0xfb100000-0xfb1fffff pref]
[0.394607] pci 0000:05:00.0: res[9]=[mem size 0x00000000 64bit pref] res_to_dev_res add_size 20000000 min_align 0
[0.394609] pci 0000:05:00.0: BAR 2: no space for [mem size 0x02000000 64bit pref]
[0.394615] pci 0000:05:00.0: BAR 2: failed to assign [mem size 0x02000000 64bit pref]
[0.394621] pci 0000:05:00.0: BAR 9: no space for [mem size 0x20000000 64bit pref]
[0.394626] pci 0000:05:00.0: BAR 9: failed to assign [mem size 0x20000000 64bit pref]
[0.394632] pci 0000:05:00.0: BAR 0: assigned [mem 0xfb000000-0xfb0fffff 64bit]
[0.394770] pci 0000:05:00.0: BAR 6: assigned [mem 0xfb100000-0xfb1fffff pref]
[0.394776] pci 0000:05:00.0: BAR 2: no space for [mem size 0x02000000 64bit pref]
[0.394782] pci 0000:05:00.0: BAR 2: failed to assign [mem size 0x02000000 64bit pref]
[0.394788] pci 0000:05:00.0: BAR 0: assigned [mem 0xfb000000-0xfb0fffff 64bit]
[0.394931] pci 0000:05:00.0: BAR 6: assigned [mem 0xfb100000-0xfb1fffff pref]
[0.394937] pci 0000:05:00.0: BAR 9: no space for [mem size 0x20000000 64bit pref]
[0.394942] pci 0000:05:00.0: BAR 9: failed to assign [mem size 0x20000000 64bit pref]
[0.395112] pci_bus 0000:05: resource 1 [mem 0xfb000000-0xfb1fffff]
[6.898695] mlx4_core: Initializing 0000:05:00.0
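The log repeats the same failures many times, which makes it hard to see that two distinct regions are affected: BAR 9 (the 0x20000000-byte SR-IOV VF space) and BAR 2 (the 0x02000000-byte prefetchable region the firmware had originally placed at 0xf4000000). A small sketch that reduces dmesg text to the unique "failed to assign" messages follows; the function name bar_failures is illustrative.

```shell
#!/bin/sh
# Read kernel log text on stdin and print each distinct
# "failed to assign" message once, with the timestamp stripped.
bar_failures() {
    grep 'failed to assign' |
        sed 's/^\[[^]]*\] *//' |   # drop the leading "[timestamp] "
        sort -u
}

# Usage (dmesg may need root on systems that restrict it):
#   dmesg | bar_failures
```

Run against the log above, this should leave two lines, one for BAR 2 and one for BAR 9.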

Hi Josh,

There’s no official MOFED driver for Ubuntu 17.04; the latest supported Ubuntu version is 16.10.

Is it also the inbox driver?

How did you install the driver?

Can you send the output of ofed_info -s?

Can you show us the dmesg output so we can check whether any errors were printed while the driver was loading?

Marc

Based on that same RHEL article (“Mellanox Driver fails to load on 3.6.11.5-rt37.55.el6rt.x86_64”, Red Hat Customer Portal), I added pci=realloc=off to the kernel boot parameters, and things appear to be better:
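For anyone wanting to make the parameter persistent, here is a minimal sketch. It assumes the system boots through GRUB 2 with its defaults in /etc/default/grub, as a stock Ubuntu install does; the function name add_kernel_param is illustrative, and the post does not say how the parameter was actually added.

```shell
#!/bin/sh
# Append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT unless it is
# already there.
#   $1: parameter, e.g. pci=realloc=off
#   $2: GRUB defaults file (optional, defaults to /etc/default/grub)
add_kernel_param() {
    param="$1"
    file="${2:-/etc/default/grub}"
    # Nothing to do if the parameter is already on the line.
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*$param" "$file"; then
        return 0
    fi
    # Rewrite the quoted value with the parameter appended, via a temp file.
    sed "s|^GRUB_CMDLINE_LINUX_DEFAULT=\"\(.*\)\"|GRUB_CMDLINE_LINUX_DEFAULT=\"\1 $param\"|" \
        "$file" > "$file.new" && mv "$file.new" "$file"
}

# Usage (as root; keep a backup of /etc/default/grub first):
#   add_kernel_param pci=realloc=off
#   update-grub
#   reboot, then confirm with: cat /proc/cmdline
```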

[0.313593] pci 0000:04:00.0: [9005:0285] type 00 class 0x010400
[0.329375] pci 0000:05:00.0: [15b3:1003] type 00 class 0x028000
[0.329670] pci 0000:05:00.0: reg 0x10: [mem 0xfb200000-0xfb2fffff 64bit]
[0.329883] pci 0000:05:00.0: reg 0x18: [mem 0xf4000000-0xf5ffffff 64bit pref]
[0.330276] pci 0000:05:00.0: reg 0x30: [mem 0xfb100000-0xfb1fffff pref]
[0.331918] pci 0000:05:00.0: reg 0x134: [mem 0x00000000-0x01ffffff 64bit pref]
[0.331920] pci 0000:05:00.0: VF(n) BAR2 space: [mem 0x00000000-0x1fffffff 64bit pref] (contains BAR2 for 16 VFs)
[0.416549] system 00:05: [io 0x164e-0x164f] has been reserved
[0.416555] system 00:05: Plug and Play ACPI device, IDs PNP0c02 (active)
[0.417205] pnp 00:0a: disabling [mem 0x000c0000-0x000cffff] because it overlaps 0000:05:00.0 BAR 9 [mem 0x00000000-0x1fffffff 64bit pref]
[0.417213] pnp 00:0a: disabling [mem 0x000e0000-0x000fffff] because it overlaps 0000:05:00.0 BAR 9 [mem 0x00000000-0x1fffffff 64bit pref]
[0.423814] pci 0000:05:00.0: BAR 9: no space for [mem size 0x20000000 64bit pref]
[0.423820] pci 0000:05:00.0: BAR 9: failed to assign [mem size 0x20000000 64bit pref]
[0.424004] pci_bus 0000:05: resource 1 [mem 0xfb100000-0xfb2fffff]
[0.424005] pci_bus 0000:05: resource 2 [mem 0xf4000000-0xf5ffffff 64bit pref]
[6.468103] mlx4_core: Initializing 0000:05:00.0
[7.692569] hid-generic 0003:046D:C526.0005: input,hidraw4: USB HID v1.11 Mouse [Logitech USB Receiver] on usb-0000:00:1d.0-1.2/input0

[ 12.790631] mlx4_core 0000:05:00.0: PCIe BW is different than device's capability

[ 12.790632] mlx4_core 0000:05:00.0: PCIe link speed is 5.0GT/s, device supports 8.0GT/s

[ 12.790633] mlx4_core 0000:05:00.0: PCIe link width is x8, device supports x8

[ 19.291395] mlx4_en 0000:05:00.0: Activating port:1

[ 19.327353] mlx4_en: 0000:05:00.0: Port 1: Using 64 TX rings

[ 19.359889] mlx4_en: 0000:05:00.0: Port 1: Using 8 RX rings

[ 19.391439] mlx4_en: 0000:05:00.0: Port 1: frag:0 - size:1522 prefix:0 stride:1536

[ 19.423483] mlx4_en: 0000:05:00.0: Port 1: Initializing port

[ 19.787708] mlx4_en 0000:05:00.0: registered PHC clock

[ 19.824488] mlx4_en 0000:05:00.0: Activating port:2

[ 19.864175] mlx4_en: 0000:05:00.0: Port 2: Using 64 TX rings

[ 19.900519] mlx4_en: 0000:05:00.0: Port 2: Using 8 RX rings

[ 19.936003] mlx4_en: 0000:05:00.0: Port 2: frag:0 - size:1522 prefix:0 stride:1536

[ 19.971627] mlx4_en: 0000:05:00.0: Port 2: Initializing port

[ 20.274152] mlx4_core 0000:05:00.0 enp5s0: renamed from eth0

[ 20.361203] mlx4_core 0000:05:00.0 enp5s0d1: renamed from eth1
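The renamed interfaces in the log can be cross-checked against sysfs, where a PCI device with registered network interfaces has a net/ subdirectory naming them. A rough sketch, with the function name pci_netdevs and the optional sysfs-root argument being illustrative:

```shell
#!/bin/sh
# List the network interfaces that belong to a PCI device, or "none".
#   $1: full PCI address, e.g. 0000:05:00.0
#   $2: sysfs root (optional, defaults to /sys)
pci_netdevs() {
    net_dir="${2:-/sys}/bus/pci/devices/$1/net"
    if [ -d "$net_dir" ]; then
        ls "$net_dir"
    else
        echo "none"
    fi
}

# Usage:
#   pci_netdevs 0000:05:00.0
#   ip link show enp5s0
```

With the boot shown above, the first command would be expected to list enp5s0 and enp5s0d1.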

I’m hoping there is a more elegant solution.

Hi Josh,

Can you now see your device correctly via ifconfig or ip link show after boot?

Marc