Is this issue still relevant for you?
I had the same problem on a ConnectX-3 Pro. It was caused by the large number of VFs I had put into the HCA configuration file.
Symptoms (dmesg output):
Mar 31 17:36:17 macpro kernel: pci 0000:06:00.0: [15b3:1007] type 00 class 0x028000
Mar 31 17:36:17 macpro kernel: pci 0000:06:00.0: reg 0x10: [mem 0xf7900000-0xf79fffff 64bit]
Mar 31 17:36:17 macpro kernel: pci 0000:06:00.0: reg 0x18: [mem 0xf4000000-0xf47fffff 64bit pref]
Mar 31 17:36:17 macpro kernel: pci 0000:06:00.0: reg 0x30: [mem 0xf7800000-0xf78fffff pref]
Mar 31 17:36:17 macpro kernel: pci 0000:06:00.0: reg 0x134: [mem 0x00000000-0x007fffff 64bit pref]
Mar 31 17:36:17 macpro kernel: pci 0000:06:00.0: VF(n) BAR2 space: [mem 0x00000000-0x03ffffff 64bit pref] (contains BAR2 for 8 VFs)
Mar 31 17:36:17 macpro kernel: pci 0000:06:00.0: BAR 2: no space for [mem size 0x00800000 64bit pref]
Mar 31 17:36:17 macpro kernel: pci 0000:06:00.0: BAR 2: failed to assign [mem size 0x00800000 64bit pref]
…
Mar 31 21:09:48 macpro kernel: mlx4_core: Mellanox ConnectX core driver v4.3-1.0.1
Mar 31 21:09:48 macpro kernel: mlx4_core: Initializing 0000:06:00.0
Mar 31 21:09:48 macpro kernel: mlx4_core 0000:06:00.0: enabling device (0100 -> 0102)
Mar 31 21:09:48 macpro kernel: mlx4_core 0000:06:00.0: Missing UAR, aborting
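The sizes in the log appear consistent with the VF count being the problem: reg 0x134 reports 0x00800000 (8 MiB) per VF, and the VF(n) BAR2 range for 8 VFs spans 64 MiB. A minimal sketch of that arithmetic follows; `vf_bar_total` is a name made up for illustration, not part of any Mellanox tool.

```shell
# Total address space needed for VF BAR2 = number of VFs * per-VF BAR size.
# vf_bar_total is a hypothetical helper used only for this illustration.
vf_bar_total() {
    num_vfs=$1
    per_vf_size=$2      # e.g. 0x00800000, the size reported for reg 0x134
    printf '0x%08x\n' $(( $num_vfs * $per_vf_size ))
}

vf_bar_total 8 0x00800000   # 8 VFs, as in the log: prints 0x04000000 (64 MiB)
vf_bar_total 4 0x00800000   # 4 VFs: prints 0x02000000 (32 MiB)
```

Halving the VF count halves the prefetchable window the platform firmware has to find room for.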
Solution:
- Collect firmware information about your HCA; you need to know its PSID:
# mstflint -d 06:00.0 q full
Image type: FS2
FW Version: 2.42.8016
FW Release Date: 21.3.2018
MIC Version: 2.0.0
Config Sectors: 2
PRS Name: cx3pro_MCX354A_fdr_09v.prs
Rom Info: type=PXE version=3.4.752
Device ID: 4103
…
PSID: MT_1090111019
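If you want to script this step, the PSID can be pulled out of the query output. This is only a sketch that assumes the `PSID:` line looks like the one above; `extract_psid` is a hypothetical helper name.

```shell
# Print the PSID from `mstflint ... q full` output read on stdin.
# Assumes a line of the form "PSID: <value>" (any amount of spacing after the colon).
extract_psid() {
    awk -F': *' '$1 == "PSID" { print $2 }'
}

# Usage (device address taken from the example above):
#   mstflint -d 06:00.0 q full | extract_psid
```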
- Download the FW for your particular HCA and unpack it.
  - In my case it is ConnectX3Pro-rel-2_42_6000-web.tgz.
  - The archive contains a set of *.ini files for different HCAs based on the same chip.
  - Find the configuration file for your PSID:
# grep MT_1090111019 *.ini
MCX354A-FCC_Ax.ini:PSID = MT_1090111019
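The same lookup as a small function, in case you prefer an exact match on the PSID line rather than a substring match. It assumes the `PSID = <value>` layout shown in the grep output above; `find_ini_for_psid` is a hypothetical name.

```shell
# List the .ini files in a directory whose PSID line matches the given PSID exactly.
# $1 = PSID, $2 = directory with the unpacked *.ini files (defaults to the current one).
find_ini_for_psid() {
    psid=$1
    dir=${2:-.}
    grep -l -E "^PSID[[:space:]]*=[[:space:]]*${psid}[[:space:]]*\$" "$dir"/*.ini
}

# Usage:
#   find_ini_for_psid MT_1090111019 .
```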
- Dump HCA configuration file, compare it to the original configuration file:
# mstflint -d 06:00.0 dc current.ini
# diff -u MCX354A-FCC_Ax.ini current.ini
- Next, add a few parameters to the [HCA] section of the dumped file to disable SR-IOV and reduce the number of VFs to 4 (a safe value in my case). I saved the edited copy as current-sriov.ini:
[HCA]
hca_header_subsystem_id = 0x0003
hca_header_device_id = 0x1007
dpdp_en = true
eth_xfi_en = true
mdio_en_port1 = 0
num_pfs = 1
total_vfs = 4
sriov_en = false
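Before building the image it may be worth double-checking that the edited file really carries the values you intended. The sketch below reads a key from the [HCA] section only; `hca_value` is a hypothetical helper, and the `key = value` layout is assumed from the excerpt above.

```shell
# Print the value of a key inside the [HCA] section of an .ini file.
# $1 = key name, $2 = path to the .ini file.
hca_value() {
    key=$1
    file=$2
    awk -v key="$key" '
        /^\[/ { in_hca = ($0 ~ /^\[HCA\]/); next }   # track which section we are in
        in_hca {
            split($0, kv, "=")
            k = kv[1]; v = kv[2]
            gsub(/^[ \t]+|[ \t\r]+$/, "", k)          # trim whitespace around the key
            gsub(/^[ \t]+|[ \t\r]+$/, "", v)          # trim whitespace around the value
            if (k == key) print v
        }
    ' "$file"
}

# Usage:
#   hca_value total_vfs current-sriov.ini
#   hca_value sriov_en  current-sriov.ini
```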
- Create the new firmware image, which includes the new configuration file:
# mlxburn -fw ./fw-ConnectX3Pro-rel.mlx -c current-sriov.ini \
    -wrimage ./fw-ConnectX3Pro-rel-2_42_8016-MCX354A-FCC_Ax-FlexBoot-3.4.752.bin
- Burn it:
# mlxfwmanager -d /dev/mst/mt4103_pciconf0 -i fw-ConnectX3Pro-rel-2_42_8016-MCX354A-FCC_Ax-FlexBoot-3.4.752.bin -u
- Reboot and check that the mlx4_core driver now loads for your HCA.
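One rough way to do that check is to look for the error message from the symptoms in the kernel log. This is a sketch only: `mlx4_log_ok` is a hypothetical helper, and the absence of the message does not by itself prove the driver is fully working.

```shell
# Read kernel log text on stdin.
# Returns success only if no "Missing UAR, aborting" line from mlx4_core is present.
mlx4_log_ok() {
    ! grep -q 'mlx4_core.*Missing UAR, aborting'
}

# Usage after the reboot:
#   dmesg | mlx4_log_ok && echo "no UAR error from mlx4_core"
#   lsmod | grep mlx4_core      # the module should be listed
```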