Ubuntu does not probe VFs on ConnectX-3 InfiniBand HCA

I have an Ubuntu 16 server with a single-port ConnectX-3 HCA. I have enabled SR-IOV with 8 VFs on the HCA and booted the kernel with ‘intel_iommu=on’. /etc/modprobe.d/mlx4.conf is configured to load and probe all eight VFs. The VFs are listed by lspci, but the system does not probe them or create the corresponding virtual interfaces. dmesg reports “Skipping virtual function” for every VF. All the instructions I have read for configuring VFs indicate that my configuration should probe the VFs. Can anyone point me towards what else needs to be configured to enable the VFs?

Below are the details of the system and HCA configuration:

uname -a

Linux cfmm-h2 4.4.0-98-generic #121-Ubuntu SMP Tue Oct 10 14:24:03 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

lspci | grep Mellanox

05:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]

05:00.1 Network controller: Mellanox Technologies MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function]

05:00.2 Network controller: Mellanox Technologies MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function]

05:00.3 Network controller: Mellanox Technologies MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function]

05:00.4 Network controller: Mellanox Technologies MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function]

05:00.5 Network controller: Mellanox Technologies MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function]

05:00.6 Network controller: Mellanox Technologies MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function]

05:00.7 Network controller: Mellanox Technologies MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function]

05:01.0 Network controller: Mellanox Technologies MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function]

cat /etc/modprobe.d/mlx4.conf

options mlx4_core num_vfs=8 probe_vf=8 port_type_array=1

# mlx4_core gets automatically loaded, load mlx4_en also (LP: #1115710)

softdep mlx4_core post: mlx4_en

mstflint -d 05:00.0 q

Image type: FS2

FW Version: 2.36.5000

Product Version: 02.36.50.00

Rom Info: type=PXE version=3.4.718 devid=4099

Device ID: 4099

Description: Node Port1 Port2 Sys image

GUIDs: 248a070300ba8e20 248a070300ba8e21 248a070300ba8e22 248a070300ba8e23

MACs: 248a07ba8e21 248a07ba8e22

VSD:

PSID: DEL1100001019

mstconfig -d 05:00.0 q


Device #1:


Device type: ConnectX3

PCI device: 05:00.0

Configurations: Current

SRIOV_EN 1

NUM_OF_VFS 8

LINK_TYPE_P1 3

LINK_TYPE_P2 3

LOG_BAR_SIZE 3

BOOT_PKEY_P1 0

BOOT_PKEY_P2 0

BOOT_OPTION_ROM_EN_P1 0

BOOT_VLAN_EN_P1 0

BOOT_RETRY_CNT_P1 0

LEGACY_BOOT_PROTOCOL_P1 0

BOOT_VLAN_P1 1

BOOT_OPTION_ROM_EN_P2 0

BOOT_VLAN_EN_P2 0

BOOT_RETRY_CNT_P2 0

LEGACY_BOOT_PROTOCOL_P2 0

BOOT_VLAN_P2 1

lspci -vv -s 05:00.0

05:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]

Subsystem: Mellanox Technologies MT27500 Family [ConnectX-3]

Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+

Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- SERR- <PERR- INTx-

Latency: 0, Cache Line Size: 32 bytes

Interrupt: pin A routed to IRQ 83

Region 0: Memory at 92400000 (64-bit, non-prefetchable) [size=1M]

Region 2: Memory at 38000800000 (64-bit, prefetchable) [size=8M]

Expansion ROM at [disabled]

Capabilities: [40] Power Management version 3

Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)

Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-

Capabilities: [48] Vital Product Data

Product Name: CX353A - ConnectX-3 QSFP

Read-only fields:

[PN] Part number: 079DJ3

[EC] Engineering changes: A03

[SN] Serial number: IL079DJ37403172G0033

[V0] Vendor specific: PCIe Gen3 x8

[RV] Reserved: checksum good, 0 byte(s) reserved

Read/write fields:

[V1] Vendor specific: N/A

[YA] Asset tag: N/A

[RW] Read-write area: 104 byte(s) free

[RW] Read-write area: 253 byte(s) free

[RW] Read-write area: 253 byte(s) free

[RW] Read-write area: 253 byte(s) free

[RW] Read-write area: 253 byte(s) free

[RW] Read-write area: 253 byte(s) free

[RW] Read-write area: 253 byte(s) free

[RW] Read-write area: 253 byte(s) free

[RW] Read-write area: 253 byte(s) free

[RW] Read-write area: 253 byte(s) free

[RW] Read-write area: 253 byte(s) free

[RW] Read-write area: 253 byte(s) free

[RW] Read-write area: 253 byte(s) free

[RW] Read-write area: 253 byte(s) free

[RW] Read-write area: 253 byte(s) free

[RW] Read-write area: 252 byte(s) free

End


Capabilities: [9c] MSI-X: Enable+ Count=128 Masked-

Vector table: BAR=0 offset=0007c000

PBA: BAR=0 offset=0007d000

Capabilities: [60] Express (v2) Endpoint, MSI 00

DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <64ns, L1 unlimited

ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+

DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported+

RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop- FLReset-

MaxPayload 256 bytes, MaxReadReq 4096 bytes

DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-

LnkCap: Port #8, Speed 8GT/s, Width x8, ASPM L0s, Exit Latency L0s unlimited, L1 unlimited

ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+

LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+

ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-

LnkSta: Speed 8GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-

DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported

DevCtl2: Completion Timeout: 65ms to 210ms, TimeoutDis-, LTR-, OBFF Disabled

LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-

Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-

Compliance De-emphasis: -6dB

LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+, EqualizationPhase1+

EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-

Capabilities: [c0] Vendor Specific Information: Len=18 <?>

Capabilities: [100 v1] Alternative Routing-ID Interpretation (ARI)

ARICap: MFVC- ACS-, Next Function: 0

ARICtl: MFVC- ACS-, Function Group: 0

Capabilities: [148 v1] Device Serial Number 24-8a-07-03-00-ba-8e-20

Capabilities: [154 v2] Advanced Error Reporting

UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-

UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt+ UnxCmplt+ RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-

UESvrt: DLP+ SDES- TLP+ FCP+ CmpltTO+ CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC+ UnsupReq- ACSViol-

CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+

CEMsk: RxErr+ BadTLP+ BadDLLP+ Rollover+ Timeout+ NonFatalErr+

AERCap: First Error Pointer: 00, GenCap+ CGenEn+ ChkCap+ ChkEn+

Capabilities: [18c v1] #19

Capabilities: [108 v1] Single Root I/O Virtualization (SR-IOV)

IOVCap: Migration-, Interrupt Message Number: 000

IOVCtl: Enable+ Migration- Interrupt- MSE+ ARIHierarchy+

IOVSta: Migration-

Initial VFs: 8, Total VFs: 8, Number of VFs: 8, Function Dependency Link: 00

VF offset: 1, stride: 1, Device ID: 1004

Supported Page Size: 000007ff, System Page Size: 00000001

Region 2: Memory at 0000038001000000 (64-bit, prefetchable)

VF Migration: offset: 00000000, BIR: 0

Kernel driver in use: mlx4_core

Kernel modules: mlx4_core

dmesg (edited for mlx4 related lines)

[ 0.000000] Command line: BOOT_IMAGE=/ROOT/ubuntu@/boot/vmlinuz-4.4.0-98-generic root=ZFS=rpool/ROOT/ubuntu ro swapaccount=1 intel_iommu=on

[ 3.540033] mlx4_core: Mellanox ConnectX core driver v2.2-1 (Feb, 2014)

[ 3.546709] mlx4_core: Initializing 0000:05:00.0

[ 9.511353] mlx4_core 0000:05:00.0: Enabling SR-IOV with 8 VFs

[ 9.620597] pci 0000:05:00.1: [15b3:1004] type 00 class 0x028000

[ 9.627586] pci 0000:05:00.1: Max Payload Size set to 256 (was 128, max 512)

[ 9.637048] iommu: Adding device 0000:05:00.1 to group 49

[ 9.643872] mlx4_core: Initializing 0000:05:00.1

[ 9.650716] mlx4_core 0000:05:00.1: enabling device (0000 -> 0002)

[ 9.658389] mlx4_core 0000:05:00.1: Skipping virtual function:1

[ 9.665756] pci 0000:05:00.2: [15b3:1004] type 00 class 0x028000

[ 9.672748] pci 0000:05:00.2: Max Payload Size set to 256 (was 128, max 512)

[ 9.682097] iommu: Adding device 0000:05:00.2 to group 50

[ 9.688952] mlx4_core: Initializing 0000:05:00.2

[ 9.695814] mlx4_core 0000:05:00.2: enabling device (0000 -> 0002)

[ 9.703545] mlx4_core 0000:05:00.2: Skipping virtual function:2

[ 9.710846] pci 0000:05:00.3: [15b3:1004] type 00 class 0x028000

[ 9.717871] pci 0000:05:00.3: Max Payload Size set to 256 (was 128, max 512)

[ 9.727364] iommu: Adding device 0000:05:00.3 to group 51

[ 9.734288] mlx4_core: Initializing 0000:05:00.3

[ 9.741154] mlx4_core 0000:05:00.3: enabling device (0000 -> 0002)

[ 9.748832] mlx4_core 0000:05:00.3: Skipping virtual function:3

[ 9.756277] pci 0000:05:00.4: [15b3:1004] type 00 class 0x028000

[ 9.763268] pci 0000:05:00.4: Max Payload Size set to 256 (was 128, max 512)

[ 9.772810] iommu: Adding device 0000:05:00.4 to group 52

[ 9.779724] mlx4_core: Initializing 0000:05:00.4

[ 9.786588] mlx4_core 0000:05:00.4: enabling device (0000 -> 0002)

[ 9.794256] mlx4_core 0000:05:00.4: Skipping virtual function:4

[ 9.801504] pci 0000:05:00.5: [15b3:1004] type 00 class 0x028000

[ 9.808499] pci 0000:05:00.5: Max Payload Size set to 256 (was 128, max 512)

[ 9.817815] iommu: Adding device 0000:05:00.5 to group 53

[ 9.824644] mlx4_core: Initializing 0000:05:00.5

[ 9.831359] mlx4_core 0000:05:00.5: enabling device (0000 -> 0002)

[ 9.838894] mlx4_core 0000:05:00.5: Skipping virtual function:5

[ 9.846081] pci 0000:05:00.6: [15b3:1004] type 00 class 0x028000

[ 9.853086] pci 0000:05:00.6: Max Payload Size set to 256 (was 128, max 512)

[ 9.862328] iommu: Adding device 0000:05:00.6 to group 54

[ 9.868962] mlx4_core: Initializing 0000:05:00.6

[ 9.875601] mlx4_core 0000:05:00.6: enabling device (0000 -> 0002)

[ 9.883002] mlx4_core 0000:05:00.6: Skipping virtual function:6

[ 9.890082] pci 0000:05:00.7: [15b3:1004] type 00 class 0x028000

[ 9.897070] pci 0000:05:00.7: Max Payload Size set to 256 (was 128, max 512)

[ 9.906218] iommu: Adding device 0000:05:00.7 to group 55

[ 9.912806] mlx4_core: Initializing 0000:05:00.7

[ 9.919380] mlx4_core 0000:05:00.7: enabling device (0000 -> 0002)

[ 9.926935] mlx4_core 0000:05:00.7: Skipping virtual function:7

[ 9.934160] pci 0000:05:01.0: [15b3:1004] type 00 class 0x028000

[ 9.941145] pci 0000:05:01.0: Max Payload Size set to 256 (was 128, max 512)

[ 9.950497] iommu: Adding device 0000:05:01.0 to group 56

[ 9.957354] mlx4_core: Initializing 0000:05:01.0

[ 9.964295] mlx4_core 0000:05:01.0: enabling device (0000 -> 0002)

[ 9.972187] mlx4_core 0000:05:01.0: Skipping virtual function:8

[ 9.979753] mlx4_core 0000:05:00.0: Running in master mode

[ 9.986885] mlx4_core 0000:05:00.0: PCIe link speed is 8.0GT/s, device supports 8.0GT/s

[ 9.994178] mlx4_core 0000:05:00.0: PCIe link width is x8, device supports x8

[ 10.178999] mlx4_core: Initializing 0000:05:00.1

[ 10.186463] mlx4_core 0000:05:00.1: enabling device (0000 -> 0002)

[ 10.194814] mlx4_core 0000:05:00.1: Skipping virtual function:1

[ 10.202620] mlx4_core: Initializing 0000:05:00.2

[ 10.210054] mlx4_core 0000:05:00.2: enabling device (0000 -> 0002)

[ 10.218256] mlx4_core 0000:05:00.2: Skipping virtual function:2

[ 10.225917] mlx4_core: Initializing 0000:05:00.3

[ 10.233189] mlx4_core 0000:05:00.3: enabling device (0000 -> 0002)

[ 10.241256] mlx4_core 0000:05:00.3: Skipping virtual function:3

[ 10.248961] mlx4_core: Initializing 0000:05:00.4

[ 10.256108] mlx4_core 0000:05:00.4: enabling device (0000 -> 0002)

[ 10.264085] mlx4_core 0000:05:00.4: Skipping virtual function:4

[ 10.271598] mlx4_core: Initializing 0000:05:00.5

[ 10.278675] mlx4_core 0000:05:00.5: enabling device (0000 -> 0002)


[ 10.286606] mlx4_core 0000:05:00.5: Skipping virtual function:5

[ 10.294012] mlx4_core: Initializing 0000:05:00.6

[ 10.301002] mlx4_core 0000:05:00.6: enabling device (0000 -> 0002)

[ 10.308782] mlx4_core 0000:05:00.6: Skipping virtual function:6

[ 10.316133] mlx4_core: Initializing 0000:05:00.7

[ 10.323060] mlx4_core 0000:05:00.7: enabling device (0000 -> 0002)

[ 10.330772] mlx4_core 0000:05:00.7: Skipping virtual function:7

[ 10.338002] mlx4_core: Initializing 0000:05:01.0

[ 10.344802] mlx4_core 0000:05:01.0: enabling device (0000 -> 0002)

[ 10.352404] mlx4_core 0000:05:01.0: Skipping virtual function:8

[ 10.366153] mlx4_en: Mellanox ConnectX HCA Ethernet driver v2.2-1 (Feb 2014)

[ 23.574588] <mlx4_ib> mlx4_ib_add: mlx4_ib: Mellanox ConnectX InfiniBand driver v2.2-1 (Feb 2014)

[ 23.585484] <mlx4_ib> mlx4_ib_add: counter index 0 for port 1 allocated 0

[ 23.664067] mlx4_core 0000:05:00.0: mlx4_ib: multi-function enabled

[ 23.679324] mlx4_core 0000:05:00.0: mlx4_ib: initializing demux service for 128 qp1 clients
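
Since the log above repeats the same message for every VF on two passes, a small filter makes the affected functions easier to see. This is only a sketch: the function name skipped_vfs is made up for illustration, and it assumes the message format matches the lines shown above.

```shell
#!/bin/sh
# Read kernel log text on stdin and print each PCI address for which
# mlx4_core logged "Skipping virtual function", once per address.
# (skipped_vfs is a hypothetical helper name, not part of any package.)
skipped_vfs() {
    sed -n 's/.*mlx4_core \([0-9a-f:.]*\): Skipping virtual function.*/\1/p' | sort -u
}
```

Running `dmesg | skipped_vfs` would then list one PCI address per skipped VF; an empty result suggests no VF was skipped.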


The issue appears to be that the module parameters need to be in /etc/modprobe.d/mlx4_core.conf and not /etc/modprobe.d/mlx4.conf. After moving the parameters into the correct file, the VFs are probed as expected on boot.

cat /etc/modprobe.d/mlx4_core.conf

options mlx4_core num_vfs=8 probe_vf=8 port_type_array=1
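
For anyone checking the same thing, here is a rough sketch for confirming which file supplies the mlx4_core options and which values the loaded module actually received. The function names are made up for illustration, and it assumes the parameters are readable under /sys/module/mlx4_core/parameters/ on your kernel.

```shell
#!/bin/sh
# Print every "options mlx4_core ..." line found in the .conf files of a
# modprobe.d directory, prefixed with the file it came from.
# (list_mlx4_options and show_loaded_params are hypothetical helper names.)
list_mlx4_options() {
    dir="${1:-/etc/modprobe.d}"
    grep -H -E '^[[:space:]]*options[[:space:]]+mlx4_core([[:space:]]|$)' \
        "$dir"/*.conf 2>/dev/null
}

# Print the values the loaded mlx4_core module reports for the three
# parameters used in this thread, or a note if a value cannot be read.
show_loaded_params() {
    for p in num_vfs probe_vf port_type_array; do
        f="/sys/module/mlx4_core/parameters/$p"
        if [ -r "$f" ]; then
            printf '%s=%s\n' "$p" "$(cat "$f")"
        else
            printf '%s=(not readable)\n' "$p"
        fi
    done
}
```

Calling list_mlx4_options and then show_loaded_params after a reboot shows whether the values in the file match what the driver was loaded with.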
