Cannot create any VF on the second PF when setting NUM_OF_VFS >= 6

Hi experts, I have run into a problem similar to the thread "Cannot create VFs on one PF, but can create on another".

TL;DR

mstconfig -d 0000:85:00.0 set SRIOV_EN=1 NUM_OF_VFS=16
# reboot the server

# failed on the second PF
echo 1 > /sys/class/net/ens2f1np1/device/sriov_numvfs
-bash: echo: write error: Input/output error

# The error output of syslog
Jan 21 10:07:31 caas-c125 kernel: [ 1399.923145] pci 0000:85:02.2: [15b3:1018] type 7f class 0xffffff
Jan 21 10:07:31 caas-c125 kernel: [ 1399.923153] pci 0000:85:02.2: unknown header type 7f, ignoring device

Here are the detailed steps to reproduce and the environment.

Steps to Reproduce

Bad case

I set NUM_OF_VFS=16 in the firmware, then tried to create 1 VF on each PF.

# Set NUM_OF_VFS to 16. I believe this applies to both ports
mstconfig -d 0000:85:00.0 set SRIOV_EN=1 NUM_OF_VFS=16
# reboot the server
# Creating one VF on the first PF works
echo 1 > /sys/class/net/ens2f0np0/device/sriov_numvfs

# But it fails on the second PF
echo 1 > /sys/class/net/ens2f1np1/device/sriov_numvfs
-bash: echo: write error: Input/output error

root@caas-c125:~# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eno1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether a0:42:3f:46:f4:02 brd ff:ff:ff:ff:ff:ff
    altname enp9s0f0
3: eno2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether a0:42:3f:46:f4:03 brd ff:ff:ff:ff:ff:ff
    altname enp9s0f1
4: ens2f0np0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000
    link/ether 16:9e:10:68:51:3a brd ff:ff:ff:ff:ff:ff permaddr e8:eb:d3:c1:54:dc
    vf 0     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    altname enp133s0f0np0
5: ens2f1np1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000
    link/ether 16:9e:10:68:51:3a brd ff:ff:ff:ff:ff:ff permaddr e8:eb:d3:c1:54:dd
    altname enp133s0f1np1
6: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 16:9e:10:68:51:3a brd ff:ff:ff:ff:ff:ff
7: ens2f0v0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether a6:fe:59:f6:dd:c0 brd ff:ff:ff:ff:ff:ff permaddr e6:92:42:60:17:ac
    altname enp133s0f0v0

Output of syslog:

### OUTPUT of creating VF on ens2f0np0 (successful)
Jan 21 10:06:46 caas-c125 kernel: [ 1355.842270] mlx5_core 0000:85:00.0: E-Switch: Enable: mode(LEGACY), nvfs(1), necvfs(0), active vports(2)
Jan 21 10:06:47 caas-c125 kernel: [ 1355.947206] pci 0000:85:00.2: [15b3:1018] type 00 class 0x020000
Jan 21 10:06:47 caas-c125 kernel: [ 1355.947294] pci 0000:85:00.2: enabling Extended Tags
Jan 21 10:06:47 caas-c125 kernel: [ 1355.948071] pci 0000:85:00.2: Adding to iommu group 15
Jan 21 10:06:47 caas-c125 kernel: [ 1355.948381] mlx5_core 0000:85:00.2: enabling device (0000 -> 0002)
Jan 21 10:06:47 caas-c125 kernel: [ 1355.948487] mlx5_core 0000:85:00.2: firmware version: 16.35.3502
Jan 21 10:06:47 caas-c125 kernel: [ 1356.109646] mlx5_core 0000:85:00.2: Rate limit: 127 rates are supported, range: 0Mbps to 24414Mbps
Jan 21 10:06:47 caas-c125 kernel: [ 1356.123313] mlx5_core 0000:85:00.2: Assigned random MAC address e6:92:42:60:17:ac
Jan 21 10:06:47 caas-c125 kernel: [ 1356.123565] mlx5_core 0000:85:00.2: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0 basic)
Jan 21 10:06:47 caas-c125 systemd-udevd[5069]: Using default interface naming scheme 'v249'.
Jan 21 10:06:47 caas-c125 networkd-dispatcher[2296]: WARNING:Unknown index 7 seen, reloading interface list
Jan 21 10:06:47 caas-c125 kernel: [ 1356.253520] mlx5_core 0000:85:00.2 ens2f0v0: renamed from eth0
Jan 21 10:06:47 caas-c125 systemd-networkd[2252]: eth0: Interface name change detected, renamed to ens2f0v0.
Jan 21 10:06:47 caas-c125 systemd-udevd[5090]: Using default interface naming scheme 'v249'.


### OUTPUT of creating VF on ens2f1np1 (failed)
Jan 21 10:07:30 caas-c125 kernel: [ 1399.815921] mlx5_core 0000:85:00.1: E-Switch: Enable: mode(LEGACY), nvfs(1), necvfs(0), active vports(2)
Jan 21 10:07:31 caas-c125 kernel: [ 1399.923145] pci 0000:85:02.2: [15b3:1018] type 7f class 0xffffff
Jan 21 10:07:31 caas-c125 kernel: [ 1399.923153] pci 0000:85:02.2: unknown header type 7f, ignoring device
Jan 21 10:07:32 caas-c125 kernel: [ 1400.942993] mlx5_core 0000:85:00.1: mlx5_sriov_enable:230:(pid 4693): pci_enable_sriov failed : -5
Jan 21 10:07:32 caas-c125 kernel: [ 1400.943208] mlx5_core 0000:85:00.1: E-Switch: Unload vfs: mode(LEGACY), nvfs(1), necvfs(0), active vports(2)
Jan 21 10:07:32 caas-c125 kernel: [ 1400.959055] mlx5_core 0000:85:00.1: E-Switch: Disable: mode(LEGACY), nvfs(1), necvfs(0), active vports(1)

Good case

I set NUM_OF_VFS=5 in the firmware (the only difference from the bad case), then created 1 VF on each PF.

# Set NUM_OF_VFS to 5. I believe this applies to both ports
mstconfig -d 0000:85:00.0 set SRIOV_EN=1 NUM_OF_VFS=5
# reboot the server
# Creating one VF on the first PF works
echo 1 > /sys/class/net/ens2f0np0/device/sriov_numvfs

# Also works on the second PF
echo 1 > /sys/class/net/ens2f1np1/device/sriov_numvfs

Output of syslog:

Jan 21 10:14:38 caas-c125 kernel: [   34.885775] mlx5_core 0000:85:00.0: E-Switch: Enable: mode(LEGACY), nvfs(1), necvfs(0), active vports(2)
Jan 21 10:14:38 caas-c125 kernel: [   34.991295] pci 0000:85:00.2: [15b3:1018] type 00 class 0x020000
Jan 21 10:14:38 caas-c125 kernel: [   34.991383] pci 0000:85:00.2: enabling Extended Tags
Jan 21 10:14:38 caas-c125 kernel: [   34.992185] pci 0000:85:00.2: Adding to iommu group 15
Jan 21 10:14:38 caas-c125 kernel: [   34.992560] mlx5_core 0000:85:00.2: enabling device (0000 -> 0002)
Jan 21 10:14:38 caas-c125 kernel: [   34.992657] mlx5_core 0000:85:00.2: firmware version: 16.35.3502
Jan 21 10:14:38 caas-c125 kernel: [   35.151678] mlx5_core 0000:85:00.2: Rate limit: 127 rates are supported, range: 0Mbps to 24414Mbps
Jan 21 10:14:38 caas-c125 kernel: [   35.165223] mlx5_core 0000:85:00.2: Assigned random MAC address 66:3c:f5:e4:c8:0e
Jan 21 10:14:38 caas-c125 kernel: [   35.165450] mlx5_core 0000:85:00.2: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0 basic)
Jan 21 10:14:38 caas-c125 systemd-udevd[3097]: Using default interface naming scheme 'v249'.
Jan 21 10:14:38 caas-c125 networkd-dispatcher[2290]: WARNING:Unknown index 7 seen, reloading interface list
Jan 21 10:14:38 caas-c125 kernel: [   35.296370] mlx5_core 0000:85:00.2 ens2f0v0: renamed from eth0
Jan 21 10:14:38 caas-c125 systemd-networkd[2248]: eth0: Interface name change detected, renamed to ens2f0v0.
Jan 21 10:14:38 caas-c125 systemd-udevd[3130]: Using default interface naming scheme 'v249'.

### OUTPUT of creating VF on ens2f1np1 (successful)

Jan 21 10:14:46 caas-c125 kernel: [   42.903207] mlx5_core 0000:85:00.1: E-Switch: Enable: mode(LEGACY), nvfs(1), necvfs(0), active vports(2)
Jan 21 10:14:46 caas-c125 kernel: [   43.011192] pci 0000:85:00.7: [15b3:1018] type 00 class 0x020000
Jan 21 10:14:46 caas-c125 kernel: [   43.011285] pci 0000:85:00.7: enabling Extended Tags
Jan 21 10:14:46 caas-c125 kernel: [   43.012113] pci 0000:85:00.7: Adding to iommu group 15
Jan 21 10:14:46 caas-c125 kernel: [   43.012357] mlx5_core 0000:85:00.7: enabling device (0000 -> 0002)
Jan 21 10:14:46 caas-c125 kernel: [   43.012452] mlx5_core 0000:85:00.7: firmware version: 16.35.3502
Jan 21 10:14:46 caas-c125 kernel: [   43.179959] mlx5_core 0000:85:00.7: Rate limit: 127 rates are supported, range: 0Mbps to 24414Mbps
Jan 21 10:14:46 caas-c125 kernel: [   43.194179] mlx5_core 0000:85:00.7: Assigned random MAC address 2e:28:d8:a2:2e:55
Jan 21 10:14:46 caas-c125 kernel: [   43.645875] mlx5_core 0000:85:00.7: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0 basic)
Jan 21 10:14:48 caas-c125 systemd-udevd[4854]: Using default interface naming scheme 'v249'.
Jan 21 10:14:48 caas-c125 networkd-dispatcher[2290]: WARNING:Unknown index 8 seen, reloading interface list
Jan 21 10:14:48 caas-c125 kernel: [   44.982858] mlx5_core 0000:85:00.7 ens2f1v0: renamed from eth0
Jan 21 10:14:48 caas-c125 systemd-networkd[2248]: eth0: Interface name change detected, renamed to ens2f1v0.
Jan 21 10:14:50 caas-c125 systemd[1]: systemd-timedated.service: Deactivated successfully.



Environment

  • CPU: AMD EPYC 7T83 64-Core Processor
  • Motherboard: https://www.tyan.com/ZH_Motherboards_S8036_S8036GM2NE
  • Memory: SAMSUNG 36ASF8G72PZ-3G2E1 64GB x 8
  • NIC: ConnectX-5 EN ; 10/25GbE dual-port SFP28; PCIe3.0 x8; Model: MCX512A-ACAT
  • Also installed: 12 NVMe SSDs, which may not be related to the issue.

System:
Ubuntu 22.04 server
Driver version: installed mlnx-en-23.10-1.1.9.0-ubuntu22.04-x86_64

# uname -a
Linux caas-c125 5.15.0-72-generic #79-Ubuntu SMP Wed Apr 19 08:22:18 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-5.15.0-72-generic root=UUID=8b8212f5-f6f8-443e-b31b-4416f82b2c6d ro systemd.unified_cgroup_hierarchy=0 intel_iommu=on amd_iommu=on iommu=pt default_hugepagesz=20MB hugepagesz=20MB nr_hugepages=512


# FW Version
----------

  Device Type:      ConnectX5
  Part Number:      MCX512A-ACA_Ax_Bx
  Description:      ConnectX-5 EN network interface card; 10/25GbE dual-port SFP28; PCIe3.0 x8; tall bracket; ROHS R6
  PSID:             MT_0000000080
  PCI Device Name:  85:00.0
  Base GUID:        e8ebd30300c154dc
  Base MAC:         e8ebd3c154dc
  Versions:         Current        Available
     FW             16.35.3502     16.35.3006
     PXE            3.6.0902       3.6.0902
     UEFI           14.29.0015     14.29.0015

  Status:           Up to date

# ARI is enabled (I guess)
root@caas-c125:~# cat /sys/class/net/ens2f1np1/device/ari_enabled
1
root@caas-c125:~# cat /sys/class/net/ens2f0np0/device/ari_enabled
1
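
The ari_enabled check above can be extended to the other per-PF SR-IOV attributes in the same sysfs directory. The following is a minimal sketch, not something from my original test run: the function name sriov_summary is made up for illustration, and it assumes the kernel exposes sriov_offset and sriov_stride next to sriov_totalvfs and sriov_numvfs (missing files are reported rather than treated as errors).

```shell
# Print the SR-IOV related attributes of one PF.
# Argument: the PF's sysfs device directory,
#           e.g. /sys/class/net/ens2f1np1/device
# sriov_summary is a hypothetical helper name, not part of any tool.
sriov_summary() {
    dev=$1
    for attr in sriov_totalvfs sriov_numvfs sriov_offset sriov_stride ari_enabled; do
        if [ -r "$dev/$attr" ]; then
            # Attribute exists: print it as name=value
            printf '%s=%s\n' "$attr" "$(cat "$dev/$attr")"
        else
            # Older kernels or non-SR-IOV devices may lack some files
            printf '%s=<not available>\n' "$attr"
        fi
    done
}

# Example usage on the two PFs from this report:
#   sriov_summary /sys/class/net/ens2f0np0/device
#   sriov_summary /sys/class/net/ens2f1np1/device
```

Comparing the two PFs side by side makes it easier to see whether they differ in anything other than the VF offset.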


root@caas-c125:~# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eno1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether a0:42:3f:46:f4:02 brd ff:ff:ff:ff:ff:ff
    altname enp9s0f0
3: eno2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether a0:42:3f:46:f4:03 brd ff:ff:ff:ff:ff:ff
    altname enp9s0f1
4: ens2f0np0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000
    link/ether 16:9e:10:68:51:3a brd ff:ff:ff:ff:ff:ff permaddr e8:eb:d3:c1:54:dc
    altname enp133s0f0np0
5: ens2f1np1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000
    link/ether 16:9e:10:68:51:3a brd ff:ff:ff:ff:ff:ff permaddr e8:eb:d3:c1:54:dd
    altname enp133s0f1np1
6: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 16:9e:10:68:51:3a brd ff:ff:ff:ff:ff:ff



root@caas-c125:~# mstconfig query

Device #1:
----------

Device type:    ConnectX5
Name:           MCX512A-ACA_Ax_Bx
Description:    ConnectX-5 EN network interface card; 10/25GbE dual-port SFP28; PCIe3.0 x8; tall bracket; ROHS R6
Device:         /sys/bus/pci/devices/0000:85:00.0/config

Configurations:                              Next Boot
         MEMIC_BAR_SIZE                      0
         MEMIC_SIZE_LIMIT                    _256KB(1)
         HOST_CHAINING_MODE                  DISABLED(0)
         HOST_CHAINING_CACHE_DISABLE         False(0)
         HOST_CHAINING_DESCRIPTORS           Array[0..7]
         HOST_CHAINING_TOTAL_BUFFER_SIZE     Array[0..7]
         FLEX_PARSER_PROFILE_ENABLE          0
         FLEX_IPV4_OVER_VXLAN_PORT           0
         ROCE_NEXT_PROTOCOL                  254
         ESWITCH_HAIRPIN_DESCRIPTORS         Array[0..7]
         ESWITCH_HAIRPIN_TOT_BUFFER_SIZE     Array[0..7]
         PF_BAR2_SIZE                        0
         NON_PREFETCHABLE_PF_BAR             False(0)
         VF_VPD_ENABLE                       False(0)
         PER_PF_NUM_SF                       False(0)
         STRICT_VF_MSIX_NUM                  False(0)
         VF_NODNIC_ENABLE                    False(0)
         NUM_OF_VFS                          16
         PF_BAR2_ENABLE                      False(0)
         SRIOV_EN                            True(1)
         PF_LOG_BAR_SIZE                     6
         VF_LOG_BAR_SIZE                     0
         NUM_PF_MSIX                         63
         NUM_VF_MSIX                         11
         INT_LOG_MAX_PAYLOAD_SIZE            AUTOMATIC(0)
         PCIE_CREDIT_TOKEN_TIMEOUT           0
         ACCURATE_TX_SCHEDULER               False(0)
         PARTIAL_RESET_EN                    False(0)
         SW_RECOVERY_ON_ERRORS               False(0)
         RESET_WITH_HOST_ON_ERRORS           False(0)
         ADVANCED_POWER_SETTINGS             False(0)
         CQE_COMPRESSION                     BALANCED(0)
         IP_OVER_VXLAN_EN                    False(0)
         MKEY_BY_NAME                        False(0)
         ESWITCH_IPV4_TTL_MODIFY_ENABLE      False(0)
         PRIO_TAG_REQUIRED_EN                False(0)
         UCTX_EN                             True(1)
         PCI_ATOMIC_MODE                     PCI_ATOMIC_DISABLED_EXT_ATOMIC_ENABLED(0)
         TUNNEL_ECN_COPY_DISABLE             False(0)
         LRO_LOG_TIMEOUT0                    6
         LRO_LOG_TIMEOUT1                    7
         LRO_LOG_TIMEOUT2                    8
         LRO_LOG_TIMEOUT3                    13
         LOG_TX_PSN_WINDOW                   7
         LOG_MAX_OUTSTANDING_WQE             7
         TX_SCHEDULER_BURST                  0
         ZERO_TOUCH_TUNING_ENABLE            False(0)
         LOG_DCR_HASH_TABLE_SIZE             11
         DCR_LIFO_SIZE                       16384
         ROCE_CC_PRIO_MASK_P1                255
         ROCE_CC_PRIO_MASK_P2                255
         CLAMP_TGT_RATE_AFTER_TIME_INC_P1    True(1)
         CLAMP_TGT_RATE_P1                   False(0)
         RPG_TIME_RESET_P1                   300
         RPG_BYTE_RESET_P1                   32767
         RPG_THRESHOLD_P1                    1
         RPG_MAX_RATE_P1                     0
         RPG_AI_RATE_P1                      5
         RPG_HAI_RATE_P1                     50
         RPG_GD_P1                           11
         RPG_MIN_DEC_FAC_P1                  50
         RPG_MIN_RATE_P1                     1
         RATE_TO_SET_ON_FIRST_CNP_P1         0
         DCE_TCP_G_P1                        1019
         DCE_TCP_RTT_P1                      1
         RATE_REDUCE_MONITOR_PERIOD_P1       4
         INITIAL_ALPHA_VALUE_P1              1023
         MIN_TIME_BETWEEN_CNPS_P1            4
         CNP_802P_PRIO_P1                    6
         CNP_DSCP_P1                         48
         CLAMP_TGT_RATE_AFTER_TIME_INC_P2    True(1)
         CLAMP_TGT_RATE_P2                   False(0)
         RPG_TIME_RESET_P2                   300
         RPG_BYTE_RESET_P2                   32767
         RPG_THRESHOLD_P2                    1
         RPG_MAX_RATE_P2                     0
         RPG_AI_RATE_P2                      5
         RPG_HAI_RATE_P2                     50
         RPG_GD_P2                           11
         RPG_MIN_DEC_FAC_P2                  50
         RPG_MIN_RATE_P2                     1
         RATE_TO_SET_ON_FIRST_CNP_P2         0
         DCE_TCP_G_P2                        1019
         DCE_TCP_RTT_P2                      1
         RATE_REDUCE_MONITOR_PERIOD_P2       4
         INITIAL_ALPHA_VALUE_P2              1023
         MIN_TIME_BETWEEN_CNPS_P2            4
         CNP_802P_PRIO_P2                    6
         CNP_DSCP_P2                         48
         LLDP_NB_DCBX_P1                     False(0)
         LLDP_NB_RX_MODE_P1                  OFF(0)
         LLDP_NB_TX_MODE_P1                  OFF(0)
         LLDP_NB_DCBX_P2                     False(0)
         LLDP_NB_RX_MODE_P2                  OFF(0)
         LLDP_NB_TX_MODE_P2                  OFF(0)
         DCBX_IEEE_P1                        True(1)
         DCBX_CEE_P1                         True(1)
         DCBX_WILLING_P1                     True(1)
         DCBX_IEEE_P2                        True(1)
         DCBX_CEE_P2                         True(1)
         DCBX_WILLING_P2                     True(1)
         KEEP_ETH_LINK_UP_P1                 True(1)
         KEEP_IB_LINK_UP_P1                  False(0)
         KEEP_LINK_UP_ON_BOOT_P1             False(0)
         KEEP_LINK_UP_ON_STANDBY_P1          False(0)
         DO_NOT_CLEAR_PORT_STATS_P1          False(0)
         AUTO_POWER_SAVE_LINK_DOWN_P1        False(0)
         KEEP_ETH_LINK_UP_P2                 True(1)
         KEEP_IB_LINK_UP_P2                  False(0)
         KEEP_LINK_UP_ON_BOOT_P2             False(0)
         KEEP_LINK_UP_ON_STANDBY_P2          False(0)
         DO_NOT_CLEAR_PORT_STATS_P2          False(0)
         AUTO_POWER_SAVE_LINK_DOWN_P2        False(0)
         NUM_OF_VL_P1                        _4_VLs(3)
         NUM_OF_TC_P1                        _8_TCs(0)
         NUM_OF_PFC_P1                       8
         VL15_BUFFER_SIZE_P1                 0
         NUM_OF_VL_P2                        _4_VLs(3)
         NUM_OF_TC_P2                        _8_TCs(0)
         NUM_OF_PFC_P2                       8
         VL15_BUFFER_SIZE_P2                 0
         DUP_MAC_ACTION_P1                   LAST_CFG(0)
         UNKNOWN_UPLINK_MAC_FLOOD_P1         False(0)
         SRIOV_IB_ROUTING_MODE_P1            LID(1)
         IB_ROUTING_MODE_P1                  LID(1)
         DUP_MAC_ACTION_P2                   LAST_CFG(0)
         UNKNOWN_UPLINK_MAC_FLOOD_P2         False(0)
         SRIOV_IB_ROUTING_MODE_P2            LID(1)
         IB_ROUTING_MODE_P2                  LID(1)
         PF_TOTAL_SF                         0
         PF_SF_BAR_SIZE                      0
         PCI_WR_ORDERING                     per_mkey(0)
         MULTI_PORT_VHCA_EN                  False(0)
         PORT_OWNER                          True(1)
         ALLOW_RD_COUNTERS                   True(1)
         RENEG_ON_CHANGE                     True(1)
         TRACER_ENABLE                       True(1)
         IP_VER                              IPv4(0)
         UEFI_HII_EN                         True(1)
         BOOT_DBG_LOG                        False(0)
         UEFI_LOGS                           DISABLED(0)
         BOOT_INTERRUPT_DIS                  False(0)
         BOOT_LACP_DIS                       True(1)
         P2P_ORDERING_MODE                   DEVICE_DEFAULT(0)
         ATS_ENABLED                         False(0)
         DYNAMIC_VF_MSIX_TABLE               False(0)
         EXP_ROM_UEFI_x86_ENABLE             False(0)
         EXP_ROM_PXE_ENABLE                  True(1)
         ADVANCED_PCI_SETTINGS               False(0)
         SAFE_MODE_THRESHOLD                 10
         SAFE_MODE_ENABLE                    True(1)




# lspci -vvk
85:00.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]
        Subsystem: Mellanox Technologies ConnectX®-5 EN network interface card, 10/25GbE dual-port SFP28, PCIe3.0 x8, tall bracket ; MCX512A-ACAT
        Physical Slot: 2
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 1816
        NUMA node: 1
        IOMMU group: 15
        Region 0: Memory at 20048000000 (64-bit, prefetchable) [size=64M]
        Expansion ROM at b6900000 [disabled] [size=1M]
        Capabilities: [60] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 75.000W
                DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq-
                        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
                        MaxPayload 512 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend-
                LnkCap: Port #0, Speed 8GT/s, Width x8, ASPM not supported
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 8GT/s (ok), Width x8 (ok)
                        TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range ABC, TimeoutDis+ NROPrPrP- LTR-
                         10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
                         EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
                         FRS- TPHComp- ExtTPHComp-
                         AtomicOpsCap: 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled,
                         AtomicOpsCtl: ReqEn+
                LnkCap2: Supported Link Speeds: 2.5-8GT/s, Crosslink- Retimer- 2Retimers- DRS-
                LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+ EqualizationPhase1+
                         EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
                         Retimer- 2Retimers- CrosslinkRes: unsupported
        Capabilities: [48] Vital Product Data
                Product Name: CX512A - ConnectX-5 SFP28
                Read-only fields:
                        [PN] Part number: MCX512A-ACAT
                        [EC] Engineering changes: B7
                        [V2] Vendor specific: MCX512A-ACAT
                        [SN] Serial number: MT2231419526
                        [V3] Vendor specific: badfbdaa5b08ed118000e8ebd3c154dc
                        [VA] Vendor specific: MLX:MODL=CX512A:MN=MLNX:CSKU=V2:UUID=V3:PCI=V0
                        [V0] Vendor specific: PCIeGen3 x8
                        [VU] Vendor specific: MT2231419526MLNXS0D0F0
                        [RV] Reserved: checksum good, 0 byte(s) reserved
                End
        Capabilities: [9c] MSI-X: Enable+ Count=64 Masked-
                Vector table: BAR=0 offset=00002000
                PBA: BAR=0 offset=00003000
        Capabilities: [c0] Vendor Specific Information: Len=18 <?>
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0-,D1-,D2-,D3hot-,D3cold+)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [100 v1] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
                UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO+ CmpltAbrt- UnxCmplt+ RxOF+ MalfTLP+ ECRC+ UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
                AERCap: First Error Pointer: 08, ECRCGenCap+ ECRCGenEn+ ECRCChkCap+ ECRCChkEn+
                        MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
                HeaderLog: 00000000 00000000 00000000 00000000
        Capabilities: [150 v1] Alternative Routing-ID Interpretation (ARI)
                ARICap: MFVC- ACS-, Next Function: 1
                ARICtl: MFVC- ACS-, Function Group: 0
        Capabilities: [180 v1] Single Root I/O Virtualization (SR-IOV)
                IOVCap: Migration-, Interrupt Message Number: 000
                IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy+
                IOVSta: Migration-
                Initial VFs: 16, Total VFs: 16, Number of VFs: 0, Function Dependency Link: 00
                VF offset: 2, stride: 1, Device ID: 1018
                Supported Page Size: 000007ff, System Page Size: 00000001
                Region 0: Memory at 000002004d000000 (64-bit, prefetchable)
                VF Migration: offset: 00000000, BIR: 0
        Capabilities: [1c0 v1] Secondary PCI Express
                LnkCtl3: LnkEquIntrruptEn- PerformEqu-
                LaneErrStat: 0
        Capabilities: [230 v1] Access Control Services
                ACSCap: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
                ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
        Kernel driver in use: mlx5_core
        Kernel modules: mlx5_core

85:00.1 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]
        Subsystem: Mellanox Technologies ConnectX®-5 EN network interface card, 10/25GbE dual-port SFP28, PCIe3.0 x8, tall bracket ; MCX512A-ACAT
        Physical Slot: 2
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin B routed to IRQ 1881
        NUMA node: 1
        IOMMU group: 15
        Region 0: Memory at 20044000000 (64-bit, prefetchable) [size=64M]
        Expansion ROM at b6800000 [disabled] [size=1M]
        Capabilities: [60] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 75.000W
                DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq-
                        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
                        MaxPayload 512 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend-
                LnkCap: Port #0, Speed 8GT/s, Width x8, ASPM not supported
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 8GT/s (ok), Width x8 (ok)
                        TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range ABC, TimeoutDis+ NROPrPrP- LTR-
                         10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
                         EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
                         FRS- TPHComp- ExtTPHComp-
                         AtomicOpsCap: 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled,
                         AtomicOpsCtl: ReqEn+
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete- EqualizationPhase1-
                         EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
                         Retimer- 2Retimers- CrosslinkRes: unsupported
        Capabilities: [48] Vital Product Data
                Product Name: CX512A - ConnectX-5 SFP28
                Read-only fields:
                        [PN] Part number: MCX512A-ACAT
                        [EC] Engineering changes: B7
                        [V2] Vendor specific: MCX512A-ACAT
                        [SN] Serial number: MT2231419526
                        [V3] Vendor specific: badfbdaa5b08ed118000e8ebd3c154dc
                        [VA] Vendor specific: MLX:MODL=CX512A:MN=MLNX:CSKU=V2:UUID=V3:PCI=V0
                        [V0] Vendor specific: PCIeGen3 x8
                        [VU] Vendor specific: MT2231419526MLNXS0D0F1
                        [RV] Reserved: checksum good, 0 byte(s) reserved
                End
        Capabilities: [9c] MSI-X: Enable+ Count=64 Masked-
                Vector table: BAR=0 offset=00002000
                PBA: BAR=0 offset=00003000
        Capabilities: [c0] Vendor Specific Information: Len=18 <?>
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0-,D1-,D2-,D3hot-,D3cold+)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [100 v1] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
                UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO+ CmpltAbrt- UnxCmplt+ RxOF+ MalfTLP+ ECRC+ UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
                AERCap: First Error Pointer: 08, ECRCGenCap+ ECRCGenEn+ ECRCChkCap+ ECRCChkEn+
                        MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
                HeaderLog: 00000000 00000000 00000000 00000000
        Capabilities: [150 v1] Alternative Routing-ID Interpretation (ARI)
                ARICap: MFVC- ACS-, Next Function: 0
                ARICtl: MFVC- ACS-, Function Group: 0
        Capabilities: [180 v1] Single Root I/O Virtualization (SR-IOV)
                IOVCap: Migration-, Interrupt Message Number: 000
                IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy-
                IOVSta: Migration-
                Initial VFs: 16, Total VFs: 16, Number of VFs: 0, Function Dependency Link: 01
                VF offset: 17, stride: 1, Device ID: 1018
                Supported Page Size: 000007ff, System Page Size: 00000001
                Region 0: Memory at 000002004c000000 (64-bit, prefetchable)
                VF Migration: offset: 00000000, BIR: 0
        Capabilities: [230 v1] Access Control Services
                ACSCap: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
                ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
        Kernel driver in use: mlx5_core
        Kernel modules: mlx5_core
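
As a cross-check, the address in the failing log line (85:02.2) can be reproduced from the "VF offset" and "stride" values in the SR-IOV capability above. The sketch below assumes the usual routing-ID rule, VF devfn = PF devfn + offset + stride * index, with the PF at device 0 and all VFs on the same bus; vf_bdf is a name made up for this example.

```shell
# Compute the PCI address of a VF from its PF's function number and the
# "VF offset" / "stride" values printed by lspci.
# Arguments: BUS PF_FUNCTION OFFSET STRIDE VF_INDEX (index starts at 0)
# Assumption: the PF is at device 0 and the VF stays on the same bus.
vf_bdf() {
    bus=$1; pf_fn=$2; offset=$3; stride=$4; index=$5
    devfn=$(( (pf_fn + offset + stride * index) & 0xff ))
    # devfn is shown as device (upper 5 bits) . function (lower 3 bits)
    printf '%s:%02x.%x\n' "$bus" $(( devfn >> 3 )) $(( devfn & 7 ))
}

vf_bdf 85 0 2 1 0    # first VF of 85:00.0 (offset 2)  -> 85:00.2
vf_bdf 85 1 17 1 0   # first VF of 85:00.1 (offset 17) -> 85:02.2
```

The first VF of the second PF therefore sits at function number 18, which is only reachable as an ARI function number because a non-ARI device has functions 0 to 7. In the good case the second PF's VF appeared at 85:00.7, which would correspond to an offset of 6 (inferred from the log, not read from lspci). If the second PF's offset is always NUM_OF_VFS + 1, as in both cases here, its first VF lands on function NUM_OF_VFS + 2, which exceeds 7 exactly when NUM_OF_VFS >= 6. That matches the threshold in the title, but it is only a consistency check, not a confirmed root cause.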

Hi, the cause has been identified. The issue comes from the motherboard/BIOS configuration (which is not fixed yet).

I plugged the CX-5 NIC into another PC and everything works (127 VFs x 2).
