Cannot create VFs on one PF, but can create on another

On the host side, I can create VFs on enp1s0f0 like this:

echo 2 > /sys/class/net/enp1s0f0/device/sriov_numvfs

But it doesn’t work on enp1s0f1:

echo 2 > /sys/class/net/enp1s0f1/device/sriov_numvfs

It returns

bash: echo: write error: Input/output error

I checked the driver binding using

dpdk-devbind.py -s

and with the mlxconfig tool; there is no difference between the two ports except some SF configuration.

The dmesg log shows:

mlx5_core 0000:01:00.1: mlx5_sriov_enable:186:(pid 6253): pci_enable_sriov failed : -5
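
For reference, -5 is EIO, the same "Input/output error" that the write to sriov_numvfs reports. One way to narrow this down is to compare the SR-IOV attributes the kernel exposes for each PF. This is a minimal sketch, assuming a kernel that provides sriov_offset and sriov_stride in sysfs; sriov_summary is a hypothetical helper name, not an existing tool.

```shell
# sriov_summary DEV_DIR
# Print the SR-IOV sysfs attributes found under DEV_DIR, one per line.
# Attributes the kernel does not expose are reported as "unavailable".
sriov_summary() {
    dev_dir=$1
    for attr in sriov_totalvfs sriov_numvfs sriov_offset sriov_stride; do
        if [ -r "$dev_dir/$attr" ]; then
            printf '%s=%s\n' "$attr" "$(cat "$dev_dir/$attr")"
        else
            printf '%s=unavailable\n' "$attr"
        fi
    done
}
```

Running `sriov_summary /sys/class/net/enp1s0f0/device` and then the same for enp1s0f1 gives two lists that can be compared side by side. A difference in sriov_totalvfs between the ports would be the first thing to look for.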

Hi Waterzhu,

What OS/Kernel/MLNX_OFED/FW are running on the host?
What type of server?
SRIOV has been enabled on the second port via our mlxconfig utility, am I correct?
Is this server running the latest BIOS version?
Can you check if your BIOS ARI (Alternate Routing ID) setting is enabled?

Sophie.

Note:

Summary of Hardware Considerations for SR-IOV

Firmware (BIOS or UEFI) must support SR-IOV. Check if the extension is enabled by default. If not, enable it manually. This is similar to enabling the virtualization extension (VT-d or AMD-Vi). Refer to vendor manuals for specific details.

Root ports, or ports immediately upstream of the PCIe device (such as a PCIe switch), must support ARI.

PCIe device must support SR-IOV.
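
The ARI point above can be checked from the OS as well as in the BIOS menu. The sketch below filters `lspci -vvv` text for the ARI flags that pciutils prints (ARIFwd in the DevCap2/DevCtl2 lines of the upstream port, ARIHierarchy in the IOVCtl line of the PF). ari_flags is a hypothetical helper, and the exact field layout can vary between pciutils versions.

```shell
# ari_flags
# Read `lspci -vvv` output on stdin and print each distinct ARI-related
# flag it contains, e.g. "ARIFwd+" or "ARIHierarchy-".
ari_flags() {
    grep -o -E 'ARI(Fwd|Hierarchy)[+-]' | sort -u
}
```

For example, `sudo lspci -vvv -s 01:00.1 | ari_flags` covers the second PF; the same filter can be applied to the root port above the card once its address is known. A trailing "-" suggests the feature is unsupported or disabled.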

It’s Ubuntu 20.0 with kernel 5.4.0-26-generic.
The server is a Supermicro X11.
What I mean is that there is one BF2 in the host: one port can use SR-IOV on the host, but the other cannot. I create the VFs through the kernel interface (echo 2 > /sys/class/net/enp1s0f0/device/sriov_numvfs), not via mlxconfig.
I only use mlxconfig to check that the SR-IOV flag is set.
I assumed the hardware is OK, because SR-IOV works on one port.
Or do you mean that a BIOS problem could show up as one port creating VFs successfully while the other cannot?

Hello all,

I’m experiencing exactly the same issue as @waterzhu :

$ uname -a
Linux 5.15.0-69-generic #76~20.04.1-Ubuntu SMP Mon Mar 20 15:54:19 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

$ sudo mlxfwmanager
Querying Mellanox devices firmware ...

Device #1:
----------

  Device Type:      ConnectX6DX
  Part Number:      MCX623106AN-CDA_Ax
  Description:      ConnectX-6 Dx EN adapter card; 100GbE; Dual-port QSFP56; PCIe 4.0/3.0 x16;
  PSID:             MT_0000000359
  PCI Device Name:  /dev/mst/mt4125_pciconf0
  Base GUID:        <redacted>
  Base MAC:         <redacted>
  Versions:         Current        Available     
     FW             22.31.1014     N/A           
     PXE            3.6.0403       N/A           
     UEFI           14.24.0013     N/A           

  Status:           No matching image found

$ sudo mlxconfig -d /dev/mst/mt4125_pciconf0 q | grep SRIOV
         SRIOV_EN                                    True(1)         
         SRIOV_IB_ROUTING_MODE_P1                    LID(1)          
         SRIOV_IB_ROUTING_MODE_P2                    LID(1)
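
The grep above only matches parameters with SRIOV in their name, so it hides NUM_OF_VFS, the mlxconfig parameter that holds the firmware VF count. A small sketch for pulling a single parameter out of the query output follows; mlx_param is a hypothetical helper, and it assumes the two-column "name value" layout shown above.

```shell
# mlx_param NAME
# Read `mlxconfig ... q` output on stdin and print the value column
# of the parameter called NAME (prints nothing if NAME is absent).
mlx_param() {
    awk -v name="$1" '$1 == name { print $2 }'
}
```

For example, `sudo mlxconfig -d /dev/mst/mt4125_pciconf0 q | mlx_param NUM_OF_VFS`, repeated with /dev/mst/mt4125_pciconf0.1, would show whether both ports report the same VF count.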

Note that /dev/mst/ has two entries, mt4125_pciconf0 and mt4125_pciconf0.1, but mlxfwmanager only returns the former. Could that be the problem?

When I try to create VFs on both ports, I get the following result:

$ echo 4 | sudo tee -a /sys/class/net/enp1s0f0/device/sriov_numvfs 
4
$ echo 4 | sudo tee -a /sys/class/net/enp1s0f1/device/sriov_numvfs 
4
tee: /sys/class/net/enp1s0f1/device/sriov_numvfs: Input/output error

And on port 0 the VFs are set up:

$ lspci | grep Mellanox
01:00.0 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
01:00.1 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
01:00.2 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
01:00.3 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
01:00.4 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
01:00.5 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
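
The VF addresses listed above follow the SR-IOV routing ID rule: VF number i (counting from 0) sits at PF devfn + First VF Offset + i * VF Stride, with anything above 255 carried into the bus number. The sketch below reproduces that arithmetic. vf_addresses is a hypothetical helper, and the offset of 2 and stride of 1 used in the example are inferred from the lspci output for port 0, not read from the device.

```shell
# vf_addresses BUS DEVFN OFFSET STRIDE NUM_VFS
# BUS and DEVFN describe the PF (decimal; DEVFN is the 8-bit
# device/function value, so 01:00.1 is BUS=1 DEVFN=1).
# Prints the bus:device.function address of each VF.
vf_addresses() {
    bus=$1; devfn=$2; offset=$3; stride=$4; num=$5
    i=0
    while [ "$i" -lt "$num" ]; do
        rid=$(( devfn + offset + stride * i ))
        vf_bus=$(( bus + (rid >> 8) ))      # overflow moves to the next bus
        vf_devfn=$(( rid & 255 ))
        printf '%02x:%02x.%x\n' "$vf_bus" $(( vf_devfn >> 3 )) $(( vf_devfn & 7 ))
        i=$(( i + 1 ))
    done
}
```

`vf_addresses 1 0 2 1 4` prints 01:00.2 through 01:00.5, matching port 0. For port 1, the values in /sys/class/net/enp1s0f1/device/sriov_offset and sriov_stride can be passed with DEVFN 1 (the device may report different values depending on how many VFs are configured). If the resulting addresses collide with port 0's VFs or spill onto another bus, that might point toward an ARI or bus numbering problem, though this is only a guess.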

Since VFs can be set up on port 0, i.e. enp1s0f0, I guess the settings are alright.

So why can’t enp1s0f1 set up its own VFs?