Can't open the device after changing NUM_OF_VFS

I used mlxconfig to set NUM_OF_VFS to 127, as described in the OFED documentation. However, after rebooting my machine, I can’t find the device using ip a.

Here is the system information:

OS: Ubuntu 18.04
Kernel version: 4.15.0
device: ConnectX6DX(MCX623106AN-CDA_Ax)
OFED Version: OFED-5.8-3.0.7

lspci -v | grep nox

06:00.0 Ethernet controller: Mellanox Technologies MT28841
	Subsystem: Mellanox Technologies MT28841
06:00.1 Ethernet controller: Mellanox Technologies MT28841
	Subsystem: Mellanox Technologies MT28841

lsmod | grep nox

mlx5_ib               397312  0
ib_uverbs             135168  2 rdma_ucm,mlx5_ib
ib_core               339968  9 rdma_cm,ib_ipoib,iw_cm,ib_iser,ib_umad,rdma_ucm,ib_uverbs,mlx5_ib,ib_cm
mlx5_core            1540096  1 mlx5_ib
mlxdevm               172032  1 mlx5_core
auxiliary              16384  2 mlx5_ib,mlx5_core
mlxfw                  24576  1 mlx5_core
psample                20480  1 mlx5_core
devlink                45056  1 mlx5_core
mlx_compat             49152  13 rdma_cm,ib_ipoib,mlxdevm,iw_cm,auxiliary,ib_iser,ib_umad,ib_core,rdma_ucm,ib_uverbs,mlx5_ib,ib_cm,mlx5_core
ptp                    20480  4 igb,e1000e,mlx5_core,ixgbe

dmesg | grep mlx

[   10.602692] mlx5_core 0000:06:00.0: Missing registers BAR, aborting
[   10.602699] mlx5_core 0000:06:00.0: mlx5_pci_init:1125:(pid 366): error requesting BARs, aborting
[   10.604469] mlx5_core 0000:06:00.0: probe_one:2133:(pid 366): mlx5_pci_init failed with error code -19
[   10.604790] mlx5_core 0000:06:00.1: Missing registers BAR, aborting
[   10.604797] mlx5_core 0000:06:00.1: mlx5_pci_init:1125:(pid 366): error requesting BARs, aborting
[   10.607237] mlx5_core 0000:06:00.1: probe_one:2133:(pid 366): mlx5_pci_init failed with error code -19
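To see at a glance which PCI functions hit the BAR error, the log above can be filtered down to the affected addresses. This is a minimal sketch: the function name bar_failures is hypothetical, and it assumes the message format shown in the dmesg output above.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and print the PCI
# address of every mlx5_core function that reported a missing registers BAR.
bar_failures() {
    sed -n 's/.*mlx5_core \([0-9a-f:.]*\): Missing registers BAR.*/\1/p' | sort -u
}

# Usage on the affected machine:
#   dmesg | bar_failures
# Each address can then be inspected with lspci, for example:
#   lspci -vv -s 0000:06:00.0 | grep -i region
```

If the Region lines for those addresses are missing or shown as unassigned, that would be consistent with the BARs not having been allocated by the platform, rather than with a driver binding problem.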

It seems that the driver is not bound to the device. Can you help me solve this problem? Thanks!

By the way, I found that the maximum valid value of NUM_OF_VFS is 32. This is far below the 1K+ VFs advertised in the documentation. Is there any way to increase this value?

Hello,

Thank you for posting your inquiry on the NVIDIA Developer Forum - Infrastructure and Networking - Section.

Based on the information provided, this is not an NVIDIA adapter issue but rather a system issue.

  • Make sure that SR-IOV is enabled in the system BIOS.
  • PCI memory allocation: the kernel cannot allocate enough memory to bring up all VFs. Please add the following kernel parameter through grub2:
    * Edit /etc/default/grub
    * Add GRUB_CMDLINE_LINUX_DEFAULT=" pci=realloc=off"
    * Run update-grub
    * Reboot
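
The grub steps above can be sketched as a small script. This is an illustration under assumptions, not an official procedure: the function name add_kernel_param is hypothetical, GNU sed is assumed (for the -i option), and the sketch appends the parameter to any existing value of GRUB_CMDLINE_LINUX_DEFAULT instead of overwriting it, so options such as "quiet splash" are preserved.

```shell
#!/bin/sh
# Hypothetical helper: append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT
# in the given file, unless that parameter is already present.
add_kernel_param() {
    file=$1
    param=$2
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*${param}" "$file"; then
        return 0    # already set, nothing to do
    fi
    if grep -q '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file"; then
        # Insert the parameter just before the closing double quote.
        sed -i "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"|\1 ${param}\"|" "$file"
    else
        # No such line yet: add one.
        echo "GRUB_CMDLINE_LINUX_DEFAULT=\"${param}\"" >> "$file"
    fi
}

# Usage on a real system (as root), followed by the remaining steps:
#   cp /etc/default/grub /etc/default/grub.bak
#   add_kernel_param /etc/default/grub pci=realloc=off
#   update-grub
#   reboot
```

After the reboot, cat /proc/cmdline shows whether the parameter actually reached the running kernel.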

Thank you and regards,
~NVIDIA Networking Technical Support
