Why are you trying to bind the PF to vfio-pci? What is the use case?
Are you using the MLNX_OFED driver or the OS inbox mlx5 driver?
Are you able to assign a VF via PCIe passthrough or SR-IOV using our mlx5 driver stack?
Are vfio* modules loaded (lsmod)?
Virtualization enabled via BIOS?
Is GRUB configured with iommu=on, iommu=pt?
Is the PCIe device correct? (lspci -Dk | grep -i mel)
Does the path exist? (ls -l /sys/bus/pci/drivers/vfio-pci)
Any hints in dmesg?
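A quick way to run through most of these checks in one pass (a sketch only; 0000:1a:00.0 below is an example address, substitute your own PF):
$ egrep -c '(vmx|svm)' /proc/cpuinfo            # non-zero if CPU virtualization is exposed
$ cat /proc/cmdline                             # IOMMU parameters present on the boot line?
$ lsmod | grep vfio                             # vfio, vfio_pci, vfio_iommu_type1 loaded?
$ lspci -Dk | grep -iA3 mel                     # device address and currently bound driver
$ ls -l /sys/bus/pci/drivers/vfio-pci           # vfio-pci registered in sysfs?
$ dmesg | grep -iE 'vfio|iommu|mlx5' | tail     # recent kernel hints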
Why are you trying to bind the PF to vfio-pci? What is the use case?
As I said in the topic Using NEO-Host on vm with SR-IOV, I have tried to use NEO-Host in a VM with a VF. As far as I know I installed everything, but NEO-Host does not work. I checked the log and noticed that it uses MST to get device information; however, mst status returns nothing. So I guess this may be a limitation of the VF, which is why I am trying to bind the PF to vfio-pci so that I can pass the PF through to the VM.
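For reference, this is roughly what I ran inside the VM after installing MFT:
$ sudo mst start
$ sudo mst status
mst status lists no MST devices for the VF, which is why I think NEO-Host cannot pick the device up.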
Are you using the MLNX_OFED driver or the OS inbox mlx5 driver?
I am not totally sure about this question. I have installed MLNX_OFED successfully.
Are you able to assign a VF via PCIe passthrough or SR-IOV using our mlx5 driver stack?
Are vfio* modules loaded (lsmod)?
Virtualization enabled via BIOS?
Is GRUB configured with iommu=on, iommu=pt?
Is the PCIe device correct? (lspci -Dk | grep -i mel)
Does the path exist? (ls -l /sys/bus/pci/drivers/vfio-pci)
The answers to all these questions are yes. As I said, I have already passed a VF through to the VM successfully, and ibv_*_pingpong works well in the VM.
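For example, the check between the VM (with the VF) and another host looks roughly like this (the device name and the address are placeholders, not my exact values):
$ ibv_rc_pingpong -d mlx5_0 -g 0                     # on the remote side
$ ibv_rc_pingpong -d mlx5_0 -g 0 <server address>    # inside the VM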
Any hints in dmesg?
This is the dmesg output when I unbind the device from the mlx5_core driver.
[2242079.400456] mlx5_core 0000:1a:00.0: E-Switch: Disable: mode(LEGACY), nvfs(4), active vports(5)
[2242081.284288] mlx5_core 0000:1a:00.0: E-Switch: cleanup
[2242082.307516] mlx5_core 0000:1a:00.0: driver left SR-IOV enabled after remove
This is the output of the vfio-pci bind attempt:
$ sudo bash -c "echo 0000:1a:00.0 > /sys/bus/pci/drivers/vfio-pci/bind"
bash: line 1: echo: write error: No such device
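My guess is that the write fails because vfio-pci does not yet know this device ID; as far as I understand, the ID has to be registered with new_id, or driver_override has to be set, before the bind write can succeed. A sketch of what I mean (the ID pair would come from lspci -nn, nothing here is taken from my actual output):
$ lspci -nns 1a:00.0                                                              # shows the [vendor:device] ID pair
$ sudo bash -c "echo vfio-pci > /sys/bus/pci/devices/0000:1a:00.0/driver_override"
$ sudo bash -c "echo 0000:1a:00.0 > /sys/bus/pci/drivers/vfio-pci/bind"
(or, instead of driver_override, register the ID pair once via /sys/bus/pci/drivers/vfio-pci/new_id)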
NEO-Host and our MFT tools are designed to be used on the bare-metal server, not in the VM.
The PF should be left as-is on the default MLNX_OFED (mlx5_core) driver for SR-IOV and PCIe passthrough.
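For reference, a minimal VF flow that keeps the PF on mlx5_core looks roughly like this (a sketch; the interface name, VF address, and VF count are examples, not values from your system):
$ sudo bash -c "echo 4 > /sys/class/net/<pf_interface>/device/sriov_numvfs"         # create the VFs on the PF
$ sudo bash -c "echo vfio-pci > /sys/bus/pci/devices/0000:1a:00.2/driver_override"
$ sudo bash -c "echo 0000:1a:00.2 > /sys/bus/pci/drivers/mlx5_core/unbind"          # release the VF from mlx5_core
$ sudo bash -c "echo 0000:1a:00.2 > /sys/bus/pci/drivers/vfio-pci/bind"             # vfio-pci claims the VF
The PF at 0000:1a:00.0 stays bound to mlx5_core throughout, and tools such as NEO-Host and MFT run against it on the bare-metal host.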
Thank you. That explains why I cannot configure this successfully. I would like to install NEO-Host in the VM because all of the servers in my lab run Ubuntu, while I only received the .rpm package from the vendor. May I ask whether you have a NEO-Host package for Debian-based distributions?
If you do not have support from NVIDIA, the vendor you mentioned that provided you the .rpm package should also have access to the Debian-based package from NVIDIA.
Thank you again for your reply. I personally do not have an account for this, but from the NVIDIA Enterprise link I noticed that we can register for an enterprise account if we have a licensed product, and we do have many in my lab. In that case, how can I register for such an account? Thank you.