doca_devinfo_list_create() doesn't find any devices

Hello,

I have an Ubuntu computer with one BlueField-2 SmartNIC installed. I wrote my application using the DOCA DMA library, and until today the program worked fine (copying data between the host and the DPU).

After I rebooted the SmartNIC, the application returns an error.

I found that the cause is the library function doca_devinfo_list_create(&dev_list, &nb_devs), which returns with nb_devs equal to 0.

The lspci results on the BlueField-2 are as follows:
ubuntu@localhost:~$ lspci
00:00.0 PCI bridge: Mellanox Technologies MT42822 BlueField-2 SoC Crypto enabled (rev 01)
01:00.0 PCI bridge: Mellanox Technologies MT42822 Family [BlueField-2 SoC PCIe Bridge] (rev 01)
02:00.0 PCI bridge: Mellanox Technologies MT42822 Family [BlueField-2 SoC PCIe Bridge] (rev 01)
03:00.0 Ethernet controller: Mellanox Technologies MT42822 BlueField-2 integrated ConnectX-6 Dx network controller (rev 01)
03:00.1 Ethernet controller: Mellanox Technologies MT42822 BlueField-2 integrated ConnectX-6 Dx network controller (rev 01)

The lspci results on the host are as follows:
86:00.0 Ethernet controller: Mellanox Technologies MT42822 BlueField-2 integrated ConnectX-6 Dx network controller (rev 01)
86:00.1 Ethernet controller: Mellanox Technologies MT42822 BlueField-2 integrated ConnectX-6 Dx network controller (rev 01)
86:00.2 Non-Volatile memory controller: Mellanox Technologies NVMe SNAP Controller
86:00.3 DMA controller: Mellanox Technologies MT42822 BlueField-2 SoC Management Interface (rev 01)
87:00.0 Non-Volatile memory controller: Mellanox Technologies NVMe SNAP Controller
88:00.0 Non-Volatile memory controller: Mellanox Technologies NVMe SNAP Controller

I am facing exactly the same issue with my BlueField-2. I am using DOCA version 1.5.0. It was working without any problem a few months ago.

I was able to fix the issue. It turned out my DPU was set to NIC mode instead of DPU mode. Check the configuration of your DPU using the mlxconfig tool:

$ sudo mlxconfig -d /dev/mst/mt41686_pciconf0 q

Specifically check these configurations:

         INTERNAL_CPU_MODEL                          EMBEDDED_CPU(1)
         INTERNAL_CPU_PAGE_SUPPLIER                  ECPF(0)
         INTERNAL_CPU_ESWITCH_MANAGER                ECPF(0)
         INTERNAL_CPU_IB_VPORT0                      ECPF(0)
         INTERNAL_CPU_OFFLOAD_ENGINE                 ENABLED(0)
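
To avoid comparing these five values by eye, the check can be scripted. The following is only a rough sketch: the function name check_dpu_mode and its OK/MISMATCH/MISSING messages are made up for illustration, and it assumes the query output contains lines in the "NAME VALUE" form shown above. The output layout may differ between mlxconfig versions.

```shell
#!/bin/sh
# check_dpu_mode (hypothetical helper): reads `mlxconfig ... q` output on
# stdin and reports any of the five settings above that differ from the
# DPU-mode values quoted in this post.
check_dpu_mode() {
    awk '
        BEGIN {
            bad = 0
            want["INTERNAL_CPU_MODEL"]           = "EMBEDDED_CPU(1)"
            want["INTERNAL_CPU_PAGE_SUPPLIER"]   = "ECPF(0)"
            want["INTERNAL_CPU_ESWITCH_MANAGER"] = "ECPF(0)"
            want["INTERNAL_CPU_IB_VPORT0"]       = "ECPF(0)"
            want["INTERNAL_CPU_OFFLOAD_ENGINE"]  = "ENABLED(0)"
        }
        # Assumed line format: NAME VALUE (awk skips the leading spaces).
        ($1 in want) {
            seen[$1] = 1
            if ($2 != want[$1]) {
                printf "MISMATCH %s: got %s, expected %s\n", $1, $2, want[$1]
                bad = 1
            }
        }
        END {
            # Flag settings that never appeared in the input.
            for (name in want)
                if (!(name in seen)) {
                    printf "MISSING %s\n", name
                    bad = 1
                }
            if (!bad) print "OK: all five settings match DPU mode"
            exit bad   # 0 when everything matches, 1 otherwise
        }
    '
}
```

Usage would look like `sudo mlxconfig -d /dev/mst/mt41686_pciconf0 q | check_dpu_mode`. If a value differs, it can be changed with the set command of the same tool (`mlxconfig -d <device> s NAME=VALUE`); as far as I know, the new value only takes effect after a power cycle of the host.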

Hello @qq2196651959,

Thank you for posting your query on our community.

In order to debug this issue, we would request more details on the exact error seen and the process followed.
Did changing to DPU mode resolve the issue? If not, please submit a support ticket for further troubleshooting. A support ticket can be opened by emailing Networking-support@nvidia.com. Please note that an active support contract is required. For contract information, please reach out to our contracts team at Networking-Contracts@nvidia.com.

Thanks,
Bhargavi