I downloaded the BFB from the official NVIDIA website and reflashed it using the command below. The process completed without any errors. After a cold reboot, the system still fails to boot up, as shown in the log screenshot below. Could you please advise on what I should do to restore normal operation of the DPU BMC?
I reviewed the console logs again and found the following entries:
[15:28:39] 6: vlan4040@oob_net0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether e8:**:**:**:**:** brd ff:ff:ff:ff:ff:ff
inet6 fe80::****:****:fe85:568/64 scope link
valid_lft forever preferred_lft forever
[15:28:39] - ERROR: Failed to create VLAN interface after 311 sec. All the BMC related operations will be skipped.
[15:28:39] WARN Skipping BMC components upgrade.
[15:28:39] BFB-Installer: Installing BMC Image failed, total 58% complete
[15:28:39] BFB-Installer: Installing Glacier Image failed, total 59% complete
[15:28:39] BFB-Installer: Installing DPU Golden Image failed, total 76% complete
[15:28:39] BFB-Installer: Installing NIC FW Golden Image failed, total 94% complete
[15:28:42] INFO: Updating NIC firmware…
BlueField DPU, may load BlueField firmware.
Initializing…
Scanning /opt/mellanox/mlnx-fw-updater/firmware//mlxfwmanager_sriov_dis_aarch64_41686
Scanning /opt/mellanox/mlnx-fw-updater/firmware//mlxfwmanager_sriov_dis_aarch64_41692
Attempting to perform Firmware update…
This seems like an issue on the BMC side.
I suggest first reseating the DPU and unplugging and replugging the relevant cables.
Once done, please try pushing the BFB again with default settings to see if it works.
If this still doesn’t work, please open a case with Enterprise Support, and it will be handled based on your entitlement.
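
For anyone triaging a similar console log, here is a minimal sketch that lists the installer steps reported as failed along with any ERROR lines, so the root cause (here, the VLAN interface error) is easier to spot. It assumes the log uses the same message formats as the excerpt quoted above; the function name `summarize_bfb_log` and the regular expressions are my own illustration, not part of any NVIDIA tool.

```python
import re

# Patterns are assumptions based only on the message formats in the log excerpt above.
FAILED_STEP = re.compile(
    r"BFB-Installer: Installing (?P<step>.+?) failed, total (?P<pct>\d+)% complete"
)
ERROR_LINE = re.compile(r"ERROR: (?P<msg>.+)")


def summarize_bfb_log(text):
    """Return (errors, failed_steps) found in a BFB install console log.

    errors is a list of message strings; failed_steps is a list of
    (step name, reported percent complete) tuples in log order.
    """
    errors = []
    failed_steps = []
    for line in text.splitlines():
        match = FAILED_STEP.search(line)
        if match:
            failed_steps.append((match.group("step"), int(match.group("pct"))))
            continue
        match = ERROR_LINE.search(line)
        if match:
            errors.append(match.group("msg").strip())
    return errors, failed_steps


# Lines copied from the console log in this thread.
SAMPLE = """\
[15:28:39] - ERROR: Failed to create VLAN interface after 311 sec. All the BMC related operations will be skipped.
[15:28:39] WARN Skipping BMC components upgrade.
[15:28:39] BFB-Installer: Installing BMC Image failed, total 58% complete
[15:28:39] BFB-Installer: Installing Glacier Image failed, total 59% complete
[15:28:39] BFB-Installer: Installing DPU Golden Image failed, total 76% complete
[15:28:39] BFB-Installer: Installing NIC FW Golden Image failed, total 94% complete
"""

if __name__ == "__main__":
    found_errors, found_failures = summarize_bfb_log(SAMPLE)
    for message in found_errors:
        print("ERROR:", message)
    for step, percent in found_failures:
        print(f"FAILED: {step} (at {percent}%)")
```

In this log, all four failed steps carry the same timestamp as the single VLAN error, which is consistent with the BMC-side diagnosis above. Including a summary like this when opening a support case may save a round trip.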