How to update BlueField-2 DPU with the DOCA 2.2.0 firmware image

This technote describes the steps to update a BlueField-2 DPU with the DOCA 2.2.0 firmware image. If you have a BlueField-2 DPU running an older version (e.g. DOCA 1.2.0) and attempt to update the firmware with sdkmanager, you might end up bricking the DPU. To flash the latest image correctly to an older BlueField-2 DPU, the newer releases require the BlueField mode to be set in the UEFI BIOS and secure boot to be disabled.

NVIDIA BlueField-2 DPU VPI - Ethernet Setup Ubuntu-22.04

Overview

This document describes how to set up an NVIDIA BlueField-2 DPU VPI network adapter and configure its ports for Ethernet.

Important:
DOCA 2.2.0 only supports NVIDIA® BlueField®-2 DPUs. For BlueField-3, refer to DOCA 2.2.1 or upgrade to it.

Procedure

Step 01.00: Configure host computer system.

Step 01.01: BIOS settings.

Configure the following BIOS settings for an Intel 12th Gen Z690-chipset based host computer system motherboard:

CSM support: Enabled.

PCI Express Slew Rate: Fast

Step 02.00: Install DPU hardware into a computer system.

Step 02.01: Install DPU hardware into a PCIe slot.

Install the DPU adapter into a spare PCIe slot.

Step 02.02: Verify DPU hardware detection.

Boot the system and check if the DPU is recognized:

lspci | grep -i mellanox

01:00.0 Ethernet controller: Mellanox Technologies MT42822 BlueField-2 integrated ConnectX-6 Dx network controller (rev 01)
01:00.1 Ethernet controller: Mellanox Technologies MT42822 BlueField-2 integrated ConnectX-6 Dx network controller (rev 01)
01:00.2 DMA controller: Mellanox Technologies MT42822 BlueField-2 SoC Management Interface (rev 01)
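
The check above can also be scripted. The following is a minimal sketch, assuming (as in this example) that a detected card shows up as three PCI functions whose description contains the string BlueField-2. The function name count_bf2_functions is illustrative and not part of any NVIDIA tool.

```shell
# Count PCI functions whose lspci description mentions BlueField-2.
# Reads lspci output on stdin, so it can be tried without hardware.
count_bf2_functions() {
    # grep -c prints 0 and returns non-zero when nothing matches;
    # "|| true" keeps the function's exit status at 0 in that case.
    grep -c 'BlueField-2' || true
}

# Usage on the host (the card shown above should report 3):
#   lspci | count_bf2_functions
```

A result of 0 suggests the card is not seated correctly or the slot is disabled in the BIOS.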

Step 03.00: Install DPU drivers.

You can download the latest DOCA drivers from the following website:

DOCA_WEBSITE: https://developer.nvidia.com/networking/doca

For instructions on how to install DOCA for BlueField-2 DPUs, see the section “Installing DOCA on BlueField-2 DPU”.

To install a new DOCA Runtime package, you must first remove the existing package.
For details, refer to the section “Upgrading BlueField DPU OS on Host Side” in the BlueField User Manual
(https://docs.mellanox.com/category/bluefieldsw).

sudo -s

# remove all doca packages
for f in $( dpkg --list | grep doca | awk '{print $2}' ); do echo $f ; apt remove --purge $f -y ; done

sudo apt autoremove
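
Because the loop above matches the word doca anywhere in a dpkg line (including the description column), it can be worth reviewing the list before purging. The sketch below is one way to do that. It assumes the standard dpkg --list column layout (status, name, version, ...), and the helper name list_doca_packages is made up for this example.

```shell
# Print the names of packages whose NAME contains "doca".
# Reads `dpkg --list` output on stdin. Only rows in state "ii" (installed)
# or "rc" (removed, config files remain) are considered.
list_doca_packages() {
    awk '$1 ~ /^(ii|rc)$/ && $2 ~ /doca/ { print $2 }'
}

# Usage: review first, then purge.
#   dpkg --list | list_doca_packages
#   dpkg --list | list_doca_packages | xargs -r sudo apt remove --purge -y
```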

Step 03.01: Install DOCA runtime, sdk and tools.

export DOCA_URL=https://www.mellanox.com/downloads/DOCA/DOCA_v2.2.0
export DOCA_INSTALLER=doca-host-repo-ubuntu2204_2.2.0-0.0.3.2.2.0080.1.23.07.0.5.0.0_amd64.deb

# download
wget $DOCA_URL/$DOCA_INSTALLER

sudo dpkg -i $DOCA_INSTALLER
sudo apt update
sudo apt dist-upgrade

sudo apt install doca-runtime
sudo apt install doca-tools
sudo apt install doca-sdk

# install dpdk-doc
sudo apt install mlnx-dpdk-doc

# optional packages: rdnssd linux-image-generic openvswitch-datapath-module

# reboot the system

To remove the doca packages:

dpkg -r $(dpkg -f $DOCA_INSTALLER Package)

# example
sudo dpkg -r doca-host-repo-ubuntu2204

If you run into issues with broken packages, install synaptic to fix broken packages:

sudo apt install synaptic

# select a broken package and mark it for removal

Step 04.00: Configure network interfaces.

Step 04.01: Change Mellanox VPI ports from InfiniBand to Ethernet

Mellanox makes three main types of cards: Ethernet only, InfiniBand only, and VPI cards capable of both. You need the VPI version, and you may need to check the model number against a spec sheet to ensure you have a VPI-capable card.

Swapping from InfiniBand to Ethernet or back on a Mellanox ConnectX or DPU VPI card is simple.

First, we see what devices are installed.

# you must be root to use mst tool
sudo -s

# determine device-id
mst start
mst status -v

MST modules:
------------
    MST PCI module is not loaded
    MST PCI configuration module loaded
PCI devices:
------------
DEVICE_TYPE             MST                           PCI       RDMA            NET                                     NUMA  
BlueField2(rev:1)       /dev/mst/mt41686_pciconf0.1   01:00.1   mlx5_1          net-enp1s0f1np1                         -1    

BlueField2(rev:1)       /dev/mst/mt41686_pciconf0     01:00.0   mlx5_0          net-enp1s0f0np0                         -1   

The device ID for the BlueField-2 in this example is /dev/mst/mt41686_pciconf0
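
If you want to pick up the device path in a script rather than by eye, the listing can be filtered. This is a small sketch that assumes the column layout shown above (device type in the first column, MST path in the second) and that the first port is the entry without a numeric suffix such as .1. The function name bf2_mst_devices is hypothetical.

```shell
# Extract the primary MST device path(s) of BlueField-2 cards.
# Reads `mst status -v` output on stdin and skips the per-port aliases
# that end in ".<number>" (for example ..._pciconf0.1).
bf2_mst_devices() {
    awk '$1 ~ /^BlueField2/ && $2 ~ /^\/dev\/mst\// && $2 !~ /\.[0-9]+$/ { print $2 }'
}

# Usage (as root, after "mst start"):
#   mst status -v | bf2_mst_devices
```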

Display device configuration information:

# help
sudo mlxconfig -h

Examples:
    To query configurations            : mlxconfig -d /dev/mst/mt4099_pciconf0 query
    To set configuration               : mlxconfig -d /dev/mst/mt4099_pciconf0 set SRIOV_EN=1 NUM_OF_VFS=16 WOL_MAGIC_EN_P1=1
    To set raw configuration           : mlxconfig -d /dev/mst/mt4115_pciconf0 -f conf_file set_raw
    To reset configuration             : mlxconfig -d /dev/mst/mt4099_pciconf0 reset

# display information about all configurations
sudo mlxconfig -d 01:00.0 i

# query device configuration
sudo mlxconfig -d 01:00.0 query

Device #1:
----------

Device type:    BlueField2
Name:           MBF2M516A-EECO_Ax_Bx
Description:    BlueField-2 E-Series DPU 100GbE/EDR/HDR100 VPI Dual-Port QSFP56; PCIe Gen4 x16; Crypto and Secure Boot Enabled; 16GB on-board DDR; 1GbE OOB management; FHHL
Device:         01:00.0

Configurations:                       Next Boot
  MEMIC_BAR_SIZE                      0
  MEMIC_SIZE_LIMIT                    _256KB(1)
  HOST_CHAINING_MODE                  DISABLED(0)
  HOST_CHAINING_CACHE_DISABLE         False(0)
  HOST_CHAINING_DESCRIPTORS           Array[0..7]
  HOST_CHAINING_TOTAL_BUFFER_SIZE     Array[0..7]
  INTERNAL_CPU_MODEL                  EMBEDDED_CPU(1)
  FLEX_PARSER_PROFILE_ENABLE          0
  FLEX_IPV4_OVER_VXLAN_PORT           0
  ROCE_NEXT_PROTOCOL                  254
  ESWITCH_HAIRPIN_DESCRIPTORS         Array[0..7]
  ESWITCH_HAIRPIN_TOT_BUFFER_SIZE     Array[0..7]
  PF_BAR2_SIZE                        3
  PF_NUM_OF_VF_VALID                  False(0)
  NON_PREFETCHABLE_PF_BAR             False(0)
  VF_VPD_ENABLE                       False(0)
  PF_NUM_PF_MSIX_VALID                False(0)
  PER_PF_NUM_SF                       False(0)
  STRICT_VF_MSIX_NUM                  False(0)
  VF_NODNIC_ENABLE                    False(0)
  NUM_PF_MSIX_VALID                   True(1)
  NUM_OF_VFS                          16
  NUM_OF_PF                           2
  PF_BAR2_ENABLE                      True(1)
  HIDE_PORT2_PF                       False(0)
  SRIOV_EN                            True(1)
  PF_LOG_BAR_SIZE                     5
  VF_LOG_BAR_SIZE                     1
  NUM_PF_MSIX                         63
  NUM_VF_MSIX                         11
  INT_LOG_MAX_PAYLOAD_SIZE            AUTOMATIC(0)
  PCIE_CREDIT_TOKEN_TIMEOUT           0
  LAG_RESOURCE_ALLOCATION             DEVICE_DEFAULT(0)
  PHY_COUNT_LINK_UP_DELAY             DELAY_NONE(0)
  ACCURATE_TX_SCHEDULER               False(0)
  PARTIAL_RESET_EN                    False(0)
  RESET_WITH_HOST_ON_ERRORS           False(0)
  NVME_EMULATION_ENABLE               False(0)
  NVME_EMULATION_NUM_VF               0
  NVME_EMULATION_NUM_PF               1
  NVME_EMULATION_VENDOR_ID            5555
  NVME_EMULATION_DEVICE_ID            24577
  NVME_EMULATION_CLASS_CODE           67586
  NVME_EMULATION_REVISION_ID          0
  NVME_EMULATION_SUBSYSTEM_VENDOR_ID  0
  NVME_EMULATION_SUBSYSTEM_ID         0
  NVME_EMULATION_NUM_MSIX             0
  PCI_SWITCH_EMULATION_NUM_PORT       0
  PCI_SWITCH_EMULATION_ENABLE         False(0)
  VIRTIO_NET_EMULATION_ENABLE         False(0)
  VIRTIO_NET_EMULATION_NUM_VF         0
  VIRTIO_NET_EMULATION_NUM_PF         0
  VIRTIO_NET_EMU_SUBSYSTEM_VENDOR_ID  6900
  VIRTIO_NET_EMULATION_SUBSYSTEM_ID   1
  VIRTIO_NET_EMULATION_NUM_MSIX       2
  VIRTIO_BLK_EMULATION_ENABLE         False(0)
  VIRTIO_BLK_EMULATION_NUM_VF         0
  VIRTIO_BLK_EMULATION_NUM_PF         0
  VIRTIO_BLK_EMU_SUBSYSTEM_VENDOR_ID  6900
  VIRTIO_BLK_EMULATION_SUBSYSTEM_ID   2
  VIRTIO_BLK_EMULATION_NUM_MSIX       2
  PCI_DOWNSTREAM_PORT_OWNER           Array[0..15]
  CQE_COMPRESSION                     BALANCED(0)
  IP_OVER_VXLAN_EN                    False(0)
  MKEY_BY_NAME                        False(0)
  PRIO_TAG_REQUIRED_EN                False(0)
  UCTX_EN                             True(1)
  REAL_TIME_CLOCK_ENABLE              False(0)
  RDMA_SELECTIVE_REPEAT_EN            False(0)
  PCI_ATOMIC_MODE                     PCI_ATOMIC_DISABLED_EXT_ATOMIC_ENABLED(0)
  TUNNEL_ECN_COPY_DISABLE             False(0)
  LRO_LOG_TIMEOUT0                    6
  LRO_LOG_TIMEOUT1                    7
  LRO_LOG_TIMEOUT2                    8
  LRO_LOG_TIMEOUT3                    13
  LOG_TX_PSN_WINDOW                   7
  LOG_MAX_OUTSTANDING_WQE             7
  TUNNEL_IP_PROTO_ENTROPY_DISABLE     False(0)
  ICM_CACHE_MODE                      DEVICE_DEFAULT(0)
  TLS_OPTIMIZE                        False(0)
  TX_SCHEDULER_BURST                  0
  ZERO_TOUCH_TUNING_ENABLE            False(0)
  ROCE_CC_LEGACY_DCQCN                True(1)
  LOG_DCR_HASH_TABLE_SIZE             11
  DCR_LIFO_SIZE                       16384
  LINK_TYPE_P1                        IB(1)
  LINK_TYPE_P2                        IB(1)
  ROCE_CC_PRIO_MASK_P1                255
  ROCE_CC_PRIO_MASK_P2                255
  CLAMP_TGT_RATE_AFTER_TIME_INC_P1    True(1)
  CLAMP_TGT_RATE_P1                   False(0)
  RPG_TIME_RESET_P1                   300
  RPG_BYTE_RESET_P1                   32767
  RPG_THRESHOLD_P1                    1
  RPG_MAX_RATE_P1                     0
  RPG_AI_RATE_P1                      5
  RPG_HAI_RATE_P1                     50
  RPG_GD_P1                           11
  RPG_MIN_DEC_FAC_P1                  50
  RPG_MIN_RATE_P1                     1
  RATE_TO_SET_ON_FIRST_CNP_P1         0
  DCE_TCP_G_P1                        1019
  DCE_TCP_RTT_P1                      1
  RATE_REDUCE_MONITOR_PERIOD_P1       4
  INITIAL_ALPHA_VALUE_P1              1023
  MIN_TIME_BETWEEN_CNPS_P1            4
  CNP_802P_PRIO_P1                    6
  CNP_DSCP_P1                         48
  CLAMP_TGT_RATE_AFTER_TIME_INC_P2    True(1)
  CLAMP_TGT_RATE_P2                   False(0)
  RPG_TIME_RESET_P2                   300
  RPG_BYTE_RESET_P2                   32767
  RPG_THRESHOLD_P2                    1
  RPG_MAX_RATE_P2                     0
  RPG_AI_RATE_P2                      5
  RPG_HAI_RATE_P2                     50
  RPG_GD_P2                           11
  RPG_MIN_DEC_FAC_P2                  50
  RPG_MIN_RATE_P2                     1
  RATE_TO_SET_ON_FIRST_CNP_P2         0
  DCE_TCP_G_P2                        1019
  DCE_TCP_RTT_P2                      1
  RATE_REDUCE_MONITOR_PERIOD_P2       4
  INITIAL_ALPHA_VALUE_P2              1023
  MIN_TIME_BETWEEN_CNPS_P2            4
  CNP_802P_PRIO_P2                    6
  CNP_DSCP_P2                         48
  LLDP_NB_DCBX_P1                     False(0)
  LLDP_NB_RX_MODE_P1                  OFF(0)
  LLDP_NB_TX_MODE_P1                  OFF(0)
  LLDP_NB_DCBX_P2                     False(0)
  LLDP_NB_RX_MODE_P2                  OFF(0)
  LLDP_NB_TX_MODE_P2                  OFF(0)
  DCBX_IEEE_P1                        True(1)
  DCBX_CEE_P1                         True(1)
  DCBX_WILLING_P1                     True(1)
  DCBX_IEEE_P2                        True(1)
  DCBX_CEE_P2                         True(1)
  DCBX_WILLING_P2                     True(1)
  KEEP_ETH_LINK_UP_P1                 True(1)
  KEEP_IB_LINK_UP_P1                  False(0)
  KEEP_LINK_UP_ON_BOOT_P1             False(0)
  KEEP_LINK_UP_ON_STANDBY_P1          False(0)
  DO_NOT_CLEAR_PORT_STATS_P1          False(0)
  AUTO_POWER_SAVE_LINK_DOWN_P1        False(0)
  KEEP_ETH_LINK_UP_P2                 True(1)
  KEEP_IB_LINK_UP_P2                  False(0)
  KEEP_LINK_UP_ON_BOOT_P2             False(0)
  KEEP_LINK_UP_ON_STANDBY_P2          False(0)
  DO_NOT_CLEAR_PORT_STATS_P2          False(0)
  AUTO_POWER_SAVE_LINK_DOWN_P2        False(0)
  NUM_OF_VL_P1                        _4_VLs(3)
  NUM_OF_TC_P1                        _8_TCs(0)
  NUM_OF_PFC_P1                       8
  VL15_BUFFER_SIZE_P1                 0
  NUM_OF_VL_P2                        _4_VLs(3)
  NUM_OF_TC_P2                        _8_TCs(0)
  NUM_OF_PFC_P2                       8
  VL15_BUFFER_SIZE_P2                 0
  DUP_MAC_ACTION_P1                   LAST_CFG(0)
  MPFS_MC_LOOPBACK_DISABLE_P1         False(0)
  MPFS_UC_LOOPBACK_DISABLE_P1         False(0)
  UNKNOWN_UPLINK_MAC_FLOOD_P1         False(0)
  SRIOV_IB_ROUTING_MODE_P1            LID(1)
  IB_ROUTING_MODE_P1                  LID(1)
  DUP_MAC_ACTION_P2                   LAST_CFG(0)
  MPFS_MC_LOOPBACK_DISABLE_P2         False(0)
  MPFS_UC_LOOPBACK_DISABLE_P2         False(0)
  UNKNOWN_UPLINK_MAC_FLOOD_P2         False(0)
  SRIOV_IB_ROUTING_MODE_P2            LID(1)
  IB_ROUTING_MODE_P2                  LID(1)
  PF_TOTAL_SF                         0
  PF_SF_BAR_SIZE                      0
  PF_NUM_PF_MSIX                      63
  ROCE_CONTROL                        ROCE_ENABLE(2)
  PCI_WR_ORDERING                     per_mkey(0)
  MULTI_PORT_VHCA_EN                  False(0)
  PORT_OWNER                          True(1)
  ALLOW_RD_COUNTERS                   True(1)
  RENEG_ON_CHANGE                     True(1)
  TRACER_ENABLE                       True(1)
  IP_VER                              IPv4(0)
  BOOT_UNDI_NETWORK_WAIT              0
  UEFI_HII_EN                         True(1)
  BOOT_DBG_LOG                        False(0)
  UEFI_LOGS                           DISABLED(0)
  BOOT_VLAN                           1
  LEGACY_BOOT_PROTOCOL                PXE(1)
  BOOT_RETRY_CNT                      NONE(0)
  BOOT_INTERRUPT_DIS                  False(0)
  BOOT_LACP_DIS                       True(1)
  BOOT_VLAN_EN                        False(0)
  BOOT_PKEY                           0
  P2P_ORDERING_MODE                   DEVICE_DEFAULT(0)
  EXP_ROM_VIRTIO_NET_PXE_ENABLE       True(1)
  EXP_ROM_VIRTIO_NET_UEFI_x86_ENABLE  True(1)
  EXP_ROM_VIRTIO_BLK_UEFI_x86_ENABLE  True(1)
  EXP_ROM_NVME_UEFI_x86_ENABLE        True(1)
  ATS_ENABLED                         False(0)
  EXP_ROM_UEFI_ARM_ENABLE             True(1)
  EXP_ROM_UEFI_x86_ENABLE             True(1)
  EXP_ROM_PXE_ENABLE                  True(1)
  ADVANCED_PCI_SETTINGS               False(0)
  SAFE_MODE_THRESHOLD                 10
  SAFE_MODE_ENABLE                    True(1)

Here, we notice that the link type for both ports is set to InfiniBand.

  LINK_TYPE_P1                        IB(1)
  LINK_TYPE_P2                        IB(1)

We can then select the port configuration. Here, 1 is for InfiniBand and 2 is for Ethernet:

sudo mlxconfig -d 01:00.0 set LINK_TYPE_P1=2 LINK_TYPE_P2=2

Device #1:
----------

Device type:    BlueField2
Name:           MBF2M516A-EECO_Ax_Bx
Description:    BlueField-2 E-Series DPU 100GbE/EDR/HDR100 VPI Dual-Port QSFP56; PCIe Gen4 x16; Crypto and Secure Boot Enabled; 16GB on-board DDR; 1GbE OOB management; FHHL
Device:         01:00.0

Configurations:                              Next Boot       New
         LINK_TYPE_P1                        IB(1)           ETH(2)
         LINK_TYPE_P2                        IB(1)           ETH(2)

 Apply new Configuration? (y/n) [n] : y

Applying... Done!
-I- Please reboot machine to load new configurations.

Reboot the system for the changes to take effect.
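
After the reboot, the same query can be used to confirm that the change took effect. The helpers below are a sketch that parses the query output shown earlier. They assume the LINK_TYPE_P<n> lines keep the "name, value" layout from that listing, and the names link_types and all_ports_ethernet are invented for this example.

```shell
# Print every LINK_TYPE_P<n> setting as NAME=VALUE.
# Reads `mlxconfig ... query` output on stdin.
link_types() {
    awk '$1 ~ /^LINK_TYPE_P[0-9]+$/ { print $1 "=" $2 }'
}

# Succeed (exit 0) only if at least one port is listed and every
# listed port reports an ETH(...) link type.
all_ports_ethernet() {
    awk '$1 ~ /^LINK_TYPE_P[0-9]+$/ { n++; if ($2 !~ /^ETH/) bad++ }
         END { exit !(n > 0 && bad == 0) }'
}

# Usage:
#   sudo mlxconfig -d 01:00.0 query | link_types
#   sudo mlxconfig -d 01:00.0 query | all_ports_ethernet && echo "ports are Ethernet"
```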

To check which mode a SmartNIC is running in, use the following commands on the host:

$ mst start
$ mst status -v # identify the MST device
$ mlxconfig -d /dev/mst/mt41686_pciconf0 q | grep -i internal_cpu_model
INTERNAL_CPU_MODEL                  EMBEDDED_CPU(1)
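
For use in a script, the same check can be turned into an exit status. This is only a sketch: it assumes the value printed for the mode used in this technote starts with EMBEDDED_CPU, as in the output above, and the function name is_embedded_mode is hypothetical.

```shell
# Succeed (exit 0) if the mlxconfig query output on stdin contains
# INTERNAL_CPU_MODEL with a value starting with EMBEDDED_CPU.
is_embedded_mode() {
    awk '$1 == "INTERNAL_CPU_MODEL" && $2 ~ /^EMBEDDED_CPU/ { ok = 1 }
         END { exit !ok }'
}

# Usage:
#   mlxconfig -d /dev/mst/mt41686_pciconf0 q | is_embedded_mode \
#       && echo "embedded CPU mode"
```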

Step 04.02: Configure IP address for network interfaces.

sudo -s
cd /etc/netplan
nano bf_config.yaml

Enter the following configuration in bf_config.yaml:

# This file describes the network interfaces available on your system
# For more information, see netplan(5).
network:
  version: 2
  renderer: networkd
  ethernets:
    tmfifo_net0:
      addresses: [192.168.100.1/24]
      dhcp4: no
    enp1s0f0np0:
      mtu: 9000
      addresses: [192.168.200.80/24]
      dhcp4: no
    enp1s0f1np1:
      mtu: 9000
      addresses: [192.168.200.81/24]
      dhcp4: no
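
Netplan files are YAML, and YAML does not allow tab characters for indentation, so a stray tab from the editor is enough to make netplan reject the file. A quick check along these lines can catch that before applying the configuration. It is a sketch, and the function name yaml_has_tabs is made up for this example.

```shell
# List lines of a file that contain a tab character (with line numbers).
# Exit status is 0 if at least one tab was found, non-zero otherwise.
yaml_has_tabs() {
    # $(printf '\t') yields a literal tab in any POSIX shell
    grep -n "$(printf '\t')" "$1"
}

# Usage:
#   yaml_has_tabs /etc/netplan/bf_config.yaml && echo "replace tabs with spaces"
```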

Modify the existing 01-network-manager-all.yaml

# Let NetworkManager manage all devices on this system
network:
  version: 2
  renderer: NetworkManager

to the following configuration for eth0 and eth1:

# Let networkd manage eth0 and eth1 on this system
network:
  version: 2
  renderer: networkd
  ethernets:
    eth0:
      dhcp4: no
      addresses: [192.168.200.127/24]
    eth1:
      dhcp4: yes
      nameservers:
        addresses: [192.168.1.10]

Restart the network interface:

sudo netplan apply

Verify that Rshim is active:

sudo systemctl status rshim

● rshim.service - rshim driver for BlueField SoC
     Loaded: loaded (/lib/systemd/system/rshim.service; enabled; vendor preset: enabled)
     Active: active (running) since Sun 2023-11-05 09:35:40 +04; 7min ago
       Docs: man:rshim(8)
   Main PID: 12153 (rshim)
      Tasks: 7 (limit: 154244)
     Memory: 1.4M
        CPU: 11.663s
     CGroup: /system.slice/rshim.service
             └─12153 /usr/sbin/rshim

Nov 05 09:35:40 zeus rshim[12153]: Probing pcie-0000:01:00.2(vfio)
Nov 05 09:35:40 zeus rshim[12153]: Create rshim pcie-0000:01:00.2
Nov 05 09:35:40 zeus rshim[12153]: Failed to enable INTx
Nov 05 09:35:40 zeus rshim[12153]: rshim pcie-0000:01:00.2 enable
Nov 05 09:35:41 zeus rshim[12153]: rshim0 attached
Nov 05 09:35:41 zeus rshim[12153]: USB device detected
Nov 05 09:35:45 zeus rshim[12153]: Probing usb-1-b.3
Nov 05 09:35:45 zeus rshim[12153]: create rshim usb-1-b.3
Nov 05 09:35:45 zeus rshim[12153]: another backend already attached
Nov 05 09:35:45 zeus rshim[12153]: rshim usb-1-b.3 deleted

This command is expected to display active (running). If the RShim service does not launch automatically, run:

sudo systemctl enable rshim
sudo systemctl start rshim 
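
Later steps read and write files under /dev/rshim0, so it can help to confirm that the device directory exists once the service is running. The following is a minimal sketch. It only checks the two files that this technote uses (misc and console), and the function name rshim_ready is hypothetical.

```shell
# Succeed (exit 0) if the rshim device directory exposes the files
# used in this technote. $1 = directory, defaults to /dev/rshim0.
rshim_ready() {
    dir="${1:-/dev/rshim0}"
    [ -e "$dir/misc" ] && [ -e "$dir/console" ]
}

# Usage:
#   rshim_ready || echo "rshim0 not available, check: systemctl status rshim"
```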

Step 04.03: Configure VPI Ethernet interface

Assign an IP address and bring the ethernet network interfaces up:

sudo ifconfig enp2s0f0 192.168.253.80 netmask 255.255.255.0 up
sudo ifconfig enp2s0f1 192.168.253.81 netmask 255.255.255.0 up

Check the network link status:

ibstatus
Infiniband device 'mlx5_0' port 1 status:
	default gid:	 fe80:0000:0000:0000:0ac0:ebff:fe88:1924
	base lid:	 0x0
	sm lid:		 0x0
	state:		 4: ACTIVE
	phys state:	 5: LinkUp
	rate:		 100 Gb/sec (4X EDR)
	link_layer:	 Ethernet

Infiniband device 'mlx5_1' port 1 status:
	default gid:	 fe80:0000:0000:0000:0ac0:ebff:fe88:1925
	base lid:	 0x0
	sm lid:		 0x0
	state:		 4: ACTIVE
	phys state:	 5: LinkUp
	rate:		 100 Gb/sec (4X EDR)
	link_layer:	 Ethernet
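
The fields that matter here are the state and the link_layer of each device. As a sketch, the output can be condensed to one line per device. This assumes the ibstatus layout shown above, and the function name ib_port_summary is invented for this example.

```shell
# Condense ibstatus output (on stdin) to "<device> <state> <link_layer>".
ib_port_summary() {
    # Strip the quotes around the device name, then pick three fields.
    tr -d "'" | awk '
        $1 == "Infiniband" && $2 == "device" { dev = $3 }
        $1 == "state:"                       { state = $3 }   # not "phys state:"
        $1 == "link_layer:"                  { print dev, state, $2 }
    '
}

# Usage:
#   ibstatus | ib_port_summary
```

After the switch to Ethernet, each port should report Ethernet in the last column.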

Step 04.04: Ensure that the BlueField Mode is correctly set in the UEFI configuration.

Reset the DPU

sudo -s

echo SW_RESET 1 > /dev/rshim0/misc

You can monitor the progress of the installation by typing the following command in a separate terminal:

# to interact with the console
sudo minicom --color on --baudrate 115200 --device /dev/rshim0/console

# to monitor only
cat /dev/rshim0/console

Interrupt the boot process by pressing the ESC key twice, then log in to the UEFI menu. The default password is bluefield. Change it to a different password when logging in to the UEFI settings for the first time.

After that, ensure that you set the BlueField Mode to a valid setting:

Internal CPU Model: <Embedded>
Host Privilege Level: <Privileged>

Please note that Unavailable is an invalid setting; with it, the BlueField-2 DPU will not boot.

You will only be able to update the BlueField-2 DPU image if the BlueField Mode is set to a valid configuration.


Step 05.00: Update the BlueField-2 DPU image.

Step 05.01: Download the NVIDIA SDK Manager on the host computer.

Download the NVIDIA SDK Manager and install it on the host computer.

sudo apt install ./sdkmanager_2.0.0-11402_amd64.deb

Step 05.02: Use the NVIDIA SDK Manager to update the BlueField-2 DPU image.

Launch the SDK Manager

sdkmanager

Install the DOCA SDK with the SDK Manager

When SDK manager prompts you to flash the BlueField-2 InfiniBand/VPI DPU, set the initial user details for the DOCA OS:

Username: ubuntu
New Password: ubuntu

Step 05.03: Manually update the BlueField-2 DPU image.

Ubuntu users are prompted to change the default password (ubuntu) for the default user (ubuntu) upon first login. Logging in will not be possible even if the login prompt appears until all services are up (“DPU is ready” message appears in /dev/rshim0/misc).

Ubuntu users can provide a unique password that will be applied at the end of the BFB installation by creating a bf.cfg file which changes the default credentials. The password for the ubuntu user will be defined in the bf.cfg configuration file.

Create password hash. Run:

# openssl passwd -1
Password:
Verifying - Password:
$1$3B0RIrfX$TlHry93NFUJzg3Nya00rE1

Add the password hash in quotes to the bf.cfg file:

nano bf.cfg
ubuntu_PASSWORD='$1$3B0RIrfX$TlHry93NFUJzg3Nya00rE1'
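
The single quotes around the hash matter: the hash contains $ characters, which a shell would try to expand inside double quotes. If you prefer to generate the file from a script instead of an editor, a sketch like the following keeps the value intact. The helper name write_bf_cfg is hypothetical.

```shell
# Write a bf.cfg that sets the ubuntu user's password hash.
# $1 = password hash, $2 = output file (defaults to ./bf.cfg)
write_bf_cfg() {
    # %s inserts the hash verbatim; the single quotes are part of the file.
    printf "ubuntu_PASSWORD='%s'\n" "$1" > "${2:-bf.cfg}"
}

# Usage (pass the hash in single quotes so the shell leaves it alone):
#   write_bf_cfg '$1$3B0RIrfX$TlHry93NFUJzg3Nya00rE1' bf.cfg
```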

If the Target OS image fails to update automatically using the SDK manager, type the following command to update it manually.

The bf.cfg file will be used with the bfb-install script in the following step.

Install the BFB.

Installing the BFB does not update the firmware. To do that, refer to section Firmware Upgrade.

sudo -s
bfb-install --bfb ./DOCA_2.2.0_BSP_4.2.0_Ubuntu_22.04-2.23-07.prod.bfb --config bf.cfg --rshim rshim0

You should see the following message on the console. It will take approximately 15 minutes to write the new firmware image:

Pushing bfb
1.11GiB 0:01:34 [12.0MiB/s] [                                                                                             <=>                                                 ]
Collecting BlueField booting status. Press Ctrl+C to stop…
 INFO[BL2]: start
 INFO[BL2]: DDR POST passed
 INFO[BL2]: UEFI loaded
 INFO[BL31]: start
 INFO[BL31]: lifecycle GA Secured
 INFO[BL31]: runtime
 INFO[UEFI]: UPVS valid
 WARN[UEFI]: UPVS full
 WARN[UEFI]: UPVS reclaim
 INFO[UEFI]: eMMC init
 INFO[UEFI]: eMMC probed
 WARN[UEFI]: Var reclaim
 WARN[UEFI]: Var reclaim done
 INFO[UEFI]: PMI: updates started
 INFO[UEFI]: PMI: total updates: 1
 INFO[UEFI]: PMI: updates completed, status 0
 INFO[UEFI]: PCIe enum start
 INFO[UEFI]: PCIe enum end
 INFO[UEFI]: exit Boot Service
 INFO[MISC]: Ubuntu installation started
 INFO[MISC]: Installing OS image
 INFO[MISC]: Installation finished
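
As noted earlier, logging in is only possible once the “DPU is ready” message appears in /dev/rshim0/misc. A script can poll for that message instead of checking by hand. The sketch below takes the path and the message text from this technote and does not verify them any further. The function names, retry count and sleep interval are assumptions chosen for illustration.

```shell
# Succeed (exit 0) if the given file contains the "DPU is ready" message.
# $1 = file to read, defaults to /dev/rshim0/misc
dpu_is_ready() {
    grep -q 'DPU is ready' "${1:-/dev/rshim0/misc}"
}

# Poll until the DPU is ready or the attempts run out.
# $1 = file, $2 = max attempts (default 90), $3 = seconds between attempts (default 10)
wait_for_dpu() {
    tries="${2:-90}"
    while [ "$tries" -gt 0 ]; do
        dpu_is_ready "$1" && return 0
        tries=$((tries - 1))
        sleep "${3:-10}"
    done
    return 1
}

# Usage (as root):
#   wait_for_dpu /dev/rshim0/misc && echo "DPU is ready, login is possible"
```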

Monitor the progress of the firmware update by connecting to the DPU console.

You should see the following messages, if the DPU is being programmed correctly:

Last login: Tue Nov  7 02:48:40 UTC 2023 on hvc0
ubuntu@localhost:~$ echo "nameserver 8.8.8.8" | sudo tee /etc/resolv.conf
nameserver 8.8.8.8
[PMI] Boot image update started.
Trusted Board Boot is enabled (GA)
Note: Installed image will be filtered from 5031688 bytes to 3155832 bytes
Size check good, 3155832 <= 33030144
Verify Trusted Board Boot FW certificate.
Check boot image, Status: Success
...Filtering unneeded executables...
...Preparing boot image...
...Generating boot image...
...Writing the boot image, Size: 3155968 bytes...
...Verify boot image...
Write boot image to partition 1, Status: Success
...Filtering unneeded executables...
...Preparing boot image...
...Generating boot image...
...Writing the boot image, Size: 3155968 bytes...
...Verify boot image...
Write boot image to partition 2, Status: Success
[PMI] Boot Image update completed, Status: Success
[PMI] Total number of updates: 1
[PMI] Errors during updates  : 0
 DHCP Session Start 

Press ESC/F2/DEL twice    to enter UEFI Menu.
Press ENTER               to skip countdown.

3  seconds remain...
2  seconds remain...
1  seconds remain...
0  seconds remain...
EFI stub: Booting Linux Kernel...
EFI stub: Generating empty DTB
EFI stub: Loaded initrd from command line option
EFI stub: Exiting boot services[    8.760619] mlxbf2_gpio MLNXBF22:01: IRQ index 0 not found
[    8.782585] Synopsys Designware Multimedia Card Interface Driver
[    8.782583] mlxbf2_gpio MLNXBF22:02: IRQ index 0 not found
[    8.784488] mlxfw: loading out-of-tree module taints kernel.
[    8.784577] mlxfw: loading out-of-tree module taints kernel.
[    8.828809] dw_mmc PRP0001:00: IDMAC supports 64-bit address mode.
[    8.834332] Micrel KSZ9031 Gigabit PHY MLNXBF17:00:03: attached PHY driver (mii_bus:phy_addr=MLNXBF17:00:03, irq=74)
[    8.841304] dw_mmc PRP0001:00: Using internal DMA controller.
[    8.874029] dw_mmc PRP0001:00: Version ID is 270a
[    8.874366] Compat-mlnx-ofed backport release: 4e13edc
[    8.883525] dw_mmc PRP0001:00: DW MMC controller at irq 22,32 bit host data width,256 deep fifo
[    8.893814] Backport based on mlnx_ofed/mlnx-ofa_kernel-4.0.git 4e13edc
[    8.923088] mmc_host mmc0: Bus speed (slot 0) = 50000000Hz (slot req 400000Hz, actual 396825HZ div = 63)
[    8.924616] compat.git: mlnx_ofed/mlnx-ofa_kernel-4.0.git
[    8.978887] mlxbf_gige MLNXBF17:00 oob_net0: renamed from eth0
[    9.033847] mmc_host mmc0: Bus speed (slot 0) = 50000000Hz (slot req 50000000Hz, actual 50000000HZ div = 0)
[    9.041535] mlx5_core 0000:03:00.0: firmware version: 24.38.1002
[    9.053820] mmc0: new high speed MMC card at address 0001
[    9.065722] mlx5_core 0000:03:00.0: 252.048 Gb/s available PCIe bandwidth (16.0 GT/s PCIe x16 link)
[    9.070733] virtio_net virtio1 tmfifo_net0: renamed from eth0
[    9.077033] mmcblk0: mmc0:0001 S0J58X 59.3 GiB 
[    9.121232]  mmcblk0: p1 p2
[    9.127384] mmcblk0boot0: mmc0:0001 S0J58X 31.5 MiB 
[    9.138536] mmcblk0boot1: mmc0:0001 S0J58X 31.5 MiB 
[    9.590981] mlx5_core 0000:03:00.0: Rate limit: 127 rates are supported, range: 0Mbps to 97656Mbps
[    9.610630] mlx5_core 0000:03:00.0: E-Switch: Total vports 83, per vport: max uc(128) max mc(2048)
[    9.638455] mlx5_core 0000:03:00.0: Port module event: module 0, Cable plugged
[    9.653141] mlx5_core 0000:03:00.0: mlx5_pcie_event:295:(pid 131): PCIe slot power capability was not advertised.
[    9.672586] mlx5_core 0000:03:00.0: mlx5e: IPSec ESP acceleration enabled
[    9.687819] mlx5_core 0000:03:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0 enhanced)
[    9.834169] mlx5_core 0000:03:00.1: firmware version: 24.38.1002
[    9.846334] mlx5_core 0000:03:00.1: 252.048 Gb/s available PCIe bandwidth (16.0 GT/s PCIe x16 link)
[   10.363913] mlx5_core 0000:03:00.1: Rate limit: 127 rates are supported, range: 0Mbps to 97656Mbps
[   10.383537] mlx5_core 0000:03:00.1: E-Switch: Total vports 83, per vport: max uc(128) max mc(2048)
[   10.411552] mlx5_core 0000:03:00.1: Port module event: module 1, Cable unplugged
[   10.426573] mlx5_core 0000:03:00.1: mlx5_pcie_event:295:(pid 147): PCIe slot power capability was not advertised.
[   10.444488] mlx5_core 0000:03:00.1: mlx5e: IPSec ESP acceleration enabled
[   10.461142] mlx5_core 0000:03:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0 enhanced)
[   10.679120] mlx5_core 0000:03:00.1 p1: renamed from eth1
[   10.719535] mlx5_core 0000:03:00.0 p0: renamed from eth0
[   12.035046] raid6: neonx8   gen()  5608 MB/s
[   12.111046] raid6: neonx8   xor()  4129 MB/s
[   12.187045] raid6: neonx4   gen()  5688 MB/s
[   12.263048] raid6: neonx4   xor()  4266 MB/s
[   12.339045] raid6: neonx2   gen()  5030 MB/s
[   12.415047] raid6: neonx2   xor()  3924 MB/s
[   12.491045] raid6: neonx1   gen()  3910 MB/s
[   12.567045] raid6: neonx1   xor()  3208 MB/s
[   12.643050] raid6: int64x8  gen()  3012 MB/s
[   12.719049] raid6: int64x8  xor()  1841 MB/s
[   12.795049] raid6: int64x4  gen()  3331 MB/s
[   12.871046] raid6: int64x4  xor()  1930 MB/s
[   12.947045] raid6: int64x2  gen()  3146 MB/s
[   13.023050] raid6: int64x2  xor()  1715 MB/s
[   13.099048] raid6: int64x1  gen()  2442 MB/s
[   13.175050] raid6: int64x1  xor()  1277 MB/s
[   13.183622] raid6: using algorithm neonx4 gen() 5688 MB/s
[   13.194467] raid6: .... xor() 4266 MB/s, rmw enabled
[   13.204437] raid6: using neon recovery algorithm
[   13.215137] xor: measuring software checksum speed
[   13.226027]    8regs           :  8108 MB/sec
[   13.235803]    32regs          :  9643 MB/sec
[   13.246018]    arm64_neon      :  6708 MB/sec
[   13.254766] xor: using function: 32regs (9643 MB/sec)
[   13.265730] async_tx: api initialized (async)
[   13.318986] sdhci: Secure Digital Host Controller Interface driver
[   13.331465] sdhci: Copyright(c) Pierre Ossman
[   13.341170] sdhci-pltfm: SDHCI platform and OF driver helper
[   13.387288] Mellanox boot control driver (version 1.5)
[   13.402265] sbsa-gwdt sbsa-gwdt.0: Initialized with 10s timeout @ 200000000 Hz, action=0.
[   13.433619] 
[   13.438595] 
[   13.443644] 
[17:44:48] INFO: Ubuntu installation started
[   16.526123]  mmcblk0: p1 p2
[   17.544102]  mmcblk0: p1 p2
[17:44:52] Installing OS image
[   89.879281]  mmcblk0: p1 p2
[   89.897485] EXT4-fs (mmcblk0p2): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
[  325.641223]  mmcblk0: p1 p2
[  329.177936]  mmcblk0: p1 p2
[  332.275054]  mmcblk0: p1 p2
[17:50:10] INFO: Installation finished
[17:50:13] INFO: Rebooting...
[  341.433633] 
[  341.438486] 
[  341.443377] 
[  341.448245] 
[  341.452973] kvm: exiting hardware virtualization
[  341.476309] mlx5_core 0000:03:00.1: Shutdown was called
[  341.490135] mlx5_core 0000:03:00.0: Shutdown was called

Step 05.04: Enable NAT for the DPU to connect to the Internet

In this lab configuration, only the eth1 interface on the host is connected to the Internet, so the SmartNIC's Linux has to go through the host's Linux to reach the Internet. The referenced post explains well how to set up NAT between them, so I have just summarized the steps here.

On the host

sudo -s

ip addr add 192.168.100.1/24 dev tmfifo_net0

echo 1 | sudo tee /proc/sys/net/ipv4/ip_forward
iptables -t nat -A POSTROUTING -o eth1 -j MASQUERADE

In case IP forwarding is not working

iptables -A FORWARD -o eth1 -j ACCEPT
iptables -A FORWARD -m state --state ESTABLISHED,RELATED -i eth1 -j ACCEPT
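
The commands above hard-code eth1 as the uplink. If your host uses a different interface name, a small helper can print the same commands for review before you run them as root. This is a sketch only: the function name nat_commands and its parameters are invented for this example, and the commands it prints are the ones listed above.

```shell
# Print the host-side NAT commands for a given uplink interface.
# $1 = uplink (Internet-facing) interface, defaults to eth1
# $2 = interface towards the DPU, defaults to tmfifo_net0
nat_commands() {
    wan="${1:-eth1}"
    lan="${2:-tmfifo_net0}"
    printf 'ip addr add 192.168.100.1/24 dev %s\n' "$lan"
    printf 'echo 1 > /proc/sys/net/ipv4/ip_forward\n'
    printf 'iptables -t nat -A POSTROUTING -o %s -j MASQUERADE\n' "$wan"
    printf 'iptables -A FORWARD -o %s -j ACCEPT\n' "$wan"
    printf 'iptables -A FORWARD -m state --state ESTABLISHED,RELATED -i %s -j ACCEPT\n' "$wan"
}

# Usage: review the output first, then run it as root.
#   nat_commands enp3s0
#   nat_commands enp3s0 | sudo sh -x
```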

On the SmartNIC

echo "nameserver 8.8.8.8" | sudo tee /etc/resolv.conf

Then verify Internet access from the SmartNIC

$ ping google.com -c 2
PING google.com (74.125.21.100) 56(84) bytes of data.
64 bytes from yv-in-f100.1e100.net (74.125.21.100): icmp_seq=1 ttl=108 time=7.19 ms
64 bytes from yv-in-f100.1e100.net (74.125.21.100): icmp_seq=2 ttl=108 time=6.93 ms

Step 05.05: Manually update the target image firmware.

The other endpoint of tmfifo_net0 is in the BlueField-2 DPU, and its IPv4 address is 192.168.100.2/24, hence you can use ssh to access BlueField-2 DPU via this address.

First, we need to assign an address.

ip addr add 192.168.100.1/24 dev tmfifo_net0

ping 192.168.100.2 # check whether it is properly configured

# login to the bluefield-2 dpu
ssh ubuntu@192.168.100.2

# enter the password
ubuntu@192.168.100.2 password: ubuntu

# update the password

# update the target os image
sudo apt update; sudo apt upgrade;

To upgrade the firmware on the DPU, run:

sudo /opt/mellanox/mlnx-fw-updater/mlnx_fw_updater.pl --force-fw-update
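
The mlx5_core driver prints the running firmware version to the kernel log at boot, as seen in the console log above (firmware version: 24.38.1002). As a rough way to compare versions before and after the update, those lines can be extracted from the kernel log. This is a sketch that assumes the log line format shown in this technote. The function name mlx5_fw_versions is hypothetical, and the new version may only show up after the DPU has been restarted.

```shell
# Print "<pci-address> <firmware-version>" for each mlx5_core device.
# Reads kernel log text (for example from dmesg) on stdin.
mlx5_fw_versions() {
    sed -n 's/.*mlx5_core \([0-9a-f:.]*\): firmware version: \([0-9.]*\).*/\1 \2/p'
}

# Usage on the DPU:
#   sudo dmesg | mlx5_fw_versions
```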

Do not run a command to upgrade the target OS release. This will brick the DPU.

# warning: do not run this command !!!
# sudo do-release-upgrade

Downloads

  1. NVIDIA BlueField-2 DPU Firmware Downloads

User Guide

  1. BlueField DPUs & DOCA

  2. NVIDIA BlueField-2 InfiniBand/Ethernet DPU User Guide - Hardware Installation

  3. NVIDIA BlueField-2 Ethernet DPU User Guide - Hardware Installation

  4. Install DOCA SDK with SDK Manager

  5. NVIDIA BlueField-2 DPU Software Quick Start Guide

  6. Bring-Up and Driver Installation

  7. BlueFieldSWv36011699 - Installation and Initialization

  8. MLNXOFEDv531001 - Ethernet Interface

  9. BlueField DPU Administrator Quick Start Guide

  10. Upgrading Boot Software - NVIDIA BlueField DPU BSP v3.8.5


Technotes

  1. Changing Mellanox ConnectX VPI Ports to Ethernet or InfiniBand in Linux - STH - 20190407

  2. InfiniBand Command Examples - Oracle

  3. Infiniband Troubleshooting - Hasan Mansur

  4. 15 Useful “ifconfig” Commands to Configure Network Interface in Linux

  5. How to Enable (UP)/Disable (DOWN) Network Interface Port (NIC) in Linux?

  6. Configuring NVIDIA BlueField2 SmartNIC - 20220106

  7. Configuring NVIDIA BlueField2 SmartNIC - Insu Jang - 20220106


Tutorials

Mellanox Switches

  1. HowTo Get Started with Mellanox Switches - 20181203

NetPlan

  1. NetPlan Examples

Related Links

  1. Get Started with NVIDIA DOCA