OOB Ethernet issue on four BF-2 cards

Hi,

I have four BF-2 cards (MBF2M516A-EENOT) whose OOB ports stopped working. The first three failed, and then the last one did too. After much troubleshooting, I decided to install bf-bundle-3.0.0-135_25.04_ubuntu-22.04_prod.bfb.

After the reboot, I set the password and hostname:

root@n017-dpu2:~# dmesg | grep oob
[ 9.154002] mlxbf_gige MLNXBF17:00 oob_net0: renamed from eth0
[ 20.141585] mlxbf_gige MLNXBF17:00 oob_net0: Link is Down
[ 23.286883] mlxbf_gige MLNXBF17:00 oob_net0: Link is Up - 1Gbps/Full - flow control off
[ 23.286924] IPv6: ADDRCONF(NETDEV_CHANGE): oob_net0: link becomes ready

I configured oob_net0 as follows:

oob_net0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 10.128.1.21 netmask 255.255.254.0 broadcast 10.128.1.255
        inet6 fe80::bace:ZZZZ:XXXX:YYCC prefixlen 64 scopeid 0x20
        ether b8:ce:f6:XX:YY:CC txqueuelen 1000 (Ethernet)
        RX packets 4731 bytes 284688 (284.6 KB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 1256 bytes 65985 (65.9 KB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

root@n017-dpu2:~# netstat -r
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
default 192.168.100.1 0.0.0.0 UG 0 0 0 tmfifo_net0
10.128.0.0 0.0.0.0 255.255.254.0 U 0 0 0 oob_net0
192.168.100.0 0.0.0.0 255.255.255.252 U 0 0 0 tmfifo_net0

I still can’t get oob_net0 working again.

I’ve swapped ports on the switch, replaced cables, and so on. Any clues as to what is going on?

I tried changing the default route in /etc/netplan/50-cloud-init.yaml to go via 10.128.1.254 (the gateway on the 10.128.0.0/23 network). The original routes entry was:

        routes:
        -   metric: 1025
            to: 0.0.0.0/0
            via: 192.168.100.1

root@n017-dpu2:~# cat /etc/netplan/50-cloud-init.yaml

# This file is generated from information provided by the datasource. Changes
# to it will not persist across an instance reboot. To disable cloud-init’s
# network configuration capabilities, write a file
# /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg with the following:
# network: {config: disabled}
network:
    ethernets:
        oob_net0:
            dhcp4: false
            addresses:
            - 10.128.1.21/23
            nameservers:
                addresses:
                - 10.128.0.3
        tmfifo_net0:
            addresses:
            - 192.168.100.2/30
            dhcp4: false
            nameservers:
                addresses:
                - 192.168.100.1
            routes:
            -   metric: 1025
                to: 0.0.0.0/0
#               via: 192.168.100.1
                via: 10.128.1.254
    renderer: NetworkManager
    version: 2

I changed the default route’s via to 10.128.1.254, but no luck.

Any suggestions?

Hi!

You mentioned it’s “not operating again”. What exactly fails?

Is it not pingable from another host?
Can it not reach out to other nodes?

Based on your routing table:
Destination Gateway Genmask Flags Iface
default 192.168.100.1 0.0.0.0 UG tmfifo_net0
10.128.0.0 0.0.0.0 255.255.254.0 U oob_net0

There’s no default route via oob_net0, only a route for 10.128.0.0/23, which means you can only reach addresses in 10.128.0.0-10.128.1.255 over oob_net0.
If you’re trying to use it for external access (or management plane) and your targets are outside 10.128.0.0/23, you’ll need either:
• A default route via oob_net0
• Or specific routes for the desired destinations.
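To make the subnet reasoning concrete, here is a minimal Python sketch using only the standard `ipaddress` module. The addresses are copied from the outputs in this thread; the helper name `on_link` is just an illustrative choice, not part of any tool.

```python
import ipaddress

# Interface addresses copied from the ifconfig/netplan output in this thread.
oob_net = ipaddress.ip_interface("10.128.1.21/23").network
tmfifo_net = ipaddress.ip_interface("192.168.100.2/30").network


def on_link(address, network):
    """Return True if address lies inside network (reachable without a gateway)."""
    return ipaddress.ip_address(address) in network


# First and last address covered by the oob_net0 connected route.
print(oob_net, oob_net[0], oob_net[-1])

# Which candidates are directly reachable on each interface?
for candidate in ("10.128.1.254", "10.128.0.3", "8.8.8.8"):
    print(candidate,
          "oob_net0:", on_link(candidate, oob_net),
          "tmfifo_net0:", on_link(candidate, tmfifo_net))
```

Note that 10.128.1.254 falls inside the oob_net0 subnet but not inside 192.168.100.0/30. If the routes block in the pasted netplan file sits under tmfifo_net0, as its position suggests, a default route via 10.128.1.254 there points at a gateway that interface cannot reach directly; it would normally belong under oob_net0 instead.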

Ping the gateway on the oob_net0 subnet, if one exists (10.128.1.254, judging by your netplan edit):
ping -I oob_net0 10.128.1.254
If there is no gateway, test other peers in 10.128.0.0/23.

Check ARP table:
ip neigh show dev oob_net0
See if you’re resolving neighbors properly.

Test outbound connectivity:
ping -I oob_net0 10.128.1.X
ping -I oob_net0 8.8.8.8
(if there’s a route via oob_net0 to the destination)

Check if something’s blocking traffic (iptables/nftables):
iptables -L -v -n
nft list ruleset

See if any traffic is hitting the interface at all:
tcpdump -i oob_net0 -nn -v
Watch for inbound and outbound packets.

While a ping is running:
tcpdump -i oob_net0 -nn icmp
Should see ICMP echo requests and replies if working.

ARP activity:
tcpdump -i oob_net0 -nn arp

If you’re trying to reach 10.128.1.5, for example:
tcpdump -i oob_net0 -nn host 10.128.1.5

Then check which route the kernel selects for that address:
ip route get 10.128.1.5
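As a rough illustration of what that lookup decides, the sketch below applies longest-prefix matching to the three routes from the `netstat -r` output above. It is a simplification under stated assumptions: it ignores metrics, policy routing, and source-address selection, and `pick_route` is a hypothetical helper, not an iproute2 function.

```python
import ipaddress

# Routing table copied from the `netstat -r` output earlier in the thread:
# (destination network, gateway or None for a connected route, interface)
ROUTES = [
    ("0.0.0.0/0", "192.168.100.1", "tmfifo_net0"),
    ("10.128.0.0/23", None, "oob_net0"),
    ("192.168.100.0/30", None, "tmfifo_net0"),
]


def pick_route(destination, routes=ROUTES):
    """Return (network, gateway, interface) for the longest matching prefix."""
    dest = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(net), gw, dev)
        for net, gw, dev in routes
        if dest in ipaddress.ip_network(net)
    ]
    if not matches:
        return None
    # The most specific route (largest prefix length) wins.
    return max(matches, key=lambda m: m[0].prefixlen)


for target in ("10.128.1.5", "8.8.8.8"):
    net, gw, dev = pick_route(target)
    print(f"{target}: route {net} dev {dev} via {gw or 'on-link'}")
```

With this table, 10.128.1.5 is matched by the connected /23 route on oob_net0, while anything outside the two connected subnets, such as 8.8.8.8, falls through to the default route on tmfifo_net0.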