I can see the BlueField-2 DPU in 'lspci' and the rshim service is running, but there is no rshim* in the /dev directory and I can't find tmfifo_net0

I can see the BlueField-2 DPU in ‘lspci’, and the rshim service is running, but there is no rshim* in the /dev directory. Has anyone run into a similar case and can offer some help?

Running the ifconfig command also shows no tmfifo_net0.

lspci | grep BlueField
61:00.0 Ethernet controller: Mellanox Technologies MT42822 BlueField-2 integrated ConnectX-6 Dx network controller (rev ff)
61:00.1 Ethernet controller: Mellanox Technologies MT42822 BlueField-2 integrated ConnectX-6 Dx network controller (rev ff)
61:00.2 DMA controller: Mellanox Technologies MT42822 BlueField-2 SoC Management Interface (rev ff)

systemctl status rshim
● rshim.service - rshim driver for BlueField SoC
Loaded: loaded (/lib/systemd/system/rshim.service; enabled; vendor preset:>
Active: active (running) since Tue 2023-10-31 15:54:23 GMT; 25s ago
Docs: man:rshim(8)
Process: 2609385 ExecStart=/usr/sbin/rshim $OPTIONS (code=exited, status=0/>
Main PID: 2609386 (rshim)
Tasks: 1 (limit: 154280)
Memory: 332.0K
CPU: 10ms
CGroup: /system.slice/rshim.service
└─2609386 /usr/sbin/rshim

ls /dev/r*
/dev/random /dev/rfkill /dev/rtc /dev/rtc0

ifconfig
enp1s0f0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
ether 94:6d:ae:90:2b:0a txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

enp1s0f1: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
ether 94:6d:ae:90:2b:0b txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

enp36s0f0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
ether 58:11:22:dd:51:ae txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

enp36s0f1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.200.20.42 netmask 255.255.252.0 broadcast 10.200.23.255
inet6 fe80::67ca:e3c:585:2f0a prefixlen 64 scopeid 0x20
ether 58:11:22:dd:51:af txqueuelen 1000 (Ethernet)
RX packets 18456091 bytes 9601762587 (9.6 GB)
RX errors 1 dropped 0 overruns 0 frame 1
TX packets 1625579 bytes 216502112 (216.5 MB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

enp97s0f0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
ether 94:6d:ae:9b:95:c4 txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

enp97s0f1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
ether 94:6d:ae:9b:95:c5 txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10
loop txqueuelen 1000 (Local Loopback)
RX packets 184466 bytes 26165996 (26.1 MB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 184466 bytes 26165996 (26.1 MB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

virbr0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
inet 192.168.122.1 netmask 255.255.255.0 broadcast 192.168.122.255
ether 52:54:00:8a:29:54 txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

wlp38s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.21.127.194 netmask 255.255.254.0 broadcast 10.21.127.255
inet6 fe80::6ac1:9f2c:3c6a:163d prefixlen 64 scopeid 0x20
ether 48:68:4a:4b:b9:96 txqueuelen 1000 (Ethernet)
RX packets 2695911 bytes 740098475 (740.0 MB)
RX errors 0 dropped 7206 overruns 0 frame 0
TX packets 342085 bytes 45648628 (45.6 MB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

cat 01-netcfg.yaml
network:
  version: 2
  renderer: networkd
  ethernets:
    enp36s0f0:
      dhcp4: yes
    tmfifo_net0:
      addresses: [10.21.127.3/24]
      dhcp4: no

ifconfig tmfifo_net0
tmfifo_net0: error fetching interface information: Device not found
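
For anyone reproducing this, both symptoms can be checked in one step. The sketch below is only an illustration: `check_rshim` is a made-up helper name, and its optional root-directory argument is an assumption added so the function can be pointed at a test directory instead of the real `/`.

```shell
#!/bin/sh
# Sketch: report whether the rshim device node and the tmfifo_net0
# interface exist. check_rshim and its optional root argument are
# hypothetical; they are not part of the rshim package.
check_rshim() {
    root="${1:-}"   # empty means the real filesystem root
    status=0

    # The rshim driver is expected to expose /dev/rshim0 for the first card.
    if [ -e "$root/dev/rshim0" ]; then
        echo "rshim0: present"
    else
        echo "rshim0: missing"
        status=1
    fi

    # Every network interface has an entry under /sys/class/net.
    if [ -e "$root/sys/class/net/tmfifo_net0" ]; then
        echo "tmfifo_net0: present"
    else
        echo "tmfifo_net0: missing"
        status=1
    fi

    return "$status"
}
```

Calling `check_rshim` with no argument checks the running system; a non-zero exit status means at least one of the two is missing.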

hi

Does your BF2 card have a BMC on it? If it does, the default RShim port should be on the BMC side. You can disable rshim on the BMC side, and then the host side can see the RShim.

Thank you
Meng, Shi

Hi, thank you for your reply. Could you please tell me how to check whether my card has a BMC on it? Currently, I use systemctl status rshim to check the status of rshim.

I just read the instructions in "Connecting to BMC Interfaces" (NVIDIA Docs).

It says, “The BMC interface eth0 is the management interface, so any information displayed by “ifconfig eth0” pertains to the management interface. The MAC address to be used for eth0 is pre-programmed in the BMC FRU EEPROM. By default, the IP address used for eth0 is acquired via DHCP but can be configured differently.”

But I don’t have “eth0”.

Hi, the tmfifo_net0 network interface appeared after I disconnected and reconnected power. Now the rshim messages show “rshim0 read_rshim error 0”. Do you have any suggestions?

…After I restarted rshim.service, tmfifo_net0 disappeared again!

hi

Please check the PN of the BF2 (lspci -vvvvv) to get the detailed type of the card.
If you did not change any SW config but see different behavior, please check the HW installation first. Some types of BF2 need external power (like the 100G models):
https://docs.nvidia.com/networking/display/bluefield2dpuenug/hardware+installation

For the PN details (whether it has a BMC, speed, etc.), please check this doc:
https://docs.nvidia.com/networking/display/bluefield2dpuenug

If you still have questions after this, I suggest you contact networking-support@nvidia.com for further help.

Thank you
Meng, Shi

61:00.0 Ethernet controller: Mellanox Technologies MT42822 BlueField-2 integrated ConnectX-6 Dx network controller (rev ff) (prog-if ff)
!!! Unknown header type 7f
Kernel driver in use: mlx5_core
Kernel modules: mlx5_core

61:00.1 Ethernet controller: Mellanox Technologies MT42822 BlueField-2 integrated ConnectX-6 Dx network controller (rev ff) (prog-if ff)
!!! Unknown header type 7f
Kernel driver in use: mlx5_core
Kernel modules: mlx5_core

61:00.2 DMA controller: Mellanox Technologies MT42822 BlueField-2 SoC Management Interface (rev ff) (prog-if ff)
!!! Unknown header type 7f
Kernel driver in use: vfio-pci
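
A note on reading this output: "(rev ff)" together with "Unknown header type 7f" usually means that reads of the PCI configuration space are returning all ones, i.e. the device is listed but is not answering on the bus. That would be consistent with a power or thermal problem, though it does not prove one. A minimal sketch for spotting this state, where `find_unresponsive` is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: read `lspci` output on stdin and print the PCI address of every
# BlueField function that reports "(rev ff)". find_unresponsive is a
# hypothetical helper, not an NVIDIA or pciutils tool.
find_unresponsive() {
    # $1 is the bus:device.function field at the start of each lspci line.
    awk '/BlueField/ && /\(rev ff\)/ { print $1 }'
}
```

Usage: `lspci | find_unresponsive`. Empty output means no BlueField function is currently showing revision ff.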

@ziji.chen
I have the same problem. Have you solved it? If so, could you share your solution?

Hi, I had a bit of an issue with my card overheating, so I added some fans to my workstation. It seems to be working well now.

Overheating of the card can cause a lot of uncontrollable problems, so I recommend checking the dmesg log after the machine has been running for a while.
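
As a rough sketch of that dmesg check, the filter below keeps only kernel log lines that mention a few keywords. The keyword list is my own guess at what is worth looking at, not a list of known messages from the driver, and `filter_dmesg` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: keep kernel log lines that may relate to the card or to heat.
# The keywords are an assumption; adjust them to what your system logs.
filter_dmesg() {
    grep -iE 'temperature|thermal|overheat|mlx5_core|rshim'
}
```

Usage: `sudo dmesg -T | filter_dmesg`. Note that grep exits with status 1 when nothing matches, which here just means none of the keywords appeared in the log.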

Thank you for your advice, ziji.chen
I’ve finally got tmfifo_net0 and /dev/rshim0, and logged in to the DPU successfully.

Thank you
daikiti