rshim shows "another backend already attached", and tmfifo_net0 cannot be found

I have a BlueField-2 DPU installed in my server, but when I start rshim, it logs “another backend already attached”, and the tmfifo_net0 interface does not appear. The DPU is the only PCIe device I have attached. What could be wrong?

Below is the output of systemctl status rshim:

shujunyi@poweredge0-PowerEdge-R740:~$ sudo systemctl status rshim

rshim.service - rshim driver for BlueField SoC

Loaded: loaded (/lib/systemd/system/rshim.service; enabled; vendor preset: enabled)

Active: active (running) since Mon 2021-09-27 19:17:28 CST; 2min 31s ago

Docs: man:rshim(8)

Process: 2979 ExecStart=/usr/sbin/rshim $OPTIONS (code=exited, status=0/SUCCESS)

Main PID: 3043 (rshim)

Tasks: 2 (limit: 6143)

CGroup: /system.slice/rshim.service

└─3043 /usr/sbin/rshim

Sep 27 19:17:28 poweredge0-PowerEdge-R740 systemd[1]: Starting rshim driver for BlueField SoC...

Sep 27 19:17:28 poweredge0-PowerEdge-R740 systemd[1]: Started rshim driver for BlueField SoC.

Sep 27 19:17:29 poweredge0-PowerEdge-R740 rshim[3043]: Probing pcie-0000:5e:00.2

Sep 27 19:17:29 poweredge0-PowerEdge-R740 rshim[3043]: create rshim pcie-0000:5e:00.2

Sep 27 19:17:29 poweredge0-PowerEdge-R740 rshim[3043]: another backend already attached

Output of ifconfig:

br-cd4cb1507b28: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500

inet 172.18.0.1 netmask 255.255.0.0 broadcast 172.18.255.255

inet6 fe80::42:19ff:fe86:48ad prefixlen 64 scopeid 0x20

ether 02:42:19:86:48:ad txqueuelen 0 (Ethernet)

RX packets 0 bytes 0 (0.0 B)

RX errors 0 dropped 0 overruns 0 frame 0

TX packets 4 bytes 386 (386.0 B)

TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

docker0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500

inet 172.17.0.1 netmask 255.255.0.0 broadcast 172.17.255.255

ether 02:42:09:de:d3:e9 txqueuelen 0 (Ethernet)

RX packets 0 bytes 0 (0.0 B)

RX errors 0 dropped 0 overruns 0 frame 0

TX packets 0 bytes 0 (0.0 B)

TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

eno1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500

inet 162.105.16.151 netmask 255.255.255.0 broadcast 162.105.16.255

inet6 2001:da8:201:1016:2eea:7fff:feee:4208 prefixlen 64 scopeid 0x0

inet6 fe80::2eea:7fff:feee:4208 prefixlen 64 scopeid 0x20

ether 2c:ea:7f:ee:42:08 txqueuelen 1000 (Ethernet)

RX packets 108289 bytes 9393882 (9.3 MB)

RX errors 0 dropped 0 overruns 0 frame 0

TX packets 451 bytes 87520 (87.5 KB)

TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

device interrupt 75

eno2: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500

ether 2c:ea:7f:ee:42:09 txqueuelen 1000 (Ethernet)

RX packets 0 bytes 0 (0.0 B)

RX errors 0 dropped 0 overruns 0 frame 0

TX packets 0 bytes 0 (0.0 B)

TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

device interrupt 77

eno3: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500

ether 2c:ea:7f:ee:42:0a txqueuelen 1000 (Ethernet)

RX packets 0 bytes 0 (0.0 B)

RX errors 0 dropped 0 overruns 0 frame 0

TX packets 0 bytes 0 (0.0 B)

TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

device interrupt 78

eno4: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500

ether 2c:ea:7f:ee:42:0b txqueuelen 1000 (Ethernet)

RX packets 0 bytes 0 (0.0 B)

RX errors 0 dropped 0 overruns 0 frame 0

TX packets 0 bytes 0 (0.0 B)

TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

device interrupt 80

enp94s0f0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500

inet 192.168.1.1 netmask 255.255.0.0 broadcast 192.168.255.255

inet6 fe80::ac0:ebff:fe2c:d78c prefixlen 64 scopeid 0x20

ether 08:c0:eb:2c:d7:8c txqueuelen 1000 (Ethernet)

RX packets 0 bytes 0 (0.0 B)

RX errors 0 dropped 0 overruns 0 frame 0

TX packets 74 bytes 7997 (7.9 KB)

TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

enp94s0f1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500

inet 192.168.1.2 netmask 255.255.0.0 broadcast 192.168.255.255

inet6 fe80::ac0:ebff:fe2c:d78d prefixlen 64 scopeid 0x20

ether 08:c0:eb:2c:d7:8d txqueuelen 1000 (Ethernet)

RX packets 0 bytes 0 (0.0 B)

RX errors 0 dropped 0 overruns 0 frame 0

TX packets 60 bytes 7157 (7.1 KB)

TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536

inet 127.0.0.1 netmask 255.0.0.0

inet6 ::1 prefixlen 128 scopeid 0x10

loop txqueuelen 1000 (Local Loopback)

RX packets 164 bytes 12672 (12.6 KB)

RX errors 0 dropped 0 overruns 0 frame 0

TX packets 164 bytes 12672 (12.6 KB)

TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

veth951e6ee: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500

ether 9e:ec:7c:45:b0:15 txqueuelen 0 (Ethernet)

RX packets 0 bytes 0 (0.0 B)

RX errors 0 dropped 0 overruns 0 frame 0

TX packets 0 bytes 0 (0.0 B)

TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

vethd0dd2f0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500

ether 72:be:4e:92:00:59 txqueuelen 0 (Ethernet)

RX packets 0 bytes 0 (0.0 B)

RX errors 0 dropped 0 overruns 0 frame 0

TX packets 0 bytes 0 (0.0 B)

TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

vethdd6b52a: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500

ether c6:2d:ec:c5:69:85 txqueuelen 0 (Ethernet)

RX packets 0 bytes 0 (0.0 B)

RX errors 0 dropped 0 overruns 0 frame 0

TX packets 0 bytes 0 (0.0 B)

TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

shujunyi@poweredge0-PowerEdge-R740:~$ lspci | grep nox

5e:00.0 Ethernet controller: Mellanox Technologies MT42822 BlueField-2 integrated ConnectX-6 Dx network controller (rev 01)

5e:00.1 Ethernet controller: Mellanox Technologies MT42822 BlueField-2 integrated ConnectX-6 Dx network controller (rev 01)

5e:00.2 DMA controller: Mellanox Technologies MT42822 BlueField-2 SoC Management Interface (rev 01)

Hello Junyi Shu,

Thanks for your question.

The DPU console can be accessed over the PCIe interface as well as over a USB cable, which may be connected to another server. Whichever rshim driver attaches first gets access to the DPU. If you then start a second driver, over either USB or PCIe, you will see the “another backend already attached” message in its logs. In other words, only the first driver that attached to the device can be used to access the DPU.

Best Regards,

Anatoly

I figured out how to reproduce it.

When I reboot the server without power cycling it, rshim does not seem to terminate properly.

So when the server comes back up, the rshim service cannot attach to the DPU, because it thinks another driver is already attached when there is actually none.
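To confirm that the service has ended up in this state after a reboot, one option is to search its log for the message. This is only a minimal sketch: the function name has_backend_conflict is made up for illustration, and the match string is copied from the systemctl status output in the first post.

```shell
#!/bin/sh
# has_backend_conflict is a hypothetical helper name, not part of rshim.
# It reads log text on stdin and exits 0 if rshim reported that another
# backend was already attached, non-zero otherwise.
has_backend_conflict() {
    # -F: match the message as a fixed string; -q: no output, status only.
    grep -q -F "another backend already attached"
}
```

A possible way to use it, assuming the service logs to the systemd journal as in the output above, is: journalctl -u rshim -b --no-pager | has_backend_conflict && echo "rshim did not attach"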

For now, my workaround is:

  1. Whenever I reboot the server, I stop rshim first.

  2. If someone reboots the server without doing so, I power cycle it.

It works, but it would be good to fix this in rshim.
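The first step of the workaround could be wrapped in a small script so that it is not forgotten. This is a rough sketch, not something shipped with rshim: the names run, reboot_with_rshim_stopped and DRY_RUN are invented here, and DRY_RUN defaults to 1 so that nothing is stopped or rebooted unless it is explicitly set to 0.

```shell
#!/bin/sh
# Hypothetical helper for the workaround: stop rshim, then reboot.
# DRY_RUN defaults to 1, so by default the commands are only printed.
# Set DRY_RUN=0 and run as root to execute them for real.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Dry run: show the command instead of executing it.
        echo "would run: $*"
    else
        "$@"
    fi
}

reboot_with_rshim_stopped() {
    # Stop the rshim service first; give up if that fails.
    run systemctl stop rshim || return 1
    # Only then reboot the host.
    run systemctl reboot
}
```

With the defaults, calling reboot_with_rshim_stopped only prints the two commands it would run, which makes it safe to try before using it with DRY_RUN=0.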