We are encountering an issue with our Jetson Xavier NX on a custom carrier board using a kernel version based on L4T 32.4.3.
We are running an application that streams video from 3 cameras over a network using RTSP. Recently we changed the RTSP handling slightly: our application now connects to a docker container running rtsp-simple-server (GitHub: bluenviron/mediamtx), and our client application connects over the network to that container. Previously there was no docker container, so the client connected directly to the application.

What we are seeing now is that after a couple of hours of streaming, there are several kernel messages about the TX ring buffer queue being full, and some time after that all communication over the ethernet interface is lost. There is no ping to or from the machine, although ifconfig still shows an IP address. The output of ‘ethtool -S eth0’ also shows that the tx and rx packet counters are no longer increasing. I can recover the eth0 interface by bringing it down and up again using ifconfig.
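To pin down exactly when the interface stalls, so the time can be matched against the kernel messages, something like the following could be left running. It is only a rough sketch: it assumes the standard sysfs counters under /sys/class/net/eth0/statistics are available on this kernel, and that the streams keep the interface busy enough that two identical samples in a row really mean a stall. The function names and the 5 second interval are made up for illustration.

```python
import time
from pathlib import Path


def read_counters(iface, root="/sys/class/net"):
    """Return (tx_packets, rx_packets) read from the sysfs statistics files."""
    stats = Path(root) / iface / "statistics"
    tx = int((stats / "tx_packets").read_text())
    rx = int((stats / "rx_packets").read_text())
    return (tx, rx)


def is_stalled(prev, curr):
    """True when neither the TX nor the RX counter advanced between samples."""
    return curr[0] == prev[0] and curr[1] == prev[1]


def watch(iface="eth0", interval=5.0, max_samples=None, root="/sys/class/net"):
    """Poll the counters and return the wall-clock time of the first stall.

    Returns None if max_samples samples pass without a stall. With
    max_samples=None the loop runs until a stall is seen.
    """
    prev = read_counters(iface, root)
    samples = 0
    while max_samples is None or samples < max_samples:
        time.sleep(interval)
        curr = read_counters(iface, root)
        if is_stalled(prev, curr):
            return time.time()
        prev = curr
        samples += 1
    return None
```

Calling watch("eth0") blocks until both counters stop moving and returns a timestamp, which can then be compared with the time of the TX ring buffer messages in dmesg.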
I’m not sure if any of this information is useful, but: I have tried changing the size of the TX buffer using ethtool, but it seems the eqos driver doesn’t have an interface for doing this. I can also see that the eqos driver has 8 buffers but only one is ever used (index 0). We have also tried running rtsp-simple-server outside of the docker container and see the same results.
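For checking which counters (if any) still move once the interface is in the bad state, two ‘ethtool -S eth0’ dumps could be compared with something like the sketch below. It assumes the usual ethtool output layout of one "name: value" pair per line after a "NIC statistics:" header. The helper names are hypothetical, and I am not assuming any particular counter names from the eqos driver.

```python
import subprocess


def parse_ethtool_stats(text):
    """Parse 'ethtool -S' output into a dict of counter name -> int value."""
    stats = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        name, _, value = line.rpartition(":")
        value = value.strip()
        # The header line ("NIC statistics:") has no value, so it is skipped.
        if value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def changed_counters(before, after):
    """Return {name: delta} for every counter whose value differs."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] != before[name]
    }


def ethtool_stats(iface="eth0"):
    """Run 'ethtool -S <iface>' and return the parsed counters."""
    out = subprocess.run(
        ["ethtool", "-S", iface],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_ethtool_stats(out)
```

Taking one ethtool_stats("eth0") snapshot, waiting a few seconds, taking another and passing both to changed_counters should return an empty dict if the interface is completely dead, or show which queue or error counters are still ticking.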
Even if the TX buffer is genuinely full, I wouldn’t expect this to completely kill all connectivity over the ethernet interface, so what’s going on? Any ideas on how to troubleshoot this further?