Jetson TX2 UDP microbursts

Description:
I’m sending microbursts of UDP packets, up to 8MB per burst at 2Hz, from a Windows PC to a Jetson TX2,
and about half of the packets are lost (they don’t show up even in Wireshark).

Some tests I have done:

1- I sent the same packets to another client and all the data arrived, so I know that the receiver is the problem here.
2- Sent every UDP fragment (~60KB) with a 2ms delay after the previous one: 97% of the data arrives. From this I conclude that the problem comes from the microbursts, and that some buffer on the receiver side (which should absorb the momentary bandwidth peak) overflows.
3- Increased SO_RCVBUF size to 40MB: no change.
4- Increased the following sysctl settings to values bigger than their defaults:
o /proc/sys/net/core/netdev_max_backlog to 5000
o /proc/sys/net/core/rmem_default to 40MB
o /proc/sys/net/core/wmem_default to 40MB
o /proc/sys/net/core/rmem_max to 40MB
o /proc/sys/net/core/wmem_max to 40MB
o /proc/sys/net/core/netdev_budget to 1000

None of these modifications in sysctl solved the problem either.
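
One quick sanity check for tests 3 and 4 is to read the buffer size back after setting it. On Linux, a SO_RCVBUF request is silently capped at net.core.rmem_max, and the value reported back is double what was granted, to account for kernel bookkeeping overhead. A minimal Python sketch, assuming a Linux receiver (the helper name request_rcvbuf is made up for illustration; the 40MB figure is the one used above):

```python
import socket

def request_rcvbuf(sock, requested_bytes):
    """Ask for a receive buffer and return what the kernel reports afterwards."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested_bytes)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)

if __name__ == "__main__":
    requested = 40 * 1024 * 1024  # 40MB, as in test 3
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        reported = request_rcvbuf(sock, requested)
    print(f"requested {requested} bytes, kernel reports {reported} bytes")
    if reported < requested:
        # The request was capped, most likely by net.core.rmem_max.
        print("request was capped; check net.core.rmem_max")
```

If the reported size is at least what was requested and packets still go missing, the socket buffer is probably not where the loss happens.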

5- When running “sysctl -a | grep rx” I see two overflow indicators that increase each time I run the sender again:
o mmc_rx_fifo_overflow: 17263
o rx_buf_unavailable_irq_n[0]: 16883098

6- ifconfig eth0 shows 0 RX errors and a low number of dropped packets, which led me to think that the drop occurs below the OS layer, so I tried number 7:
7- Tried to enlarge the RX_Ring_Size, but the Nvidia driver (EQOS) doesn’t support it; see the forum thread “Increase receive buffer size - Jetson & Embedded Systems / Jetson TX2 - NVIDIA Developer Forums”.
8- Out of curiosity and for lack of new ideas, I increased the MTU on both sides to the maximum (9014) and saw a substantial improvement (I don’t know why, since the receive buffers need the same amount of memory), but I’m still losing packets.
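
One possible explanation for the MTU result in test 8, offered as a guess rather than a confirmed diagnosis: receive rings are usually sized in descriptors (one per Ethernet frame), not in bytes, so fewer and larger frames use fewer descriptors per burst. The Python sketch below counts IPv4 fragments for the ~60KB datagrams and 8MB bursts described above. It assumes a 20-byte IPv4 header without options, an 8-byte UDP header, and that the 9014 setting corresponds to a 9000-byte IP MTU; the function names are illustrative.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
UDP_HEADER = 8    # bytes

def fragments_per_datagram(udp_payload, ip_mtu):
    """Number of IPv4 fragments (Ethernet frames) needed for one UDP datagram."""
    # Every fragment except the last must carry a multiple of 8 bytes.
    per_fragment = (ip_mtu - IPV4_HEADER) // 8 * 8
    ip_payload = udp_payload + UDP_HEADER
    return -(-ip_payload // per_fragment)  # ceiling division

def frames_per_burst(burst_bytes, udp_payload, ip_mtu):
    """Frames on the wire for a burst split into datagrams of udp_payload bytes."""
    full, rest = divmod(burst_bytes, udp_payload)
    frames = full * fragments_per_datagram(udp_payload, ip_mtu)
    if rest:
        frames += fragments_per_datagram(rest, ip_mtu)
    return frames

if __name__ == "__main__":
    burst = 8 * 1024 * 1024  # 8MB burst
    datagram = 60_000        # ~60KB datagram
    for mtu in (1500, 9000):
        print(mtu, fragments_per_datagram(datagram, mtu),
              frames_per_burst(burst, datagram, mtu))
```

Under these assumptions a 60,000-byte datagram needs 41 frames at MTU 1500 but only 7 at MTU 9000, and a single lost fragment discards the whole datagram.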

I would be glad to hear any new ideas on how to proceed with changing RX_Ring_Size, or any other lead you can suggest.

Environment:
Jetson TX2
Operating System: L4T 32.4.4
Jetpack 4.4.1
Ethernet driver: EQOS
Baremetal or Container (if container which image + tag): Baremetal

Hi,

Is this issue reproducible with TX2 devkit and rel-32.7.3?

I don’t know for sure; I reproduced it with a Jetson TX2 production module (not the devkit) and L4T release 32.4.4.

Hello, I have been trying to get reliable logging of UDP traffic on the TX2 for almost a year now. My use case is simply logging all incoming traffic into a pcap file. I tried everything you mentioned without getting stable, reproducible results: sometimes my solution logged all the traffic, other times the data stream arrived in bursts as you described. (My data stream is around 6MB per second; in the problematic scenario there was a gap of 2-3 seconds followed by a burst at 80MB/s that the system could not handle by any means.)

Even weirder, when I logged the data during a desktop session, the data stream seemed much more reliable than on a headless system, which did not make any sense to me.

A couple of days ago I came across information suggesting that this may have something to do with ARM power management, although I haven’t noticed excessive CPU usage on any of the cores.

My latest solution enables jetson_clocks and the weird behavior seems to disappear, which supports the power management hypothesis from my perspective.
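
To check the power management hypothesis on a given board, one option is to compare the CPU frequency governor and current clock per core before and after running jetson_clocks. The Python sketch below reads the standard Linux cpufreq sysfs files; whether every core exposes them on a particular L4T release is an assumption, and the function name read_cpufreq is made up for illustration.

```python
from pathlib import Path

def read_cpufreq(root="/sys/devices/system/cpu"):
    """Return {cpu_name: (governor, current_khz)} for cores that expose cpufreq."""
    result = {}
    for cpu_dir in sorted(Path(root).glob("cpu[0-9]*")):
        freq_dir = cpu_dir / "cpufreq"
        try:
            governor = (freq_dir / "scaling_governor").read_text().strip()
            cur_khz = int((freq_dir / "scaling_cur_freq").read_text())
        except (OSError, ValueError):
            continue  # core offline or cpufreq not exposed for it
        result[cpu_dir.name] = (governor, cur_khz)
    return result

if __name__ == "__main__":
    for cpu, (governor, khz) in read_cpufreq().items():
        print(f"{cpu}: governor={governor}, current={khz / 1000:.0f} MHz")
```

Running it once on the headless system and once in a desktop session would also show whether the two start from different clock states.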
