I’m sending microbursts of UDP packets, up to 8MB per burst at 2Hz, from another PC (Windows) to a Jetson TX2,
and about half of the packets get lost (I don’t see them even in Wireshark).
Some tests I have done:
1- I sent the same packets to another client and all the data arrived, so I know that the receiver is the problem here.
2- Sent every UDP fragment (~60KB) with a 2ms delay after the previous one: 97% of the data arrives. From this I conclude that the problem is caused by the microbursts, and that some buffer on the receiver side (which should absorb the momentary bandwidth peak) overflows.
3- Increased SO_RCVBUF size to 40MB: no change
4- Increased the following sysctl settings to values larger than their defaults:
o /proc/sys/net/core/netdev_max_backlog to 5000
o /proc/sys/net/core/rmem_default to 40MB
o /proc/sys/net/core/wmem_default to 40MB
o /proc/sys/net/core/rmem_max to 40MB
o /proc/sys/net/core/wmem_max to 40MB
o /proc/sys/net/core/netdev_budget to 1000
None of these sysctl changes solved the problem either.
5- When running “sysctl -a | grep rx” I see two overflow counters that increase each time I run the sender again:
o mmc_rx_fifo_overflow: 17263
o rx_buf_unavailable_irq_n: 16883098
6- ifconfig eth0 shows 0 RX errors and a low number of dropped packets, which led me to think that the drop occurs before the OS layer, so I tried number 7:
7- Tried to increase RX_Ring_Size, but the NVIDIA driver (EQOS) doesn’t support it; see the thread “Increase receive buffer size” (Jetson & Embedded Systems / Jetson TX2, NVIDIA Developer Forums).
8- Out of curiosity and for lack of new ideas, I increased the MTU on both sides to the maximum (9014) and saw a substantial improvement (I don’t know why, since the receive buffers need the same amount of memory), but I am still losing packets.
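One possible explanation for the improvement in 8, not confirmed for the EQOS driver: an RX ring is sized in descriptors (frames), not in bytes, so what matters during a burst is how many Ethernet frames arrive back to back. The rough sketch below counts the IPv4 fragments needed for one 8MB burst of ~60KB datagrams. The assumptions are mine: 60KB is taken as 60*1024 bytes, 8MB as 8*1024*1024 bytes, the 9014 setting is treated as an IP MTU of 9000 (9014 minus the 14-byte Ethernet header), and each frame is assumed to use one RX descriptor.

```python
import math

IP_HEADER = 20   # bytes, IPv4 header without options
UDP_HEADER = 8   # bytes

def frames_per_datagram(payload_bytes, ip_mtu):
    """Number of IPv4 fragments (Ethernet frames) for one UDP datagram."""
    # Non-final fragments carry a multiple of 8 bytes of IP payload.
    per_fragment = (ip_mtu - IP_HEADER) // 8 * 8
    return math.ceil((payload_bytes + UDP_HEADER) / per_fragment)

def frames_per_burst(burst_bytes, datagram_bytes, ip_mtu):
    """Total frames for a burst split into datagrams of datagram_bytes."""
    full, rest = divmod(burst_bytes, datagram_bytes)
    frames = full * frames_per_datagram(datagram_bytes, ip_mtu)
    if rest:
        frames += frames_per_datagram(rest, ip_mtu)
    return frames

if __name__ == "__main__":
    burst = 8 * 1024 * 1024      # assumed size of one burst
    datagram = 60 * 1024         # assumed size of one UDP datagram
    for mtu in (1500, 9000):
        print(mtu, frames_per_burst(burst, datagram, mtu))
```

Under these assumptions a burst is about six times fewer frames with jumbo frames than with the default MTU, which would put less pressure on a fixed-size descriptor ring even though the total number of bytes is the same.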
I would be glad to hear any new ideas on how to proceed with changing RX_Ring_Size, or any other lead you can suggest.
Operating System: L4T 32.4.4
Ethernet driver: EQOS
Baremetal or Container (if container which image + tag): Baremetal
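For anyone who wants to reproduce tests 2 and 3, here is a minimal sketch of a paced sender and a receiver that reads back its effective SO_RCVBUF. It is not the code used for the tests above; the function names, the loopback address, and the small sizes in the example are illustrative assumptions. On Linux the kernel clamps the requested size to rmem_max and reports double the stored value, so reading the option back shows whether a 40MB request was actually granted.

```python
import socket
import time

def open_receiver(requested_rcvbuf, host="127.0.0.1"):
    """Bind a UDP socket, request a receive buffer, return (socket, granted size)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested_rcvbuf)
    sock.bind((host, 0))  # port 0: let the OS pick a free port
    granted = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return sock, granted

def send_paced(dest, payload, count, gap_s):
    """Send count datagrams to dest, sleeping gap_s seconds after each one."""
    sent = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as tx:
        for _ in range(count):
            tx.sendto(payload, dest)
            sent += 1
            time.sleep(gap_s)
    return sent

def drain(sock, timeout_s=0.2):
    """Count datagrams already queued on sock until it stays idle for timeout_s."""
    sock.settimeout(timeout_s)
    received = 0
    try:
        while True:
            sock.recvfrom(65535)
            received += 1
    except socket.timeout:
        pass
    return received

if __name__ == "__main__":
    rx, granted = open_receiver(40 * 1024 * 1024)
    print("granted SO_RCVBUF:", granted)
    sent = send_paced(rx.getsockname(), b"x" * 1000, 20, 0.002)
    print("sent", sent, "received", drain(rx))
    rx.close()
```

Setting gap_s to 0 approximates the unpaced burst, and 0.002 corresponds to the 2ms spacing from test 2.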