Eth loss packet

837535053 · January 15, 2025, 9:46am

Jetson AGX Orin

JetPack 5.1.2

Problem description:

The interrupt distribution for the network port is uneven, which may cause packet loss.
cat /proc/interrupts:

As the image shows, the number of interrupts on CPU2 is much higher than on the other cores. The heavily loaded core is not always CPU2; sometimes another core has the much higher count, and which one it is appears to be random.
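Since the screenshot may not be readable for everyone, here is a minimal sketch for summarizing the same information as text. It assumes the usual /proc/interrupts layout (a header row of CPU names, then one row per IRQ with one count per CPU). The function name irq_per_cpu and the interface name eth0 are placeholders, not something taken from this system.

```shell
# Sum the per-CPU interrupt counts of every /proc/interrupts row whose
# text matches a pattern (for example the NIC or queue name).
# Reads the table from stdin so it can also be run on a saved copy.
irq_per_cpu() {
    awk -v pat="$1" '
        # Header row: remember how many CPU columns there are and their names.
        NR == 1 { ncpu = NF; for (i = 1; i <= NF; i++) name[i] = $i; next }
        # Matching IRQ rows: field 1 is "NNN:", counts start at field 2.
        $0 ~ pat { for (i = 1; i <= ncpu; i++) sum[i] += $(i + 1) }
        END { for (i = 1; i <= ncpu; i++) printf "%s %d\n", name[i], sum[i] }
    '
}

# Example use (eth0 is an assumed interface name):
#   irq_per_cpu eth0 < /proc/interrupts
```

Running it twice a few seconds apart and comparing the totals shows which core is currently taking the interrupts.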
The RPS settings are as follows:

I tried modifying the rps_cpus and rps_flow_cnt values, but the changes had no effect.

thanks

KevinFFF · January 16, 2025, 1:32am

Hi 837535053,

Are you using the devkit or custom board for AGX Orin?

Is MSI ******* your custom network device? I cannot find it on the AGX Orin devkit.

Please refer to IRQ Balancing - #6 by sumitg to configure the other cores to handle the interrupt.

$ sudo su
# cd /proc/irq/331
# cat smp_affinity
# cat smp_affinity_list
# echo ff > smp_affinity
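The value written to smp_affinity is a hexadecimal CPU bitmask, so ff selects CPUs 0 through 7. A module with 12 cores enabled would need fff to cover all of them. As a rough sketch, the mask for any set of cores can be computed as below; the helper name cpus_to_mask is made up for illustration, and IRQ 331 is just the number used in the commands above.

```shell
# Build the hex bitmask expected by /proc/irq/<N>/smp_affinity
# from a list of CPU indices (bit n set means CPU n is allowed).
cpus_to_mask() {
    mask=0
    for cpu in "$@"; do
        mask=$(( mask | (1 << cpu) ))
    done
    printf '%x\n' "$mask"
}

# Example use as root (IRQ number taken from the commands above):
#   cpus_to_mask 4 5 6 7 > /proc/irq/331/smp_affinity
#   cat /proc/irq/331/smp_affinity_list
```

If the write is accepted but the interrupt count still grows on a single core, that may simply mean the interrupt controller delivers each IRQ to one CPU out of the allowed set rather than spreading it.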

Please also refer to R36.3 Patch to re-enable GICv2m for PCIe MSI interrupts and restore I/O performance - Jetson & Embedded Systems / Jetson AGX Orin - NVIDIA Developer Forums and check if those patches can help in your case.

837535053 · January 16, 2025, 7:21am

Hi,

We are using our self-developed motherboard with an NVIDIA Orin module.

I tried two methods:

Automatically balancing interrupts with irqbalance
Manually configure interrupt distribution using the method you described


Neither method works: the effective affinity value does not change, and for each interrupt line only one CPU's interrupt count keeps accumulating over a long period of time.

thanks

delwyn · January 16, 2025, 12:58pm

For modern NICs that support RSS the driver usually allocates one receive queue per CPU core and the interrupt affinity is set to bind each queue to a specific CPU (as in your first image).

Received packets are directed by the NIC hardware to a queue based on a hash of the packet headers. So the distribution of load across the CPUs depends on the number of flows being received and the relative number of packets in each flow.
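To make the consequence concrete, here is a toy illustration of hash-based queue selection. It is not the NIC's real algorithm: cksum stands in for the hardware hash (commonly a Toeplitz hash), and the function name flow_to_queue and the flow strings are invented for the example. The only point is that the queue is a pure function of the flow's header fields.

```shell
# Toy model: map a flow identifier (addresses, ports, protocol) to one of
# N receive queues. cksum is only a stand-in for the NIC's hardware hash.
flow_to_queue() {
    printf '%s' "$1" | cksum | awk -v n="$2" '{ print $1 % n }'
}

# Every packet of the same flow gets the same queue index, however many
# packets are sent; only a different flow can land on another queue.
flow_to_queue "10.0.0.1:5000-10.0.0.2:6000/udp" 8
flow_to_queue "10.0.0.1:5000-10.0.0.2:6000/udp" 8
flow_to_queue "10.0.0.1:5001-10.0.0.2:6000/udp" 8
```

The first two calls always print the same number, which is why a single heavy connection ends up on one queue and one core.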

RPS won’t help unless you have more CPUs than queues, or your NIC doesn’t support RSS.

It looks like you may be trying to receive a lot of data from a single connection. This will all go to a single queue and therefore a single CPU core. If that CPU core is 100% busy, there is not a lot you can do about it other than increase the packet size to reduce per-packet CPU overhead and ensure that any hardware offloading features supported by the NIC are enabled.

See https://www.kernel.org/doc/Documentation/networking/scaling.txt

My patch for R36.3 referred to above won’t help, as this code is already present in JetPack 5.1.2, but you’d definitely need it if you ever update to JetPack 6. Without the patch, the interrupts for all the queues would be handled on CPU #0.

system · February 26, 2025, 5:06am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
