I’m currently working with a multi-port network card plugged into the PCIe slot of the Jetson AGX Orin Dev Kit. I’m running an early-access release of JetPack 7 (R38.1 EA) on this board and have ported the vendor’s drivers to the 6.8.12 kernel used in this JetPack release.
This network card is capable of 25G+ speeds. In my early testing, I’ve noticed that all interrupts are running through CPU0. The vendor includes a tuning script that expects me to be able to modify smp_affinity (e.g. echo fff | sudo tee /proc/irq/278/smp_affinity). This fails. After some searching here on the forums, I discovered this post: [Jetson AGX Orin] Intel Network Card RX Interrupts Locked to CPU0 Despite Affinity Settings - #2 by WayneWWW
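For context, the value written to smp_affinity is a hex bitmask of allowed CPUs. A minimal sketch of how CPU numbers map to that mask (the cpus_to_mask helper name is mine, and IRQ 278 is just the example number from above; on kernels where the irqchip does not implement affinity, the write typically fails with an I/O error):

```shell
# Hypothetical helper: build a hex smp_affinity mask from a CPU list.
cpus_to_mask() {
  local mask=0 cpu
  for cpu in "$@"; do
    mask=$(( mask | (1 << cpu) ))
  done
  printf '%x\n' "$mask"
}

cpus_to_mask 0 1 2 3 4 5 6 7 8 9 10 11   # prints fff (all 12 Orin cores)
cpus_to_mask 4 5                         # prints 30

# Intended use (fails on stock JP7 as described above):
# cpus_to_mask 4 5 | sudo tee /proc/irq/278/smp_affinity
```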
I manually patched what @delwyn provided to 6.8.12, but that similarly fails for me. I’ve noticed that dmesg reports “GICv2m: Failed to allocate v2m resource.” early in the boot process, so I apparently have some further work to do.
I’m wondering if anyone else has ported this same patch to Linux 6.8.12 successfully? If so, have you seen any performance improvements? I’m hoping that might save me some time and effort. @WayneWWW has stated in the above thread that this is a “will not fix” issue, so it looks like it’s going to fall on the users to address this regression.
The current issue that I’m working through now is that with the patch applied, the network card driver seems to be inducing an IOMMU fault. I see this during the driver’s probe:
My next effort is to see what the driver is doing that is causing this.
Any ideas what this error is actually indicating? The “iova” value in the fault message is in the same address range as the new gic_v2m device tree entry:
I’m guessing the network card driver is trying to access this area and the fault is simply indicating that the driver doesn’t have the proper access level.
Maybe I need to take a step back here. @WayneWWW is there a way to manage CPU affinity using GICv3 without resorting to patching GICv2m back into the kernel for JP7?
What I’m seeing with my high-speed network card is that under iperf3 conditions, CPU0 is railing at 100% utilization. I haven’t confirmed it, but I’m wondering if this is also why I’m seeing a huge amount of “retr” counts during the iperf3 test when I would expect them to be at or near 0. If I can distribute things across CPUs (which is the intent of the GICv2m patch) hopefully I can get the performance I’m expecting.
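One way to confirm that everything really lands on CPU0 is to sum the per-CPU columns of /proc/interrupts for the NIC’s IRQs. A sketch using made-up sample lines (the interface name, IRQ numbers, and counts are illustrative; on the board you would pipe in `grep eno1 /proc/interrupts` instead of the heredoc-style variable):

```shell
# Hypothetical excerpt of two eno1 RX interrupt lines on a 4-CPU system.
sample='278:  901234  0  0  0  PCI-MSI eno1-rx-0
279:  887765  0  0  0  PCI-MSI eno1-rx-1'

# Sum per-CPU columns (fields 2-5 in this 4-CPU excerpt) to see where
# the interrupt load actually lands.
echo "$sample" | awk '
  { for (i = 2; i <= 5; i++) tot[i-2] += $i }
  END { for (c = 0; c < 4; c++) printf "CPU%d: %d\n", c, tot[c] }'
# prints CPU0: 1788999 and 0 for CPU1-CPU3
```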
I modified these from Thor to Orin; change the interface names to match your Orin’s network devices.
Receive Packet Steering. This allows CPU0 (the one that physically receives the interrupt) to hand off the packet processing to other idle cores in the system.
# Source: https://enterprise-support.nvidia.com/s/article/receive-packet-steering
# Following creates a bitmask (fff = cores 0-11) to distribute load.
for x in /sys/class/net/eno1/queues/rx-*/rps_cpus; do echo "fff" | sudo tee $x; done
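As a variation (my own suggestion, not from the NVIDIA article), the mask can be derived from nproc instead of hardcoding "fff", so the same line works across Orin modules with different core counts (assumes fewer than 64 CPUs so the shift does not overflow):

```shell
# Build an all-CPU RPS bitmask from the online CPU count.
n=$(nproc)
mask=$(printf '%x' $(( (1 << n) - 1 )))
echo "$mask"                        # e.g. "fff" on a 12-core Orin
printf '%x\n' $(( (1 << 12) - 1 ))  # fff

# Then apply it as in the loop above:
# for x in /sys/class/net/eno1/queues/rx-*/rps_cpus; do
#   echo "$mask" | sudo tee "$x" >/dev/null
# done
```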
Spread interrupts for the NIC interfaces across cores, rather than keeping them pinned to one CPU or VM queue.
for dev in eno1; do
  IRQS=$(grep "$dev" /proc/interrupts | awk '{print $1}' | tr -d ':')
  CORE=0
  for irq in $IRQS; do
    echo "$CORE" | sudo tee "/proc/irq/$irq/smp_affinity_list"
    CORE=$((CORE + 1))  # This moves each queue to a new core
  done
done
dev=eno1
start_cpu=2
ncpus=$(nproc)
core=$start_cpu
IRQS=$(grep -E "${dev}\.vm[0-9]+" /proc/interrupts | awk '{print $1}' | tr -d ':')
for irq in $IRQS; do
  cpu=$(( core % ncpus ))
  echo "$cpu" | sudo tee "/proc/irq/$irq/smp_affinity_list" >/dev/null
  core=$((core + 1))
done
Verify that the changes took effect:
for irq in $IRQS; do
  echo -n "$irq: "
  cat "/proc/irq/$irq/smp_affinity_list"
done

for irq in $IRQS; do
  echo -n "$irq: "
  cat "/proc/irq/$irq/smp_affinity"
done
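Since the first loop prints CPU lists and the second prints hex masks, it can help to decode one into the other when comparing. A small sketch (the mask_to_cpus helper name is mine; plain bash arithmetic, no hardware access):

```shell
# Hypothetical helper (bash): decode a hex smp_affinity mask into the
# CPU numbers it covers, to cross-check against smp_affinity_list.
mask_to_cpus() {
  local mask=$(( 16#$1 )) cpu=0 out=""
  while [ "$mask" -ne 0 ]; do
    if [ $(( mask & 1 )) -eq 1 ]; then
      out="$out $cpu"
    fi
    mask=$(( mask >> 1 ))
    cpu=$(( cpu + 1 ))
  done
  echo "${out# }"
}

mask_to_cpus fff   # 0 1 2 3 4 5 6 7 8 9 10 11
mask_to_cpus 4     # 2
```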
Unfortunately gic-v2m is the only way to spread PCIe MSI interrupts across all the CPU cores on Orin. Otherwise PCIe MSI interrupts are handled by the DesignWare PCIe host bridge core, which coalesces all interrupts into one SPI.
The of_irq_parse_pci error suggests problems in your device tree associated with the PCIe host bridge node. I would double check that the patch is correctly applied.
I attempted to apply the 6.6 patch to 6.8 and encountered the following, which is similar in nature to the results I got when porting the original patch by @delwyn :
If the problem isn’t simply the PCIe driver, it might be worth investigating how the GICD and GICR entries end up in /proc/iomem on your kernel. Perhaps some additional step is needed in gicv2m for GICA.
Great point on initrd! I haven’t deployed an updated initrd to the dev kit. I’ve been using rmmod and modprobe on pcie_tegra194. When I do that, I do see the “Using GICv2m MSI allocator” message. I’ll rebuild and deploy initrd to the target and see if that changes anything.
Yeah, I’m wondering what additional steps (if any) might be needed. That’s today’s investigation.
Well, I believe I figured out the issue. I made one mistake in manually applying the patch file. After doing a before / after compare, I saw the problem. The /proc/interrupts entry no longer shows that things are only running on CPU0: