Orin NX R36.2 cannot change IRQ smp affinity

Hello,

We have a network interface on the C1 - M.2 Key-E (pcie@14100000) connector.

Tegra sets all IRQ affinities to CPU0 by default. To optimize CPU load and increase throughput, we reassigned the IRQ SMP affinities of this NIC across CPU0-CPU7.

This was working fine and effective on R35.4.1 and previous versions.
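
For readers unfamiliar with the mechanism: each write to /proc/irq/<n>/smp_affinity takes a hexadecimal CPU bitmask. Below is a minimal sketch of the kind of round-robin assignment described above. The helper names are hypothetical, and the IRQ range 313-328 is simply the R35.3 range mentioned later in this post; the script only prints the commands and does not write anything.

```python
def cpu_mask(cpus):
    """Return the hex bitmask string that /proc/irq/<n>/smp_affinity expects."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")


def round_robin_plan(irqs, num_cpus=8):
    """Assign each IRQ to a single CPU in turn; returns {irq: hex mask}."""
    return {irq: cpu_mask([i % num_cpus]) for i, irq in enumerate(irqs)}


if __name__ == "__main__":
    # Hypothetical example: 16 NIC IRQs spread over CPU0-CPU7.
    for irq, mask in round_robin_plan(range(313, 329)).items():
        print(f"echo {mask} > /proc/irq/{irq}/smp_affinity")
```

With 16 IRQs and 8 cores, each core ends up with two of the NIC's IRQs.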

However, on R36.2 we are unable to set the affinity; the write fails with the following error:

root@jetson-001:~# echo 1 > /proc/irq/247/smp_affinity
-bash: echo: write error: Invalid argument

In the attached /proc/interrupts, IRQs 247-262 of the NIC are all on CPU0 with R36.2.
proc-interrupts-r36.2.txt (22.0 KB)

With R35.3 the IRQs (313-328) are nicely distributed.
proc-interrupts-r35.3.txt (24.1 KB)
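
To compare the two attachments without reading them by eye, the per-CPU interrupt counts for an IRQ range can be summed. This is a rough sketch, assuming the usual /proc/interrupts layout (a header row of CPU names, then one row per IRQ with one count per CPU); the function name is my own.

```python
def per_cpu_totals(text, irq_first, irq_last):
    """Sum interrupt counts per CPU for numeric IRQs in [irq_first, irq_last]."""
    lines = text.splitlines()
    num_cpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = [0] * num_cpus
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        label = label.strip()
        # Skip named rows (IPI0, Err, ...) and IRQs outside the range.
        if not label.isdigit() or not irq_first <= int(label) <= irq_last:
            continue
        for cpu, count in enumerate(rest.split()[:num_cpus]):
            totals[cpu] += int(count)
    return totals
```

Calling it on the contents of /proc/interrupts with the range 247-262 (R36.2) or 313-328 (R35.3) returns one total per CPU; a list where only the first entry is non-zero corresponds to the "everything on CPU0" case.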

It is also worth noting that the NVMe IRQs are all attached to CPU0 on R36, while they are distributed by default on R35.3. Trying to remap them leads to a different error (the same one on both R36 and R35):

root@jetson-001:~# echo 1 > /proc/irq/225/smp_affinity
-bash: echo: write error: Input/output error
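
The two failures map to different errno values: "Invalid argument" is EINVAL and "Input/output error" is EIO. A small sketch that attempts the write and reports which one comes back, which makes it easier to survey many IRQs at once. The function name and the proc_root parameter are hypothetical (the latter exists only so the helper can be tried against a scratch directory); on a real system it must run as root.

```python
import errno
import os


def try_set_affinity(irq, mask_hex, proc_root="/proc"):
    """Write mask_hex to <proc_root>/irq/<irq>/smp_affinity.

    Returns "OK" on success, otherwise the symbolic errno name
    (for example "EINVAL" or "EIO").
    """
    path = os.path.join(proc_root, "irq", str(irq), "smp_affinity")
    try:
        with open(path, "w") as f:
            f.write(mask_hex + "\n")
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    return "OK"
```

For the cases above, try_set_affinity(247, "1") would be expected to return "EINVAL" and try_set_affinity(225, "1") to return "EIO".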

Looking at the dtbs, it seems R36 got rid of the msi-controller present in R35:

interrupt-controller@f400000 {
	...
	v2m@f410000 {
		compatible = "arm,gic-v2m-frame";
		msi-controller;
		#msi-cells = <0x01>;
		reg = <0x00 0xf410000 0x00 0x10000 0x00 0x54000000 0x00 0x4000000>;
		reg-names = "gic_base\0msi_base";
		arm,msi-base-spi = <0x260>;
		arm,msi-num-spis = <0x46>;
		phandle = <0x53>;
	};
};

f410000 is nowhere to be found in the new R36.2 p3767/p3768 dtbs.
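
The same check can be made against the device tree of the running system rather than the dtb files. A minimal sketch, assuming the live tree is exposed under /sys/firmware/devicetree/base (or /proc/device-tree) and that each node's compatible property is a file of NUL-separated strings; the function name is hypothetical.

```python
import os


def find_compatible(root, wanted):
    """Return the relative paths of nodes whose 'compatible' lists `wanted`."""
    hits = []
    needle = wanted.encode()
    for dirpath, _dirs, files in os.walk(root):
        if "compatible" not in files:
            continue
        with open(os.path.join(dirpath, "compatible"), "rb") as f:
            # The property is a list of NUL-terminated strings.
            if needle in f.read().split(b"\0"):
                hits.append(os.path.relpath(dirpath, root))
    return sorted(hits)
```

On R35, find_compatible("/sys/firmware/devicetree/base", "arm,gic-v2m-frame") should list the v2m@f410000 node; on R36.2 an empty list would be consistent with what the dtbs show.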

On pcie@14100000, the following properties are no longer present:

pcie@14100000 {
	...
	msi-parent = <0x53 0x05>;
	msi-map = <0x00 0x53 0x05 0x1000>;
	...
}
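
For reference, this is how I read the msi-map entry, following the generic PCI MSI device-tree binding, where each entry is (rid-base, msi-controller phandle, msi-base, length). It is a sketch only: the helper names are hypothetical and msi-map-mask is ignored.

```python
def decode_msi_map(cells):
    """Split one msi-map entry into named fields."""
    rid_base, phandle, msi_base, length = cells
    return {
        "rid_base": rid_base,
        "phandle": phandle,
        "msi_base": msi_base,
        "length": length,
    }


def map_rid(rid, entry):
    """Return (controller phandle, sideband data) for a Requester ID, or None."""
    if entry["rid_base"] <= rid < entry["rid_base"] + entry["length"]:
        return entry["phandle"], entry["msi_base"] + (rid - entry["rid_base"])
    return None


if __name__ == "__main__":
    entry = decode_msi_map([0x00, 0x53, 0x05, 0x1000])  # values from R35
    print(map_rid(0x0, entry))
```

So on R35, Requester IDs 0x0-0xfff behind this port are routed to the controller with phandle 0x53, which is the v2m@f410000 node shown earlier.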

Sample dmesg output from R36.2:
dmesg-r36.2.txt (58.9 KB)

cat /etc/nv_tegra_release

# R36 (release), REVISION: 2.0, GCID: 34956989, BOARD: generic, EABI: aarch64, DATE: Thu Nov 30 19:03:58 UTC 2023
# KERNEL_VARIANT: oot

and

# R35 (release), REVISION: 3.1, GCID: 32827747, BOARD: t186ref, EABI: aarch64, DATE: Sun Mar 19 15:19:21 UTC 2023

Could someone help us out? It would be great to have this feature back.

Thank you

Hi thegtx25,

Are you using the devkit or custom board for Orin NX?

Have you tried adding msi-controller and f410000 back in R36 to check whether the behavior matches R35?

Hi KevinFFF,

We are using a Seeedstudio A603 carrier, which is similar to the devkit. It needs no specific dtb changes to work with R35 or R36.

I have tried putting back the msi-controller with GIC_V2M as in R35 and attaching the C1 pcie port to it.

I have assumed that the SPI range used for PCIe MSIs (base and size) has not changed.
In Linux_for_Tegra/source/hardware/nvidia/t23x/nv-public:

diff --git a/include/kernel/dt-bindings/interrupt-controller/arm-gic.h b/include/kernel/dt-bindings/interrupt-controller/arm-gic.h
index 35b6f69..b11d901 100644
--- a/include/kernel/dt-bindings/interrupt-controller/arm-gic.h
+++ b/include/kernel/dt-bindings/interrupt-controller/arm-gic.h
@@ -20,4 +20,8 @@
 #define GIC_CPU_MASK_RAW(x) ((x) << 8)
 #define GIC_CPU_MASK_SIMPLE(num) GIC_CPU_MASK_RAW((1 << (num)) - 1)
 
+/* SPIs which are available for PCIe MSI interrupts */
+#define GIC_SPI_MSI_BASE               608
+#define GIC_SPI_MSI_SIZE               70
+
 #endif
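
As a sanity check on these two constants: they are the decimal forms of the hex values in the R35 v2m node (arm,msi-base-spi = <0x260>, arm,msi-num-spis = <0x46>). A quick sketch of the arithmetic; the function name is my own.

```python
R35_MSI_BASE_SPI = 0x260  # arm,msi-base-spi from the R35 dtb
R35_MSI_NUM_SPIS = 0x46   # arm,msi-num-spis from the R35 dtb


def spi_range(base, count):
    """Return the first and last SPI number of a contiguous block."""
    return base, base + count - 1


if __name__ == "__main__":
    first, last = spi_range(R35_MSI_BASE_SPI, R35_MSI_NUM_SPIS)
    print(f"base={R35_MSI_BASE_SPI} size={R35_MSI_NUM_SPIS} SPI[{first}:{last}]")
```

This gives base 608, size 70, and a last SPI of 677.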

Attempt 1)

diff --git a/tegra234.dtsi b/tegra234.dtsi
index dad3f53..cc2cb7e 100644
--- a/tegra234.dtsi
+++ b/tegra234.dtsi
@@ -1846,6 +1846,17 @@
                        #redistributor-regions = <1>;
                        #interrupt-cells = <3>;
                        interrupt-controller;
+                       ranges; /* v2m fails to allocate without this */
+
+                       gic_v2m: v2m@f410000 {
+                               compatible = "arm,gic-v2m-frame";
+                               msi-controller;
+                               #msi-cells = <1>;
+                               reg = <0x0 0x0f410000 0x0 0x00010000    /* GICA */
+                                       0x0 0x54000000 0x0 0x04000000>;
+                               reg-names = "gic_base", "msi_base";
+                               arm,msi-base-spi = <GIC_SPI_MSI_BASE>;
+                               arm,msi-num-spis = <GIC_SPI_MSI_SIZE>;
+                       };
                };
 
                smmu_iso: iommu@10000000 {
@@ -2487,6 +2498,8 @@
                        interconnect-names = "dma-mem", "write";
                        iommu-map = <0x0 &smmu_niso1 TEGRA234_SID_PCIE1 0x1000>;
                        iommu-map-mask = <0x0>;
+                       msi-parent = <&gic_v2m TEGRA234_SID_PCIE1>;
+                       msi-map = <0x0 &gic_v2m TEGRA234_SID_PCIE1 0x1000>;
                        dma-coherent;
 
                        status = "disabled";

No success. The system fails to allocate MSIs at startup.

I also tried enabling the ITS MSI controller of GICv3 as per the documentation. Same issue at allocation.

Attempt 2)

diff --git a/tegra234.dtsi b/tegra234.dtsi
index dad3f53..8e7b344 100644
--- a/tegra234.dtsi
+++ b/tegra234.dtsi
@@ -1846,6 +1846,17 @@
                        #redistributor-regions = <1>;
                        #interrupt-cells = <3>;
                        interrupt-controller;
+                       msi-controller;
+                       mbi-ranges = <GIC_SPI_MSI_BASE GIC_SPI_MSI_SIZE>;
+
+                       gic_its: gic-its@f410000 {
+                               compatible = "arm,gic-v3-its";
+                               msi-controller;
+                               #msi-cells = <1>;
+                               reg = <0x0 0x0f410000 0x0 0x00010000    /* GICA */
+                                       0x0 0x54000000 0x0 0x04000000>;
+                               reg-names = "gic_base", "msi_base";
+                       };
                };
 
                smmu_iso: iommu@10000000 {
@@ -2487,6 +2498,8 @@
                        interconnect-names = "dma-mem", "write";
                        iommu-map = <0x0 &smmu_niso1 TEGRA234_SID_PCIE1 0x1000>;
                        iommu-map-mask = <0x0>;
+                       msi-parent = <&gic_its TEGRA234_SID_PCIE1>;
+                       msi-map = <0x0 &gic_its TEGRA234_SID_PCIE1 0x1000>;
                        dma-coherent;
 
                        status = "disabled";

I have tried a couple of other options without success.
Note: I only modified and recompiled the dtbs and changed the FDT entry in extlinux.conf; I did not rebuild the kernel in these experiments.

Seems to me that this is either a regression or a missing feature in R36.2 DP.
This affects IRQ affinity of nvme and other PCI-MSI devices.

In our case, the system is throttling because all devices have their IRQs on CPU0.

Thank you for your assistance.

Update: after rebuilding the kernel with attempt 1) and the added ranges property, MSIs are allocated and V2M shows as properly configured in dmesg:

[    0.000000] GICv2m: DT overriding V2M MSI_TYPER (base:608, num:70)
[    0.000000] GICv2m: range[mem 0x0f410000-0x0f41ffff], SPI[608:677]

However, the following error then appears:

[   71.210969] tegra-mc 2c00000.memory-controller: pcie1w: write @0x000000000f410040: EMEM address decode error (EMEM decode error)
[   71.218804] tegra-mc 2c00000.memory-controller: pcie1w: write @0x000000000f410040: EMEM address decode error (EMEM decode error)
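
The faulting address is informative. It falls inside the v2m frame declared in attempt 1), at offset 0x40, which as far as I can tell is the V2M_MSI_SETSPI_NS doorbell offset used by drivers/irqchip/irq-gic-v2m.c. In other words, the endpoint's MSI write appears to target the doorbell and is rejected by the memory controller. A small sketch of that arithmetic; the constant for the doorbell offset is taken from my reading of the driver and should be treated as an assumption.

```python
import re

V2M_FRAME_BASE = 0x0F410000   # reg base of v2m@f410000 in attempt 1)
V2M_FRAME_SIZE = 0x00010000   # reg size of the same node
V2M_MSI_SETSPI_NS = 0x040     # assumed doorbell offset (irq-gic-v2m.c)


def fault_address(line):
    """Extract the '@0x...' address from a tegra-mc error line."""
    match = re.search(r"@0x([0-9a-fA-F]+)", line)
    return int(match.group(1), 16) if match else None


def offset_in_v2m_frame(addr):
    """Return the offset inside the v2m frame, or None if outside it."""
    if V2M_FRAME_BASE <= addr < V2M_FRAME_BASE + V2M_FRAME_SIZE:
        return addr - V2M_FRAME_BASE
    return None
```

Feeding either of the two log lines above through fault_address and then offset_in_v2m_frame yields 0x40.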

The following R35 log lines gave a hint, since the first one is not present in R36:

[    6.083493] tegra194-pcie 14100000.pcie: Using GICv2m MSI allocator
[    6.091030] tegra194-pcie 14160000.pcie: Adding to iommu group 8

I dug into drivers/pci/controller/dwc/pcie-tegra194.c: the new upstreamed R36 version no longer uses GICv2m as an MSI allocator (the functions related to tegra_pcie_parse_msi_parent do not exist anymore). So there is no chance of getting this to work this way.

Update 2: This seems to be related to this 2020 discussion with Vidya Sagar from NVIDIA, the author of the pcie-tegra194.c module.

I’ve checked this with our internal team: we removed GIC V2M for MSIs because we couldn’t upstream the changes due to a HW deviation from the Arm spec. So PCIe MSI IRQ affinity for Orin works on R35.x but not on the latest releases (R36.x).

We didn’t see any performance difference with or without GIC V2M MSI, so it is not supported in JP6 (R36.x).

The performance drop is clear once some additional devices are attached.

In our case, with one PCIe 4.0 NVMe drive and a 3.5GB/s network adapter at maximum MTU, the system throttles: CPU0 shows 100% kernel-space usage and bandwidth is much lower than with the IRQs spread across cores (MAXN).
This is without counting the extra USB 3.0 devices whose IRQs are also mapped to CPU0.
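
To put a number on the CPU0 saturation, the kernel-side share of each core's time can be computed from two /proc/stat snapshots taken a few seconds apart under load. This is a rough sketch, assuming the standard per-CPU field order (user, nice, system, idle, iowait, irq, softirq, steal, ...); the function names are hypothetical.

```python
def parse_proc_stat(text):
    """Return {cpuN: [counters...]} for the per-CPU rows of /proc/stat."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("cpu") and fields[0][3:].isdigit():
            stats[fields[0]] = [int(v) for v in fields[1:]]
    return stats


def kernel_share(before, after):
    """Fraction of elapsed time each CPU spent in system + irq + softirq."""
    shares = {}
    for cpu, new in after.items():
        delta = [n - o for n, o in zip(new, before[cpu])]
        total = sum(delta[:8])  # user .. steal; guest time is part of user
        kernel = delta[2] + delta[5] + delta[6]
        shares[cpu] = kernel / total if total else 0.0
    return shares
```

With all IRQs on CPU0, the expectation is a share close to 1.0 for cpu0 and much lower values for the other cores.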

What is the roadmap for implementing a better, Arm-compliant MSI controller? First release of 36.2?

Currently there is no plan to add this.

If you have performance concerns, please share how to reproduce this performance drop on the devkit, and we will check whether this could be implemented.