Xavier AGX PCIe configuration

We have a project that requires a high-speed, 32-channel PCIe DAQ (analog-to-digital converter) card, and we are using the Xavier AGX dev kit. The manufacturer of the PCIe card (General Standards) is having trouble getting their driver to work properly with the AGX and sent me the information below. This is very time sensitive, so I would appreciate a quick response:

(From the manufacturer)
I’ve identified the problem.

  1. The problem is that the host’s Plug-N-Play configuration of the GSC
    card is ignoring certain configuration bits when making physical address
    assignments. The 18AI32SSC1M, and virtually all GSC devices, are
    configured to have their BAR0 and BAR2 regions assigned to 32-bit
    addresses. This is specified with bits D1 and D2 within both the BAR0
    and BAR2 PCI registers. It can be verified by issuing the command
    “lspci -vv” (two v’s) and looking at the “Region 0” and “Region 2” lines
    towards the very bottom of the output (PCI device 5:02). Those same two
    lines, however, show the physical addresses assigned to each region.
    What I see are 37-bit addresses. While the 18AI32SSC1M can support
    64-bit addresses, the board is configured for 32-bit addresses, which is
    what the driver is designed to support. I initially verified this with a
    different I/O card which was limited to 32-bit addresses. In that case,
    too, the board was assigned addresses it did not support. In the end,
    the driver used the lower 32 bits of the physical address to try to
    access the device. Given that those 32 bits were not the correct
    address, the virtual addresses given to the driver to access device
    registers were also wrong. When those virtual addresses were used, the
    result was a system reboot.

NOTE: With my modified driver I was able to look at the BAR0 and BAR2
registers and verified that they were configured for 32-bit addressing.

  2. A similar issue appears with the Plug-N-Play initialization of the
    BAR1 region. While this region is unused by the driver, it is not
    initialized correctly. The “lspci -vv” output for “Region 1”, towards
    the bottom of the output, shows how the region should have been
    initialized. With my modified driver I was able to read the BAR1
    register. Its assigned address was zero, though the lspci output shows
    it should be otherwise.

I recommend that NVIDIA be contacted regarding this issue. I’ll be glad
to participate in any such email exchange if you desire. If they ask,
the 18AI32SSC1M uses a PLX PCI9056 PCI Bus Interface chip.
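
For reference, the BAR type bits the manufacturer refers to (D1 and D2) can be read from a driver with the standard kernel config-space accessors. A minimal sketch, assuming a bound struct pci_dev (the helper name is illustrative, not the GSC driver’s actual code):

    #include <linux/pci.h>

    /* Bits [2:1] of a memory BAR encode the addressing type:
     * 0b00 = 32-bit, 0b10 = 64-bit (PCI_BASE_ADDRESS_MEM_TYPE_*
     * in the kernel headers). */
    static bool bar0_is_64bit(struct pci_dev *pdev)
    {
            u32 bar;

            pci_read_config_dword(pdev, PCI_BASE_ADDRESS_0, &bar);
            return (bar & PCI_BASE_ADDRESS_MEM_TYPE_MASK) ==
                   PCI_BASE_ADDRESS_MEM_TYPE_64;
    }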

Can anyone at NVIDIA please help?!

Sorry for the late response. I will check with the internal team to see if they can provide suggestions. Thanks.


Hi,

Xavier AGX supports 32-bit non-prefetchable and 64-bit prefetchable BARs. The MMIO space in the 32-bit range is limited (30MB only), so by default we are using the 64-bit BAR range, which supports sizes >1GB. If you need 32-bit BARs, you need to update the “ranges” property in the pcie node of the device tree for your project.
file: /hardware/nvidia/soc/t19x/kernel-dts/tegra194-soc/tegra194-soc-pcie.dtsi
Property: “ranges”
Format: ranges = <“Fixed tuple” “Upper PCI address” “Lower PCI address” “Upper CPU address” “Lower CPU address” “Upper size” “Lower size”>;
Update the PCI and CPU addresses for “prefetchable memory” and “non-prefetchable memory” as per your needs.
The PCI and CPU addresses can be set to the same value, taken from the table below.

Controller   DT node         32-bit base address   Total size
C0           pcie@14180000   0x38200000            0x1E00000
C1           pcie@14100000   0x30200000            0x1E00000
C2           pcie@14120000   0x32200000            0x1E00000
C3           pcie@14140000   0x34200000            0x1E00000
C4           pcie@14160000   0x36200000            0x1E00000
C5           pcie@141a0000   0x3a200000            0x1E00000

Thanks,
Manikanta


Thanks for the response, but we need more information to make the config changes you’re suggesting. Below is the pcie@141a0000 node from the dtsi file, which is quite different from the example you provided. We need to know exactly what changes to make to accomplish what you’re suggesting:

  1. You mentioned that by default the AGX uses a 64-bit BAR range. How do we select a 32-bit BAR range?
  2. The default 32-bit BAR range is 30MB. How do we increase it and what is the maximum value allowed?
  3. What do you mean by “the PCI and CPU addresses can be set to the same value, taken from the table below”? Where would we make this change?

    pcie@141a0000 {
            compatible = "nvidia,tegra194-pcie", "snps,dw-pcie";
            power-domains = <&bpmp TEGRA194_POWER_DOMAIN_PCIEX8A>;
            reg = <0x00 0x141a0000 0x0 0x00020000   /* appl registers (128K)      */
                   0x00 0x3a000000 0x0 0x00040000   /* configuration space (256K) */
                   0x00 0x3a040000 0x0 0x00040000>; /* iATU_DMA reg space (256K)  */
            reg-names = "appl", "config", "atu_dma";

            status = "disabled";

            #address-cells = <3>;
            #size-cells = <2>;
            device_type = "pci";
            num-lanes = <8>;
            linux,pci-domain = <5>;

            clocks = <&bpmp_clks TEGRA194_CLK_PEX1_CORE_5>,
                    <&bpmp_clks TEGRA194_CLK_PEX1_CORE_5M>;
            clock-names = "core_clk", "core_clk_m";

            resets = <&bpmp_resets TEGRA194_RESET_PEX1_CORE_5_APB>,
                     <&bpmp_resets TEGRA194_RESET_PEX1_CORE_5>;
            reset-names = "core_apb_rst", "core_rst";

            interrupts = <0 53 0x04>,   /* controller interrupt */
                         <0 54 0x04>;   /* MSI interrupt */
            interrupt-names = "intr", "msi";

            pinctrl-names = "pex_rst", "clkreq";
            pinctrl-0 = <&pex_rst_c5_out_state>;
            pinctrl-1 = <&clkreq_c5_bi_dir_state>;

            iommus = <&smmu TEGRA_SID_PCIE5>;
            dma-coherent;

#if LINUX_VERSION >= 414
iommu-map = <0x0 &smmu TEGRA_SID_PCIE5 0x1000>;
iommu-map-mask = <0x0>;
#endif

            #interrupt-cells = <1>;
            interrupt-map-mask = <0 0 0 0>;
            interrupt-map = <0 0 0 0 &intc 0 53 0x04>;

            nvidia,dvfs-tbl = <204000000 204000000 204000000  408000000
                               204000000 204000000 408000000  666000000
                               204000000 408000000 666000000  1066000000
                               408000000 666000000 1066000000 2133000000>;

            nvidia,max-speed = <4>;
            nvidia,disable-aspm-states = <0xf>;
            nvidia,controller-id = <&bpmp 0x5>;
            nvidia,tsa-config = <0x0200b004>;
            nvidia,disable-l1-cpm;
            nvidia,aux-clk-freq = <0x13>;
            nvidia,preset-init = <0x5>;
            nvidia,aspm-cmrt = <0x3C>;
            nvidia,aspm-pwr-on-t = <0x14>;
            nvidia,aspm-l0s-entrance-latency = <0x3>;

            bus-range = <0x0 0xff>;
            ranges = <0x81000000 0x0 0x3a100000 0x0 0x3a100000 0x0 0x00100000      /* downstream I/O (1MB) */
                      0x82000000 0x0 0x40000000 0x1f 0x40000000 0x0 0xC0000000     /* non-prefetchable memory (3GB) */
                      0xc3000000 0x1c 0x00000000 0x1c 0x00000000 0x3 0x40000000>;  /* prefetchable memory (13GB) */

            nvidia,cfg-link-cap-l1sub = <0x1c4>;
            nvidia,cap-pl16g-status = <0x174>;
            nvidia,cap-pl16g-cap-off = <0x188>;
            nvidia,event-cntr-ctrl = <0x1d8>;
            nvidia,event-cntr-data = <0x1dc>;
            nvidia,margin-port-cap = <0x194>;
            nvidia,margin-lane-cntrl = <0x198>;
            nvidia,dl-feature-cap = <0x30c>;
    };

Thank you in advance for your prompt answer,
Chris

Can someone from NVIDIA please help?

Hi,

I reviewed the thread again; I don’t think you need to change the BAR address to 32-bit in the DT.

While the 18AI32SSC1M can support 64-bit addresses the board is configured for 32-bit addresses, which is what the driver is designed to support.

In Xavier, the PCIe controller’s 32-bit address space is limited to 30MB. So, to support larger BAR sizes, we are using a 64-bit CPU address and a 32-bit PCI address with the PCIe internal address translation unit. What this means is that a driver running on Xavier will get a 64-bit BAR address (the CPU address), but when a request is sent to the PCIe device, that address is translated to a 32-bit address. So, with the existing DT settings, any PCIe device with only 32-bit BAR support can also work. Just update the device driver running on Tegra to use the 64-bit CPU address.
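
In driver terms, “use the 64-bit CPU address” means not assuming the value returned by pci_resource_start() fits in 32 bits. A minimal sketch of the idea, assuming a standard PCI driver (names are illustrative):

    #include <linux/pci.h>
    #include <linux/io.h>

    static void __iomem *map_bar0(struct pci_dev *pdev)
    {
            /* resource_size_t is 64-bit on arm64, so the full CPU address
             * (e.g. 0x1f40000000) survives. The controller's address
             * translation unit converts it to a 32-bit PCI address on the
             * bus, so the endpoint only ever sees 32 bits. */
            resource_size_t start = pci_resource_start(pdev, 0);
            resource_size_t len   = pci_resource_len(pdev, 0);

            /* A driver that did 'u32 addr = (u32)start;' here would
             * truncate the address and end up mapping the wrong memory. */
            return ioremap(start, len);
    }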

However, if you still insist on having both the CPU and PCI addresses as 32-bit, then update the “ranges” property as below.
With this setting the maximum supported 32-bit non-prefetchable BAR size is 30MB.

pcie@141a0000 {
        ranges = <0x81000000 0x0 0x3a100000 0x0 0x3a100000 0x0 0x00100000      /* downstream I/O (1MB) */
                  0x82000000 0x0 0x3a200000 0x0 0x3a200000 0x0 0x1E00000       /* non-prefetchable memory (30MB) */
                  0xc3000000 0x1c 0x00000000 0x1c 0x00000000 0x3 0x40000000>;  /* prefetchable memory (13GB) */
};

Note: The above DT setting is for C5. Refer to the table I posted in my previous comment and update the setting for your controller accordingly.

Thank you. I’ll work with the manufacturer and update this thread with the results.

Hi Manikanta,

I made the change to the dtsi that you suggested and flashed only the device tree. It looks like the only thing that changed was the memory address for Regions 0 and 2 (from 1f40000000 to 3a200000), but they still show as [disabled]. Below are the outputs from running sudo lspci -s 0005:02:00.0 -vv.

Before:

0005:02:00.0 Signal processing controller: PLX Technology, Inc. PCI9056 32-bit 66MHz PCI <-> IOBus Bridge (rev ac)
	Subsystem: PLX Technology, Inc. PCI9056 32-bit 66MHz PCI <-> IOBus Bridge
	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Interrupt: pin A routed to IRQ 0
	Region 0: Memory at 1f40000000 (32-bit, non-prefetchable) [disabled] [size=512]
	Region 1: I/O ports at 300000 [disabled] [size=256]
	Region 2: Memory at 1f40000200 (32-bit, non-prefetchable) [disabled] [size=512]
	Capabilities: [40] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [48] CompactPCI hot-swap <?>
	Capabilities: [4c] Vital Product Data
pcilib: sysfs_read_vpd: read failed: Input/output error
		Not readable

After:

0005:02:00.0 Signal processing controller: PLX Technology, Inc. PCI9056 32-bit 66MHz PCI <-> IOBus Bridge (rev ac)
	Subsystem: PLX Technology, Inc. PCI9056 32-bit 66MHz PCI <-> IOBus Bridge
	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Interrupt: pin A routed to IRQ 0
	Region 0: Memory at 3a200000 (32-bit, non-prefetchable) [disabled] [size=512]
	Region 1: I/O ports at 300000 [disabled] [size=256]
	Region 2: Memory at 3a200200 (32-bit, non-prefetchable) [disabled] [size=512]
	Capabilities: [40] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [48] CompactPCI hot-swap <?>
	Capabilities: [4c] Vital Product Data
pcilib: sysfs_read_vpd: read failed: Input/output error
		Not readable

How do I enable the 32-bit non-prefetchable memory?

Here is the output from sudo lshw:

*-pci:2
          description: PCI bridge
          product: NVIDIA Corporation
          vendor: NVIDIA Corporation
          physical id: 0
          bus info: pci@0005:00:00.0
          version: a1
          width: 32 bits
          clock: 33MHz
          capabilities: pci pm msi pciexpress msix normal_decode bus_master cap_list
          configuration: driver=pcieport
          resources: irq:39 ioport:300000(size=4096) memory:3a200000-3a3fffff
        *-pci
             description: PCI bridge
             product: Tundra Semiconductor Corp.
             vendor: Tundra Semiconductor Corp.
             physical id: 0
             bus info: pci@0005:01:00.0
             version: 02
             width: 32 bits
             clock: 33MHz
             capabilities: pci msi pm pciexpress normal_decode bus_master cap_list
             resources: memory:3a300000-3a300fff ioport:300000(size=4096) memory:3a200000-3a2fffff
           *-generic UNCLAIMED
                description: Signal processing controller
                product: PCI9056 32-bit 66MHz PCI <-> IOBus Bridge
                vendor: PLX Technology, Inc.
                physical id: 0
                bus info: pci@0005:02:00.0
                version: ac
                width: 32 bits
                clock: 66MHz
                capabilities: pm hotswap vpd cap_list
                configuration: latency=0
                resources: memory:3a200000-3a2001ff ioport:300000(size=256) memory:3a200200-3a2003ff
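
A note on the [disabled] tags above: lspci shows a region as [disabled] while the Memory and I/O Space Enable bits in the device’s command register are clear (visible as “I/O- Mem-” on the Control line). These bits are normally set when a driver binds and calls pci_enable_device(), which is also why lshw reports the device as UNCLAIMED here. A minimal sketch of the usual enable sequence, not the GSC driver’s actual code:

    #include <linux/pci.h>

    static int example_probe(struct pci_dev *pdev, const struct pci_device_id *id)
    {
            int ret;

            ret = pci_enable_device(pdev);  /* sets Memory/IO Space Enable */
            if (ret)
                    return ret;

            ret = pci_request_regions(pdev, "example");  /* claim the BARs */
            if (ret) {
                    pci_disable_device(pdev);
                    return ret;
            }

            pci_set_master(pdev);  /* enable bus mastering for DMA */
            return 0;
    }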

UPDATE: I can now build and load the driver, so this is progress! Also, after the driver is loaded, the output from lspci no longer shows [disabled] for Region 0/1/2.

Still having trouble running the sample applications from the manufacturer, but it sounds like I need to work with them on finding the exact issue.

Thank you for the support thus far.

Manikanta,

Making progress and I believe the last issue may be the assignment of a valid interrupt. Per the manufacturer’s feedback:

Check the output from dmesg. There should be a message from the driver reporting that IRQ 0 is invalid. It should look something like the following.

 18ai32ssc1m: os_irq_open(): invalid IRQ: 0

This is reflected in the “lspci -vv” output, which includes the following.

 Interrupt: pin A routed to IRQ 0

This indicates that the kernel has NOT assigned an interrupt to the board. It sounds like another feature the kernel isn’t configured to support: legacy interrupts. These go by varying names, but “Legacy Interrupts”, “INTX” interrupts, and “Pin-Based Interrupts” are common. My guess is that there is another configuration option you’ll have to change, followed by another kernel rebuild.

I would appreciate a quick response on this!

Can someone please help solve the issue: “Interrupt: pin A routed to IRQ 0”? If we can’t get a quick resolution to this, we are going to have to abandon the NVIDIA AGX as our platform for this project. Thank you.

Anyone from NVIDIA have a solution?

How can I configure the kernel to assign a valid interrupt to the PCIe card?

Hi,

Legacy INTX on Tegra is verified and working. There is no need to make any changes (DT or kernel).
In the endpoint client driver, the legacy interrupt number can be retrieved from “struct pci_dev->irq”.
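
For illustration, a minimal sketch of registering a legacy INTX handler from “struct pci_dev->irq” (handler and names are illustrative; IRQF_SHARED because INTX lines can be shared):

    #include <linux/pci.h>
    #include <linux/interrupt.h>

    static irqreturn_t example_isr(int irq, void *data)
    {
            /* Check a device status register to confirm the interrupt is
             * ours, acknowledge it, then return IRQ_HANDLED (or IRQ_NONE
             * if the shared line fired for another device). */
            return IRQ_HANDLED;
    }

    static int example_register_irq(struct pci_dev *pdev, void *drvdata)
    {
            if (!pdev->irq)
                    return -EINVAL;  /* no interrupt assigned */

            return request_irq(pdev->irq, example_isr, IRQF_SHARED,
                               "example", drvdata);
    }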

Thanks,
Manikanta

Manikanta,

Thank you, but here is what the manufacturer is saying:

I already use that structure. That is the only way our current drivers would be working on Intel hosts. The problem is that the kernel isn’t making a legacy interrupt assignment. This is verifiable using the below command and looking at the “Interrupt:” line for the 18AI32SSC1M, which should be the very last device entry in the data dump. The interrupt reported for “pin A” is “0”, which is not valid. A value of zero indicates that an assignment has not been performed.

Hi,

I see that the “signal processing controller” is under another “PCI bridge”; I believe it is a PCIe-to-PCI/PCI-X bridge?
If you look at the Tegra PCIe root port, its interrupt is 39:
resources: irq:39 ioport:300000(size=4096) memory:3a200000-3a3fffff
If you connected an endpoint directly, you would see irq 39 for the endpoint as well.
However, since there is another PCI bridge in between, it plays a role in the interrupt pin setting.

The INT pin number is vital in the PCI/PCI-X protocol, but it has no significance for the PCIe protocol, because legacy INTX is a message there; there is no wired interrupt connection over the bus.
The SW driver running on Tegra can get the irq number from “struct pci_dev->irq” and register an interrupt handler for it; when the Tegra PCIe controller receives an INTA message, it asserts that same irq line (39 in this case).

Since a PCIe-to-PCI bridge is connected to Tegra, you have to check with the manufacturer how the wired interrupt pin numbers are handled.

Thanks,
Manikanta

Manikanta,

Thank you for sticking with this. It feels like we are so close to getting this to work. Here is the response from the manufacturer:

The devices shown by running lspci -vv on the AGX include “pin A
routed” entries with valid IRQ numbers. Unfortunately, the IRQ number
reported for the 18AI32SSC1M is zero. That is, the “struct
pci_dev->irq” structure field for the board’s interrupt is zero. I know
this for certain for two reasons. First, the lspci output reports a
value of zero. Second, the driver uses the “struct pci_dev->irq” content
directly when requesting installation of the ISR. The request fails when
the value is zero. The driver performs a check on this field and, if it
is zero, posts a message to the system log and returns an error status.
In the driver, the field used is “dev->pci.pd->irq”. This can be seen in
the driver sources in os_irq.c starting at about line 43.

The author of the email you quoted is suggesting that the presence of
the PCIe-to-PCI bridge, the Tundra chip, negates the need for an
interrupt to be assigned to the device behind the bridge, the
18AI32SSC1M. This is incorrect and is contrary to the PCIe
specification. It is also contrary to practice. The Tundra chip does in
fact translate wired INTX interrupts to corresponding PCI Express
messages. Those messages are handled by kernel drivers, which end up
calling the ISRs registered for the interrupts assigned to the
respective PCI devices. We have been making use of this feature since
PCIe-to-PCI bridge chips became available. As a simple test for this
email trail, I installed a similar PMC board on an identical
Tundra-based adapter in one of my desktop PCs and ran the lspci -vv
command. The output revealed that an interrupt had been assigned to the
PMC board, just as it should be done by the NVIDIA system for the
18AI32SSC1M. IMHO it looks like there may be a bug in the kernel port to
the AGX processor board.

Hi,

My point is, for PCIe it doesn’t matter what is programmed in the config-space interrupt_line register. The only thing that matters is whether a valid irq is present in “struct pci_dev->irq”. As I said in my earlier response, we tested this feature with a PCIe-based USB 2.0 card which uses legacy INTX in the upstream ehci-hcd class driver.

We are using the standard DT way to set up legacy INTX, and the upstream PCIe driver parses it from the DT.
Ref: Device Tree Usage - eLinux.org
DT setting in Tegra:
#interrupt-cells = <1>;
interrupt-map-mask = <0 0 0 0>;
interrupt-map = <0 0 0 0 &intc 0 53 0x04>;
The upstream driver function which parses the legacy INTX irq is pci_assign_irq().
All of this is upstream Linux implementation; there is nothing Tegra-specific here.
If you think there is an issue in the legacy INTX irq allocation, you can review the upstream implementation.

To quickly test legacy INTX, you can also hardcode the irq to 85 (53 + 32) instead of using pdev->irq.
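
For example, that quick test might look like the following, reusing the illustrative example_isr from the sketch earlier in this thread (a debug aid only, not production code):

    /* Debug only: bypass pdev->irq and hook the root port's INTA line
     * directly. 85 = GIC SPI 53 (from the interrupt-map above) + 32. */
    static int example_test_hardcoded_intx(void *drvdata)
    {
            return request_irq(85, example_isr, IRQF_SHARED,
                               "example-intx-test", drvdata);
    }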

Thanks,
Manikanta