XILINX PCIE NOT WORKING ON XAVIER WITH JETPACK 5

The Xavier does not detect the FPGA, although the same FPGA is detected on a Windows PC.

1. XDMA Driver — Installed Successfully

Built and installed the Xilinx DMA IP driver (xdma.ko) from source against the 5.10.216-tegra kernel headers. The module loads cleanly but creates no /dev/xdma* device nodes because the FPGA is never enumerated.

2. Xavier Max Mode — Set

sudo nvpmodel -m 0
sudo jetson_clocks

3. PCIe Controller Status at Boot

The Xavier has 6 PCIe controllers. Only C1 (14100000) links up (to the onboard Marvell SATA controller). The PCI domain in the kernel log matches the controller number, so 14180000 (bus 0000:00) is C0 and 141a0000 (bus 0005:00) is C5. C5 (141a0000, the x16 slot) and C0 (14180000) both report:

tegra194-pcie 14180000.pcie: Phy link never came up
tegra194-pcie 141a0000.pcie: Phy link never came up

C5 also reports:

tegra194-pcie 141a0000.pcie: Failed to get slot regulators: -517

This is a deferred probe (-517 is EPROBE_DEFER): C5 eventually probes but still fails link training.

4. Boot Race Condition

A timing problem exists at boot:

  • C0 probes at t=4.3s and gives up at t=5.4s

  • C5 doesn't even start probing until t=6.9s

  • C5 uses the slot power regulators (vpcie3v3, vpcie12v) for the x16 slot and has a PERST# GPIO (nvidia,plat-gpios)

  • This means C0 has finished its link-training attempt before C5, which has to wait for its regulators, starts probing
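The probe order described above can be read straight out of the kernel log. The following is a minimal sketch, assuming the message wording shown in this post ("Link up" and "Phy link never came up"); the function name pcie_link_summary is made up for illustration.

```bash
# Summarise tegra194-pcie link results from kernel log text on stdin.
# Prints one line per result: "<controller address> <up|down> <timestamp>".
pcie_link_summary() {
    sed -nE \
        -e 's/^\[ *([0-9.]+)\] tegra194-pcie ([0-9a-f]+)\.pcie: Link up.*$/\2 up \1/p' \
        -e 's/^\[ *([0-9.]+)\] tegra194-pcie ([0-9a-f]+)\.pcie: Phy link never came up.*$/\2 down \1/p'
}

# Example on the target:
#   sudo dmesg | pcie_link_summary
```

Each output line gives the controller address, the link result and the time at which the driver reported it, so the order in which the controllers gave up is visible at a glance.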

5. PCIe Pinmux — All Unclaimed

pin 197 (PEX_L5_CLKREQ_N_PGG0): (MUX UNCLAIMED) (GPIO UNCLAIMED)
pin 198 (PEX_L5_RST_N_PGG1):    (MUX UNCLAIMED) (GPIO UNCLAIMED)

PEX_L5_RST_N (PERST# for the x16 slot) showed up as an input at LOW level in the GPIO debug output, which means the FPGA was being held in reset. The tegra194-pcie driver controls this pin internally during its probe sequence.
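A listing like the one above typically comes from the pinctrl debugfs file pinmux-pins. The following is a small sketch for filtering it, assuming the line format shown in this post; the function name unclaimed_pex_pins is hypothetical.

```bash
# Print PEX_* pins that have neither a mux owner nor a GPIO owner.
# Input: the text of a pinctrl "pinmux-pins" debugfs file on stdin.
unclaimed_pex_pins() {
    sed -nE 's/^pin [0-9]+ \((PEX_[A-Z0-9_]+)\): *\(MUX UNCLAIMED\) *\(GPIO UNCLAIMED\).*$/\1/p'
}

# Example on the target (needs root and a mounted debugfs):
#   sudo sh -c 'cat /sys/kernel/debug/pinctrl/*/pinmux-pins' | unclaimed_pex_pins
```

The `sudo sh -c` wrapper is there so that the glob is expanded by a shell that is allowed to read debugfs.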

6. DTB Modifications Made

All changes made to /boot/dtb/kernel_tegra194-p2888-0001-p2822-0000.dtb:

a) PCIe regulators — added regulator-always-on

regulator@115 {
    regulator-name = "vdd-3v3-pcie";
    regulator-boot-on;
    regulator-always-on;   /* ADDED */
};
regulator@116 {
    regulator-name = "vdd-12v-pcie";
    regulator-boot-on;
    regulator-always-on;   /* ADDED */
};

b) PEXCLK pad controller — enabled

pinctrl@3790000 {
    compatible = "nvidia,tegra194-pexclk-padctl";
    status = "okay";   /* was: "disabled" */
};

Note: Even though the node is now okay in the live device tree, there is no tegra194-pexclk-padctl driver in the L4T 5.10 kernel — the device registers but never binds to a driver.

c) nvidia,max-speed — reduced from Gen4 to Gen3

This is our current/latest change, not yet tested:

/* C5 (141a0000), x16 slot */
nvidia,max-speed = <0x03>;   /* was: 0x04 (Gen4 / 16GT/s) */

/* C0 (14180000) */
nvidia,max-speed = <0x03>;   /* was: 0x04 (Gen4 / 16GT/s) */

The key observation: C1 (the only working controller) has nvidia,max-speed = 2 (Gen2), while C5 and C0 both had nvidia,max-speed = 4 (Gen4 / 16GT/s). The ZU47DR supports Gen3 (8GT/s) at most. We believe the Tegra194 PCIe driver pre-configures the PHY equalization settings based on nvidia,max-speed before starting link training, and that a Gen4 PHY configuration is electrically incompatible with a Gen3-only endpoint, so the PHY never achieves link even at the physical layer.
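Whether an edited DTB is really the one in use can be checked by reading the property back from the live device tree after a reboot. The following is a minimal sketch, assuming nvidia,max-speed is a single 32-bit cell as in the snippets above; the helper name dt_u32 is made up.

```bash
# Decode a single-cell (32-bit, big-endian) device tree property file to decimal.
dt_u32() {
    hex=$(od -An -tx1 "$1" | tr -d ' \n')
    printf '%d\n' "0x$hex"
}

# Example on the target: show the speed cap of every PCIe controller node.
#   for f in /proc/device-tree/pcie@*/nvidia,max-speed; do
#       echo "$f: Gen$(dt_u32 "$f")"
#   done
```

Once the endpoint enumerates, the LnkSta line of `sudo lspci -vv` reports the speed and width that were actually negotiated.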

Just curious: on the hardware side, which PCIe controller are you using for this FPGA?

I am not sure why you need to care about multiple controllers at the same time. Is there more than one FPGA connected?

There is only one FPGA. The FPGA is Gen3 x4. I wanted to try everything before posting on this forum. I have now reflashed JetPack 5 for a fresh start, but it still does not detect the FPGA.

Just to make sure we are on the same page.

Please answer just these questions first:

  1. Are you using an NV devkit or a custom board?
  2. Which PCIe controller are you using? There should be just one controller in use for the corresponding hardware. For example, if C1 is in use for the Marvell SATA controller, then it will always be C1. C0/C4/C5 won't ever touch it.

If you don't understand a question, please say so. Do not give an answer you are uncertain about.

I am not certain. I will give you a high-level idea.
I am using a Jetson Xavier and connecting the FPGA directly to its PCIe slot.

No, this kind of answer and photos basically answer nothing…

Everyone in this forum is using a Jetson Xavier.

How did you flash your board there?

The Xavier was flashed to eMMC using NVIDIA SDK Manager 2.4.
The FPGA is flashed using QSFP.

Ok. Then probably that slot is C5.

Please follow these debug tips first so that the controller is not powered down.

Now the PCIe controller is no longer disabled after a restart. However, the link is still not established. I probed the PCIe A13 PERST# pin.
As soon as the Xavier starts, it goes from 0 V to 3 V; then, as soon as the black screen appears, it goes back to 0 V instead of staying high.

Please let me know if you need to see any log.

xavier@ubuntu:~/Desktop$ echo "1" | sudo -S dmesg | grep -iE "xdma|pcie|fpga|xilinx" | tail -60
[sudo] password for xavier: [ 0.465684] FPGA manager framework
[ 4.399526] tegra194-pcie 14180000.pcie: Adding to iommu group 8
[ 4.406772] tegra194-pcie 14180000.pcie: host bridge /pcie@14180000 ranges:
[ 4.407212] tegra194-pcie 14180000.pcie: IO 0x0038100000..0x00381fffff -> 0x0038100000
[ 4.407595] tegra194-pcie 14180000.pcie: MEM 0x1800000000..0x1b3fffffff -> 0x1800000000
[ 4.407890] tegra194-pcie 14180000.pcie: MEM 0x1b40000000..0x1bffffffff -> 0x0040000000
[ 5.517741] tegra194-pcie 14180000.pcie: Phy link never came up
[ 5.518289] tegra194-pcie 14180000.pcie: PCI host bridge to bus 0000:00
[ 5.551624] pcieport 0000:00:00.0: Adding to iommu group 8
[ 5.553653] pcieport 0000:00:00.0: PME: Signaling with IRQ 24
[ 5.555351] pcieport 0000:00:00.0: AER: enabled with IRQ 24
[ 5.563933] tegra194-pcie 14100000.pcie: Adding to iommu group 9
[ 5.569561] tegra194-pcie 14100000.pcie: host bridge /pcie@14100000 ranges:
[ 5.570050] tegra194-pcie 14100000.pcie: IO 0x0030100000..0x00301fffff -> 0x0030100000
[ 5.570424] tegra194-pcie 14100000.pcie: MEM 0x1200000000..0x122fffffff -> 0x1200000000
[ 5.570750] tegra194-pcie 14100000.pcie: MEM 0x1230000000..0x123fffffff -> 0x0040000000
[ 5.676856] tegra194-pcie 14100000.pcie: Link up
[ 5.688053] tegra194-pcie 14100000.pcie: PCI host bridge to bus 0001:00
[ 5.800437] pcieport 0001:00:00.0: Adding to iommu group 9
[ 5.805424] pcieport 0001:00:00.0: PME: Signaling with IRQ 26
[ 5.810965] pcieport 0001:00:00.0: AER: enabled with IRQ 26
[ 5.817341] tegra194-pcie 14140000.pcie: Adding to iommu group 10
[ 5.824806] tegra194-pcie 14140000.pcie: host bridge /pcie@14140000 ranges:
[ 5.829140] tegra194-pcie 14140000.pcie: IO 0x0034100000..0x00341fffff -> 0x0034100000
[ 5.837922] tegra194-pcie 14140000.pcie: MEM 0x1280000000..0x12afffffff -> 0x1280000000
[ 5.846220] tegra194-pcie 14140000.pcie: MEM 0x12b0000000..0x12bfffffff -> 0x0040000000
[ 6.961224] tegra194-pcie 14140000.pcie: Phy link never came up
[ 6.961617] tegra194-pcie 14140000.pcie: PCI host bridge to bus 0003:00
[ 6.979244] pcieport 0003:00:00.0: Adding to iommu group 10
[ 6.980154] pcieport 0003:00:00.0: PME: Signaling with IRQ 28
[ 6.981402] pcieport 0003:00:00.0: AER: enabled with IRQ 28
[ 6.985608] tegra194-pcie 141a0000.pcie: Adding to iommu group 11
[ 6.986830] tegra194-pcie 141a0000.pcie: Failed to get slot regulators: -517
[ 7.655758] tegra194-pcie 141a0000.pcie: Failed to get slot regulators: -517
[ 8.078177] tegra194-pcie 141a0000.pcie: host bridge /pcie@141a0000 ranges:
[ 8.079354] tegra194-pcie 141a0000.pcie: IO 0x003a100000..0x003a1fffff -> 0x003a100000
[ 8.087698] tegra194-pcie 141a0000.pcie: MEM 0x1c00000000..0x1f3fffffff -> 0x1c00000000
[ 8.096063] tegra194-pcie 141a0000.pcie: MEM 0x1f40000000..0x1fffffffff -> 0x0040000000
[ 9.208972] tegra194-pcie 141a0000.pcie: Phy link never came up
[ 9.209739] tegra194-pcie 141a0000.pcie: PCI host bridge to bus 0005:00
[ 9.230423] pcieport 0005:00:00.0: Adding to iommu group 11
[ 9.231410] pcieport 0005:00:00.0: PME: Signaling with IRQ 30
[ 9.232324] pcieport 0005:00:00.0: AER: enabled with IRQ 30
[ 11.155202] xdma: loading out-of-tree module taints kernel.
[ 11.156245] xdma: module verification failed: signature and/or required key missing - tainting kernel
[ 11.158939] xdma:xdma_mod_init: Xilinx XDMA Reference Driver xdma v2020.2.2
[ 11.159590] xdma:xdma_mod_init: desc_blen_max: 0xfffffff/268435455, timeout: h2c 10 c2h 10 sec.

Does your black screen mean after the boot logo?

Yes Sir !! Boot screen

I am pretty sure the issue is related to this PCIe reset. The Xavier supplies power to the FPGA continuously, but as soon as the boot screen appears, the PCIe reset pin (A11) goes to 3 V and then back to ~0 V. It needs to stay at 3 V, otherwise the FPGA stays in reset, since the reset is active low.

The issue is resolved, thank you for your help. It was the C5 controller that was causing the problem. I followed the steps below and now the Xavier detects the Xilinx FPGA over PCIe:
ODMDATA 0x69190000: correct bits 29:30 for NVHS UPHY PCIe C5

  1. Device Tree: disable-power-down on all controllers, enable C5, regulators always-on

  2. Flash command

  3. PCIe rescan: manual + systemd service for automatic boot-time rescan

  4. XDMA driver auto-load
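For step 3, a manual rescan goes through the standard sysfs file /sys/bus/pci/rescan, and the result can be checked by looking for the Xilinx vendor ID (0x10ee). The following is a minimal sketch; the function name xilinx_devices and the PCI_SYSFS override are made up for illustration and are not taken from the poster's setup.

```bash
# Root of the PCI sysfs tree; overridable only so the function can be tried without hardware.
PCI_SYSFS=${PCI_SYSFS:-/sys/bus/pci}

# List enumerated PCI functions whose vendor ID is 0x10ee (Xilinx),
# one per line as "<domain:bus:dev.fn> <vendor>:<device>".
xilinx_devices() {
    for dev in "$PCI_SYSFS"/devices/*; do
        [ -r "$dev/vendor" ] || continue
        vendor=$(cat "$dev/vendor")
        [ "$vendor" = "0x10ee" ] || continue
        echo "$(basename "$dev") $vendor:$(cat "$dev/device")"
    done
}

# Manual rescan, then list (run as root on the target):
#   echo 1 > /sys/bus/pci/rescan
#   xilinx_devices
```

For step 4, one common mechanism is a file under /etc/modules-load.d/ that contains the module name; the post does not say which method was actually used.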

I am not sure why you need to mention ODMDATA here; the C5 ODMDATA setting should have been correct from the beginning.

I downloaded fresh files from the Jetson Linux page on NVIDIA Developer.

In the file p2972-0000.conf.common, the value was: ODMDATA=0x9190000;

It failed to boot with the error below:
FATAL ERROR [FILE=platform/drivers/uphy/uphy-tegra194.c, ERR_UID=475]: nvhs_uphy_pll_init failed
Bootstrap@0x501d1444 sp 0x50183e80 stack: 501820cc - 50183ff

After modifying the value to: ODMDATA=0x69190000;

The issue was resolved. I just added what I did so that if someone stumbles upon a similar situation in the future, they can follow it. I don't want to upset you; maybe I am wrong. I am just sharing what I did. Good day, Mr. Wayne :)
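The two ODMDATA values quoted above differ only in the bits mentioned earlier in the thread, which a quick XOR confirms. This is just an arithmetic check with a made-up helper name (odmdata_changed_bits); it says nothing about what those bits select.

```bash
# Print the positions of the bits that differ between two 32-bit values.
odmdata_changed_bits() {
    diff=$(( $1 ^ $2 ))
    bit=0
    out=""
    while [ "$diff" -ne 0 ]; do
        if [ $(( diff & 1 )) -eq 1 ]; then
            out="$out $bit"
        fi
        diff=$(( diff >> 1 ))
        bit=$(( bit + 1 ))
    done
    echo "${out# }"
}

# odmdata_changed_bits 0x9190000 0x69190000   # prints: 29 30
```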

Who is the exact vendor of the device you are using?

Changing ODMDATA here does not seem reasonable. This ODMDATA change would stop some other USB/PCIe devices from working.

Actually there is no support for Xavier to change ODMDATA…

My FPGA device is a Xilinx ZU47DR; it has Gen3 x4 PCIe.

For other people, pls try without changing ODMDATA hehe !!
I did what i must do, now my PCIE is working like butter.

No, I mean who is the vendor of the Jetson Xavier board…

If an ODMDATA change is required here, then it might not be an NV devkit.

Just to confirm: if nothing is connected (no FPGA) on your board, then even after flashing it with SDK Manager it won't boot up, and hits the same bpmp error as you posted?

I am attaching a picture; you can check that the vendor is NVIDIA.

Anyway, in order for the Xavier to detect the FPGA at boot, I added the script below.

```bash
sudo tee /etc/systemd/system/pcie-c5-fpga-reset.service << 'EOF'
[Unit]
Description=PCIe C5 FPGA power-cycle and rescan
After=multi-user.target

[Service]
Type=oneshot
ExecStart=/usr/local/bin/pcie-c5-fpga-reset.sh
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl enable pcie-c5-fpga-reset.service
```
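The script that the unit runs, /usr/local/bin/pcie-c5-fpga-reset.sh, is not shown in the thread. Purely as an illustration of the rescan part, it could look roughly like the sketch below. Everything in it is an assumption: the function names, the retry count and the delay are invented, and it does not attempt the power-cycle that the unit description mentions.

```bash
#!/bin/sh
# Hypothetical stand-in for /usr/local/bin/pcie-c5-fpga-reset.sh (not the poster's script).
# PCI_SYSFS, RETRIES and DELAY are overridable only to allow a dry run without hardware.
PCI_SYSFS=${PCI_SYSFS:-/sys/bus/pci}
RETRIES=${RETRIES:-10}
DELAY=${DELAY:-1}

# True if any enumerated PCI function reports vendor ID 0x10ee (Xilinx).
fpga_present() {
    grep -qsx 0x10ee "$PCI_SYSFS"/devices/*/vendor
}

# Trigger a bus rescan up to RETRIES times, DELAY seconds apart,
# stopping as soon as a Xilinx function shows up.
wait_for_fpga() {
    i=0
    while [ "$i" -lt "$RETRIES" ]; do
        echo 1 > "$PCI_SYSFS/rescan"
        if fpga_present; then
            return 0
        fi
        i=$((i + 1))
        sleep "$DELAY"
    done
    return 1
}

# As the service script, the last line would be something like:
#   wait_for_fpga && modprobe xdma
```

Retrying is a design choice here: if the FPGA finishes configuring later than the Xavier boots, a single rescan may still come too early.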