AGX Xavier: No PCIe link

Hi

We created a custom carrier board based on the devkit carrier board (all I/O is the same, nothing special of any kind).
After flashing it with SDK Manager (1.9.3.10904) and JetPack 5.1 (rev1), we lost the PCIe connection.
‘lspci’ shows no root complex of any kind.
An FPGA is connected to the other side of the PCIe bus, but I would expect at least a PCIe root complex to show up in Linux…

Please see attached dmesg-log.
dmesg.log (75.8 KB)

Hi,

It is common for an FPGA to need extra delay before it gets detected. Please follow this page.

https://docs.nvidia.com/jetson/archives/r35.4.1/DeveloperGuide/text/HR/JetsonModuleAdaptationAndBringUp/JetsonAgxOrinSeries.html?highlight=pcie#debug-pcie-link-up-failure

Also add “nvidia,disable-power-down” to your PCIe controller’s device tree node.

This will bring the PCIe RP back. You can then try a rescan or bind/unbind to see whether your FPGA gets detected.
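The rescan step can be scripted. What follows is only a minimal sketch, assuming the generic Linux sysfs interface where writing `1` to `/sys/bus/pci/rescan` triggers a rescan of all PCI buses. The function names and the `sysfs_root` parameter are illustrative and not part of any NVIDIA tool, and the script has to run as root on the target.

```python
from pathlib import Path
from typing import List


def rescan_pci(sysfs_root: str = "/sys") -> Path:
    """Ask the kernel to rescan all PCI buses.

    Writing "1" to <sysfs_root>/bus/pci/rescan is the generic Linux
    mechanism. sysfs_root is a parameter only so the function can be
    pointed at a scratch directory for a dry run (assumption: hypothetical
    helper, not an NVIDIA API).
    """
    rescan_file = Path(sysfs_root) / "bus" / "pci" / "rescan"
    if not rescan_file.exists():
        raise FileNotFoundError(f"{rescan_file} not found")
    rescan_file.write_text("1")
    return rescan_file


def list_pci_devices(sysfs_root: str = "/sys") -> List[str]:
    """Return the PCI addresses the kernel currently knows about."""
    devices_dir = Path(sysfs_root) / "bus" / "pci" / "devices"
    if not devices_dir.is_dir():
        return []
    return sorted(entry.name for entry in devices_dir.iterdir())
```

Calling `list_pci_devices()` before and after `rescan_pci()` shows whether the rescan made any new device appear. If the root port itself is missing, a rescan alone cannot bring it back.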

The FPGA is configured even before the NVIDIA module starts.
Our power sequencer makes sure of that.

Then you still need to follow the document and check the items there.

This is the basic debug process. Whether your FPGA is ready or not, the Jetson-side checkpoints are the same.

We know for certain the FPGA is up and running, and we know it acts as an endpoint, since it is the same FPGA with the same configuration that we used before on a PCIe extension card with the same hardware.
The only difference here is the devkit vs module.

Hi,

Sorry, I don’t get the point you are trying to make in this comment.

“The only difference here is the devkit vs module.”

Could you elaborate on it?
You only tell us a lot of “we know that”, “already know that”, etc. So what exactly is the question, if you already know all this?

Are you using PCIe C5 as the root port, or something else? Is your FPGA detected on the devkit but not on your custom board?

What could be the reason that something works on the devkit but not with the module?
There is no real documentation of what the differences are, apart from the schematic…

All hardware is based on the reference schematic of the NVIDIA carrier board. We do not use any “special” pin configuration. The FPGA configuration is the same, with the same hardware and PCIe design.
On the devkit everything works.

Using a scope we see a clock for a short time, and then it is gone. We use a non-customized JetPack as the reference for this issue, simply because of the lack of documentation.

Hi,

I guess you are talking about “working on devkit but not working on custom board”, right?

Your term “not with the module” is a mistake, because the Xavier devkit also uses the Xavier module…

The module is not a variable here. Your base board is the variable…

I am not a hardware guy, so I can only suggest software methods to check.

Also, I don’t know you or your board, and I don’t know what tests you’ve done. You should at least tell me which PCIe controller you are using.

We’ve heard the comment “my board is the same as the devkit” from many other users before, but it usually turns out that a minor difference leads to the problem…

You could first compare the dmesg output for the specific controller between the devkit and the custom board, and see whether there is any difference.
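The comparison can be done with a small script. This is a rough sketch, assuming both boards' `dmesg` output has been saved to text files. It keeps only the lines that mention a given controller string, strips the kernel timestamps (which differ between boots) and prints a unified diff. The function names are illustrative, and the controller string to filter on (for example the address of the C5 node in your device tree) is something you need to look up for your setup.

```python
import difflib
import re
from typing import List

# Leading "[   12.345678] " kernel timestamp, which differs between boots.
_TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def controller_lines(dmesg_text: str, controller: str) -> List[str]:
    """Return dmesg lines mentioning `controller`, with timestamps stripped."""
    return [
        _TIMESTAMP.sub("", line)
        for line in dmesg_text.splitlines()
        if controller in line
    ]


def diff_controller_logs(devkit_log: str, custom_log: str, controller: str) -> List[str]:
    """Unified diff of the controller-specific lines from two dmesg captures."""
    return list(
        difflib.unified_diff(
            controller_lines(devkit_log, controller),
            controller_lines(custom_log, controller),
            fromfile="devkit",
            tofile="custom",
            lineterm="",
        )
    )
```

An empty result means the controller-specific messages are identical on both boards. Any `-`/`+` lines show where the custom board's bring-up diverges from the devkit's.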

Also, I remember that you asked a similar question before.

Did you remember to apply this patch to your board?

That was on a customized JetPack version.
It did solve the issue then (and certainly will again when we go the custom way).

As for the carrier board, we added all needed signals and lanes for PCIe (with the exception of the x16 and JTAG/SPI signals). We use an x8 configuration.
At the moment two hardware engineers are looking into it here.

They suspect that certain pins are used for signals by the NVIDIA carrier board, while we did not connect them because they are not needed according to the PCIe standard (e.g. GPIO18, I heard them say).

So are you going to share any useful info here to clarify?

I really don’t get what you want us to help with. If you suspect a hardware problem on your custom board, then at least share your schematic.

You heard GPIO18 from them? Why not just share the schematic here so that we can figure it out directly, rather than rely on second-hand info from someone I don’t even know?

Well, we cannot just share schematics, because this forum is public.
Our supplier/distributor only directs us to this forum for any help
(although they always promise to look into it, and then it stays silent for weeks…).

And in all fairness you guys provide support more quickly and overall better.

The PEX_L5_CLKREQ_N signal is connected to GND on our board, while on the NVIDIA carrier it isn’t.
Also, on the NVIDIA carrier board there are other PCIe devices on the bus which keep the bus up, while on our carrier board the only PCIe device is our FPGA.
The FPGA is up and running before the NVIDIA module even boots. We wait for the FPGA to configure before setting XAV_PERIPH_RESET.

“Also on the nVidia carrier board there are other PCIe devices on the bus which keep the bus up.”

Actually, there are no other PCIe devices on the bus on the devkit.

root@tegra-ubuntu:/home/nvidia# lspci
0001:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad2 (rev a1)
0001:01:00.0 SATA controller: Marvell Technology Group Ltd. Device 9171 (rev 13)
root@tegra-ubuntu:/home/nvidia# cat /etc/nv_boot_control.conf
TNSPEC 2888-400-0004-N.0-1-2-jetson-xavier-

This is the result from the devkit. There are several PCIe controllers on Jetson, and they are all independent.

The 0001:xxxxx devices are on C1, which is not the same as the x8 slot you are using.
Your problem is on C5. As C5 and C1 are not related to each other, the idea that devices on one controller keep another controller’s bus up is wrong. C5 is not enabled even on the NVIDIA devkit if there is no device connected.
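To see quickly which controllers have enumerated, the `lspci` output can be grouped by PCI domain. This is a minimal sketch. The only mapping taken from this thread is that domain `0001` belongs to C1; treating the domain number as the controller number for the other controllers (e.g. `0005` for C5) is an assumption you should verify on your own system. The function name is illustrative.

```python
import re
from collections import defaultdict
from typing import Dict, List

# "DDDD:BB:dd.f description" as printed by lspci on multi-domain systems.
_LSPCI_LINE = re.compile(
    r"^(?P<domain>[0-9a-fA-F]{4}):(?P<bus>[0-9a-fA-F]{2}):"
    r"(?P<dev>[0-9a-fA-F]{2})\.(?P<fn>[0-7])\s+(?P<desc>.*)$"
)


def group_by_domain(lspci_output: str) -> Dict[str, List[str]]:
    """Map each PCI domain to the device descriptions found in it.

    Lines that are not lspci device lines (shell prompts, blank lines)
    are ignored.
    """
    groups = defaultdict(list)
    for line in lspci_output.splitlines():
        match = _LSPCI_LINE.match(line.strip())
        if match:
            groups[match.group("domain")].append(match.group("desc"))
    return dict(groups)
```

Fed the devkit output above, this returns a single key, `0001`, holding the bridge and the SATA controller. A domain that is missing from the result has no enumerated root port or devices.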

The pin should be left unconnected if it is not used.
Please share the PCIe part of the schematic; otherwise it is hard to debug any schematic issue.
