NVIDIA Jetson AGX Xavier in Endpoint Mode turns off after booting when connected by PCI to a host PC

ablanquezp · March 24, 2025, 11:09am

Hi,

We are very new to this technology and we are trying to learn. We have an NVIDIA Jetson AGX Xavier flashed in Endpoint Mode, as explained in the "Jetson AGX Xavier PCIe Endpoint Mode" page of the Jetson Linux Developer Guide 34.1 documentation, and we are trying to connect it to an x64 PC acting as the Root Complex.

We are using NVIDIA Jetson Linux 35.6.0 in the Jetson and Ubuntu 18.04 in the PC.

The Jetson board works when it is not connected to the PC via PCIe, but it turns off just a few seconds after booting when it is connected. Sometimes there is not even time to log in. We are using the ADT-Link PCI Express 3.0 x4 Jumper Cable R22NS (PCIe x4 Jumpers Extension Cable).

As a side note, we have to turn on the PC, but we don’t let its OS boot until we turn on the Jetson. Otherwise, the Jetson does not even turn on. The Jetson is always powered externally by the power supply included in the kit.

Any clue is welcome,
Thank you.

WayneWWW · March 25, 2025, 4:15am

Sorry, I don’t quite understand the situation.

Are you saying that your Jetson is not able to boot up if there is a PCIe connection to the host PC?

ablanquezp · March 25, 2025, 8:10am

Yes, that is the case. There are three different situations:

The PC and the Jetson are not connected via PCIe. In this case, both work as expected.
The Jetson is connected to the host PC via PCIe and the PC is not turned on. In this case, the Jetson is not able to boot up. We see the power LED turn on for only a few seconds. We see some movement in the PC’s fans, so we think the Jetson is trying to turn on the PC.
The Jetson is connected to the host PC via PCIe and the PC is turned on. In this case, the Jetson is able to boot up, but it turns off a few seconds after the Linux user login. We don’t see anything in journalctl or in any other log we have checked.

Thank you for your answer.

WayneWWW · March 25, 2025, 8:12am

OK. So none of these cases is a “reboot”? They are all power-downs?

ablanquezp · March 25, 2025, 8:14am

Correct. We power off both the PC and the Jetson and disconnect the PCIe cable between each of our attempts to make it work.

sgursal · March 25, 2025, 9:55pm

Also, for further guidance, please refer to the Jetson AGX Xavier Series PCIe Endpoint Design Guidelines Application Note and the Jetson AGX Xavier PCIe Endpoint Software document. Both are available by searching the Jetson Download Center (Jetson Download Center | NVIDIA Developer).

ablanquezp · March 25, 2025, 10:13pm

We have checked both documents. We cannot go beyond step 1.4 PCI Endpoint Software due to the issue we are describing.

Regarding the Design Guidelines, the cable we linked above should be doing all the Tx-Rx connections. The only aspect we are not sure about is this section:

“The mux should be set to select PEX_CLK5_N/P if the Jetson AGX Xavier will be the Root Port or NVHS_SLVS_REFCLK_P/N if it will be the Endpoint”

Could a misconfiguration of that mux cause our issue? Could you provide further guidance on how to make sure the mux selects NVHS_SLVS_REFCLK_P/N? Are we missing something about the connection or the mux configuration?

sgursal · March 25, 2025, 11:06pm

Refer to the Xavier Developer Kit carrier board schematics. The PCIe clock mux is controlled through GPIO6_PEX_REFCLK_SEL, and the mux truth table is in the schematics, which are available from the Jetson Download Center (Jetson Download Center | NVIDIA Developer).

ablanquezp · April 4, 2025, 5:13pm

We selected NVHS_SLVS_REFCLK_P through the following process, and we still have the same issue.

First, we generated dts files from the NVIDIA Jetson Xavier Pinmux spreadsheet, changing the Req Initial State of the row with description “PEX_REFCLK_SEL” from Drive 0 to Drive 1.
Then, we generated a config file by using the dts files as input for the Python script in the kernel pinmux folder.
After that, we moved the generated config file to the BCT folder in the bootloader repository.
Finally, we flashed the Jetson by using the flash script.

The behavior is the same: the Jetson still turns off when it is connected to the PC.

Are we missing something?

WayneWWW · April 7, 2025, 3:57pm

Actually, you don’t need to do that. Our default software already handles everything: you should only need to update the ODMDATA, and the software side will handle the pinmux and dtb by itself.
That is, unless you are not using our default dtb at all.

ablanquezp · April 7, 2025, 9:03pm

We are following the instructions detailed in the "Jetson AGX Xavier PCIe Endpoint Mode" page of the Jetson Linux Developer Guide 34.1 documentation, for a Jetson AGX Xavier with NVIDIA Jetson Linux 35.6.0.

All the software we are using has been downloaded from the NVIDIA website and has not been modified.

WayneWWW · April 8, 2025, 12:04pm

Just to clarify, the document you follow should be this one.

If there is no error in dmesg, then the software is probably fine.

ablanquezp · April 10, 2025, 3:26am

We have rolled back all changes and only modified jetson-xavier.conf to specify ODMDATA as the document states. We get the same behavior.

I attach the output of dmesg without connecting the Jetson to the PC (the Jetson is turned off before I can open a terminal when it is connected to the PC via PCIe).
dmesg.txt (79.9 KB)

I see the following lines in the output of dmesg, which may indicate a problem:
[ 4.425550] tegra194-pcie 141a0000.pcie_ep: Adding to iommu group 8
[ 4.427781] tegra194-pcie 141a0000.pcie_ep: Failed to get PERST GPIO: -517
[ 4.427796] tegra194-pcie 141a0000.pcie_ep: Failed to parse device tree: -517

Is that expected when following these instructions with all the default software? Should we try an older version of Jetson Linux?

WayneWWW · April 10, 2025, 6:17am

Hi,

This error is actually weird. Could you check if your device tree has “reset-gpios” under pcie_ep@141a0000? I read the default one and it is indeed there.

ablanquezp · April 12, 2025, 9:33pm

Yes, it is there. The content is the following:

reset-gpios = <0x0b 0xd9 0x01>;

I can provide the device tree if needed, we have not made any change on it.

WayneWWW · April 13, 2025, 3:17am

Then could you go to your driver and check why this line cannot get the reset gpios?

ablanquezp · April 22, 2025, 6:14pm

We have not implemented any driver. What driver are you referring to?

WayneWWW · April 23, 2025, 2:54am

The “141a0000.pcie_ep” above is the driver we provide, but we don’t see any such print on our side when we test on the devkit. Also, you mentioned that “reset-gpios” is present in your device tree.

If the device tree has it, then such an error print should never happen. That is why I asked you to go to that driver and check why it reports this error even though the GPIO node is present.

The driver is kernel/kernel-5.10/drivers/pci/controller/dwc/pcie-tegra194.c

ablanquezp · April 23, 2025, 1:42pm

I see the following code inside the file you pointed to, in the function tegra_pcie_dw_parse_dt, which is the only place where that error message is present:

pcie->pex_rst_gpiod = devm_gpiod_get(pcie->dev, "reset", GPIOD_IN);
if (IS_ERR(pcie->pex_rst_gpiod)) {
    int err = PTR_ERR(pcie->pex_rst_gpiod);
    const char *level = KERN_ERR;

    if (err == -EPROBE_DEFER)
        level = KERN_DEBUG;

    dev_printk(level, pcie->dev,
            dev_fmt("Failed to get PERST GPIO: %d\n"),
            err);
    return err;
}

The function is called from the probe function in the platform_driver struct tegra_pcie_dw_driver. The error message printed in case of failure confirms this is the code we are looking for:

ret = tegra_pcie_dw_parse_dt(pcie);
if (ret < 0) {
    const char *level = KERN_ERR;

    if (ret == -EPROBE_DEFER)
        level = KERN_DEBUG;

    dev_printk(level, dev,
            dev_fmt("Failed to parse device tree: %d\n"),
            ret);
}

Checking devm_gpiod_get_index, the error can only come from gpiod_get_index. It is not easy for us to see where the issue comes from in gpiod_get_index without its dev_dbg logs. We are using everything by default and have not modified the kernel. Should we check any specific code? Should we enable debug logging? In that case, how?

WayneWWW · April 23, 2025, 3:20pm

Hi,

I just checked this again, but it still does not sound reasonable to me.

The error number here is -517, which means “EPROBE_DEFER”. So this error occurs because the PCIe driver is probed too early, before the GPIO driver.

If you read your dmesg, you will notice that the GPIO driver starts later than the PCIe EP driver.

However, when the error is EPROBE_DEFER, the log level should be set to KERN_DEBUG, and dev_printk should not print it unless you have changed the log level of your dmesg.

Also, if this is EPROBE_DEFER, then the kernel should probe the PCIe driver again later. However, that does not seem to be happening on your side.
