TX2 PCIe does not detect endpoint on ConnectTech carrier board

Hi,

A little context:
I created a Yocto image using a TX2 module on the Jetson reference board. On this board I’ve installed a PCIe Camera Link frame grabber (EB1) from EPIX. The image is finished and everything works, so I’m happy.

Now I want to move to an Astro carrier from ConnectTech. The frame grabber is an EB1-mini, which uses the same drivers, so nothing changes there.

In my build system, I merged the kernel sources provided by Connect Tech, created a new Yocto machine (astro-tx2) and set the device tree to astro-revG+. My image builds and everything works, except that the grabber is not detected.

lspci lists only the second network adapter, which is connected on port 0. The grabber card, connected on port 2, does not appear:

00:01.0 PCI bridge: NVIDIA Corporation Device 10e5 (rev a1)
01:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection

The second bridge is missing. I should see this:

00:01.0 PCI bridge: NVIDIA Corporation Device 10e5 (rev a1)
00:03.0 PCI bridge: NVIDIA Corporation Device 10e6 (rev a1)
01:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
02:00.0 Unassigned class [ff00]: Epix Inc Device eb21 (rev 01)
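The comparison between the actual and the expected listing can be scripted. The following is only a sketch: missing_pci_devices is a hypothetical helper (it is not part of pciutils) that takes an lspci listing and a set of expected bus addresses and prints the addresses that are absent. The addresses are the ones from the listings above; bus numbering may differ on other setups.

```shell
# Hypothetical helper: print every expected PCI address that does not
# appear in the first column of an lspci listing.
# Usage: missing_pci_devices "<lspci output>" <bus:dev.fn>...
missing_pci_devices() {
    listing=$1
    shift
    for bdf in "$@"; do
        # Exact match on the first column; awk exits non-zero when absent.
        printf '%s\n' "$listing" |
            awk -v want="$bdf" '$1 == want { found = 1 } END { exit !found }' ||
            printf '%s\n' "$bdf"
    done
}
```

On the target this could be called as: missing_pci_devices "$(lspci)" 00:01.0 00:03.0 01:00.0 02:00.0. With the failing listing above it would report 00:03.0 and 02:00.0.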

EPIX informed me that the grabber takes roughly 200 ms to be ready to communicate with the PCIe bus controller. Suspecting a timing issue, I tried a few things.

First, I set nvidia,boot-detect-delay in the device tree:

pcie-controller@10003000 {
    status = "okay";
    nvidia,boot-detect-delay = <1000>;
    pci@1,0 {
        nvidia,num-lanes = <2>;
        status = "okay";

    };
    pci@2,0 {
        nvidia,num-lanes = <1>;
        status = "disabled";
    };
    pci@3,0 {
        nvidia,num-lanes = <1>;
        status = "okay";
    };
};

Looking at the dmesg timestamps, and after also trying a 10 s delay, I can see that the delay is applied, but it makes no difference.
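To double-check that the running system really uses the edited device tree, the live tree exported under /sys/firmware/devicetree can be read back. The sketch below is a hypothetical helper, assuming the controller node sits directly under the tree root at pcie-controller@10003000 (as the gpio@2200000 node does later in this post). It prints the status of each pci@ port; a node without a status property is treated as enabled, which is the usual device tree convention.

```shell
# Hypothetical helper: print the "status" property of each PCIe root port
# node found in the live device tree.
# Usage: dt_pcie_port_status [controller-node-directory]
dt_pcie_port_status() {
    # Default path is an assumption based on the node name used in the DTS.
    ctrl=${1:-/sys/firmware/devicetree/base/pcie-controller@10003000}
    for port in "$ctrl"/pci@*; do
        [ -d "$port" ] || continue
        if [ -f "$port/status" ]; then
            # Device tree strings are NUL-terminated; strip the terminator.
            status=$(tr -d '\000' < "$port/status")
        else
            status="okay (no status property)"
        fi
        printf '%s: %s\n' "${port##*/}" "$status"
    done
}
```

With the device tree fragment above, the expected result would be okay for pci@1,0 and pci@3,0 and disabled for pci@2,0.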

I attempted to rescan the devices, but this had no effect:

echo 1 > /sys/bus/pci/rescan

The default Connect Tech kernel configuration builds pci-tegra into the kernel. I reconfigured it to build as a loadable module, which I then load once the device has booted:

cd /lib/modules/4.4.38-l4t-r28.2+g374c531/kernel/drivers/pci/host
insmod pci-tegra.ko

From the console output I can see that the detection still fails:

root@tegra-tx2:/lib/modules/4.4.38-l4t-r28.2+g374c531/kernel/drivers/pci/host# dmesg | grep pci
[    1.172137] GPIO line 459 (pcie-lane2-mux) hogged as output/low
[    1.232689] iommu: Adding device 10003000.pcie-controller to group 50
[  122.580668] tegra-pcie 10003000.pcie-controller: 2x1, 1x1, 1x1 configuration
[  122.886944] tegra-pcie 10003000.pcie-controller: PCIE: Enable power rails
[  122.893748] tegra-pcie 10003000.pcie-controller: PCIE: Waiting for 1000ms
[  123.902604] tegra-pcie 10003000.pcie-controller: probing port 0, using 2 lanes
[  123.911952] tegra-pcie 10003000.pcie-controller: probing port 2, using 1 lanes
[  125.946310] tegra-pcie 10003000.pcie-controller: link 2 down, retrying
[  127.954220] tegra-pcie 10003000.pcie-controller: link 2 down, retrying
[  129.962369] tegra-pcie 10003000.pcie-controller: link 2 down, retrying
[  129.970211] tegra-pcie 10003000.pcie-controller: link 2 down, ignoring

You can see the 3 retries on link 2 followed by an ignore.
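When repeating this test across several kernel builds, it can help to reduce the log to one line per failed link. The sketch below is a hypothetical helper that assumes the message format shown above ("link N down, retrying" and "link N down, ignoring"); other driver versions may word these messages differently.

```shell
# Hypothetical helper: summarise tegra-pcie link failures from kernel log
# text on stdin. Prints one line per link that the driver gave up on.
pcie_link_summary() {
    awk '
        /tegra-pcie/ && / down, (retrying|ignoring)/ {
            # The link number is the field that follows the word "link".
            for (i = 1; i < NF; i++)
                if ($i == "link") { port = $(i + 1); break }
            if ($NF == "retrying")
                retries[port]++
            else
                gave_up[port] = 1
        }
        END {
            for (port in gave_up)
                printf "link %s: %d retries, then ignored\n", port, retries[port]
        }'
}
```

On the target it would be used as: dmesg | pcie_link_summary. Links that come up, such as port 0 here, produce no "down" message and are therefore not listed.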

I made sure the pcie-lane2-mux status is okay:

cd /sys/firmware/devicetree/base/gpio@2200000/pcie0_lane2_mux
cat status
okay

Then I started tweaking pci-tegra.

I increased TEGRA_PCIE_LINKUP_TIMEOUT to 1000ms.

I also tried adding a 1000ms sleep after the power rails are enabled in tegra_pcie_enable_regulators:

static int tegra_pcie_enable_regulators(struct tegra_pcie *pcie)
{
	int i;
	PR_FUNC_LINE;
	if (pcie->power_rails_enabled)
		return 0;

	pcie->power_rails_enabled = 1;
	dev_info(pcie->dev, "PCIE: Enable power rails\n");

	for (i = 0; i < pcie->soc_data->num_pcie_regulators; i++) {
		if (pcie->pcie_regulators[i])
			if (regulator_enable(pcie->pcie_regulators[i]))
				dev_err(pcie->dev, "%s: can't enable regulator %s\n",
				__func__,
				pcie->soc_data->pcie_regulator_names[i]);
	}

	/* Added for testing: extra delay after the power rails are enabled. */
	dev_info(pcie->dev, "PCIE: Waiting for 1000ms\n");
	msleep(1000);

	return 0;
}

Since all these modifications did not help, I decided to install JetPack’s Ubuntu. There the grabber card is detected properly.

So I removed all the modifications I had made in my build system, rebuilt my kernel and device tree, and flashed only those, keeping the Ubuntu system image. There the grabber card is also detected properly.

I then reflashed my custom image and overwrote its kernel with the prebuilt kernel from Connect Tech’s BSP, which works with Ubuntu. There the grabber card detection fails.

Finally I compared the kernel command lines.

My custom image has

console=ttyS0,115200 memtype=0 video=tegrafb no_console_suspend=1 earlycon=uart8250,mmio32,0x03100000 nvdumper_reserved=0x2772e0000 gpt tegraid=18.1.2.0.0 tegra_keep_boot_clocks maxcpus=6 boot.slot_suffix= boot.ratchetvalues=0.2.1 androidboot.serialno=0421118034785 bl_prof_dataptr=0x10000@0x277040000 sdhci_tegra.en_boot_part_access=1 root=/dev/mmcblk0p1 rw rootwait vmalloc=256M cma=128M coherent-pool=96M

JetPack’s Ubuntu has:

root=/dev/mmcblk0p1 rw rootwait console=ttyS0,115200n8 console=tty0 OS=l4t fbcon=map:0 net.ifnames=0 memtype=0 video=tegrafb no_console_suspend=1 earlycon=uart8250,mmio32,0x03100000 nvdumper_reserved=0x2772e0000 gpt tegraid=18.1.2.0.0 tegra_keep_boot_clocks maxcpus=6 androidboot.serialno=0421118034785 bl_prof_dataptr=0x10000@0x277240000 sdhci_tegra.en_boot_part_access=1 root=/dev/mmcblk0p1 rw rootwait rootfstype=ext4

After re-ordering for comparison it looks like this. The first line is my custom image, the second line is JetPack’s Ubuntu:

console=ttyS0,115200   memtype=0 video=tegrafb no_console_suspend=1 earlycon=uart8250,mmio32,0x03100000 nvdumper_reserved=0x2772e0000 gpt tegraid=18.1.2.0.0 tegra_keep_boot_clocks maxcpus=6 boot.slot_suffix= boot.ratchetvalues=0.2.1 androidboot.serialno=0421118034785 bl_prof_dataptr=0x10000@0x277040000 sdhci_tegra.en_boot_part_access=1 root=/dev/mmcblk0p1 rw rootwait vmalloc=256M cma=128M coherent-pool=96M
console=ttyS0,115200n8 memtype=0 video=tegrafb no_console_suspend=1 earlycon=uart8250,mmio32,0x03100000 nvdumper_reserved=0x2772e0000 gpt tegraid=18.1.2.0.0 tegra_keep_boot_clocks maxcpus=6                                            androidboot.serialno=0421118034785 bl_prof_dataptr=0x10000@0x277240000 sdhci_tegra.en_boot_part_access=1 root=/dev/mmcblk0p1 rw rootwait                                         rootfstype=ext4 console=tty0 OS=l4t fbcon=map:0 net.ifnames=0
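The manual re-ordering above can also be done mechanically. The following is a minimal sketch, not an existing tool: cmdline_diff is a hypothetical helper that splits two kernel command lines into tokens and prints the tokens found in only one of them. Note that it compares whole tokens, so two parameters with the same key but different values show up on both sides.

```shell
# Hypothetical helper: list kernel command-line tokens that appear in only
# one of two command lines.
# Usage: cmdline_diff "<cmdline A>" "<cmdline B>"
cmdline_diff() {
    a_file=$(mktemp) || return 1
    b_file=$(mktemp) || { rm -f "$a_file"; return 1; }
    # One token per line, sorted and de-duplicated, as comm(1) requires.
    printf '%s\n' "$1" | tr -s ' ' '\n' | sed '/^$/d' | LC_ALL=C sort -u > "$a_file"
    printf '%s\n' "$2" | tr -s ' ' '\n' | sed '/^$/d' | LC_ALL=C sort -u > "$b_file"
    LC_ALL=C comm -23 "$a_file" "$b_file" | sed 's/^/only in A: /'
    LC_ALL=C comm -13 "$a_file" "$b_file" | sed 's/^/only in B: /'
    rm -f "$a_file" "$b_file"
}
```

On the target, the running command line is available in /proc/cmdline, so a call could look like: cmdline_diff "$(cat /proc/cmdline)" "$(cat ubuntu-cmdline.txt)", where ubuntu-cmdline.txt is an assumed file holding the command line saved from the other system.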

The vmalloc=256M cma=128M coherent-pool=96M part is something I need for the grabber card. I did try removing it, but that had no effect.

My command line also has an addition that I don’t understand: boot.slot_suffix= boot.ratchetvalues=0.2.1. Could someone explain to me what these are for?

I don’t think the console, OS, frame buffer and networking settings have anything to do with PCIe.

I’m completely out of ideas. Essentially the same kernel image and device tree blob work with Ubuntu but not with my custom Linux distribution. How can the distro affect the way PCIe devices are detected?

Thanks,
-Damien

Hi Damien, official support is provided for the reference Ubuntu/JetPack distro; there can be many differences in others like Yocto.

Have you tried contacting ConnectTech Support to see what they think?

Hi Dusty,

I did try to contact them but haven’t got any answer. That’s why I turned to this forum.

Would you be able to explain what the parameters boot.slot_suffix= and boot.ratchetvalues=0.2.1 are for?

Thanks,
-Damien

OK, I will ping them to verify that they have seen your case.

I’m personally unaware of what these represent; perhaps another poster knows.

I got the detection working.

I had set the wrong ODMDATA value in my machine definition.