PCI devices unavailable after reboot with usecase7.1 config

Software Version
DRIVE OS Linux 5.2.0

Target Operating System
Linux

Hardware Platform
NVIDIA DRIVE™ AGX Xavier DevKit (E3550)

SDK Manager Version
1.4.0.7363

Host Machine Version
native Ubuntu 18.04

Reproduction steps:

  1. Flash the usecase7.1.pmc config to the PCIe switch as described in the documentation.
  2. Perform a full power cycle.
  3. Observe that the PCI devices are available.
  4. Perform a software reboot.
  5. Observe that the PCI devices are no longer available.

Before reboot:

nvidia@tegra-ubuntu:~$ lspci
0000:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad0 (rev a1)
0000:01:00.0 PCI bridge: PMC-Sierra Inc. Device 8534
0000:02:00.0 PCI bridge: PMC-Sierra Inc. Device 8534
0000:02:01.0 PCI bridge: PMC-Sierra Inc. Device 8534
0000:03:00.0 RAID bus controller: Marvell Technology Group Ltd. 88SE9485 SAS/SATA 6Gb/s controller (rev c3)
0000:04:00.0 Ethernet controller: Aquantia Corp. AQC107 NBase-T/IEEE 802.3bz Ethernet Controller [AQtion] (rev 02)
0001:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad2 (rev a1)
0004:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad1 (rev a1)
0004:01:00.0 Ethernet controller: Aquantia Corp. AQC107 NBase-T/IEEE 802.3bz Ethernet Controller [AQtion] (rev 02)

After reboot:

nvidia@tegra-ubuntu:~$ lspci
0000:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad0 (rev a1)
0001:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad2 (rev a1)
0004:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad1 (rev a1)
0004:01:00.0 Ethernet controller: Aquantia Corp. AQC107 NBase-T/IEEE 802.3bz Ethernet Controller [AQtion] (rev 02)
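
To make the difference between the two listings explicit, here is a minimal Python sketch that diffs the lspci output captured before and after the reboot. It is not part of any NVIDIA tooling; the helper names parse_lspci and missing_devices are made up for illustration, and the sketch assumes each device line starts with the PCI address followed by a space.

```python
# Hypothetical helper, not from the thread: diff two captured lspci listings.

BEFORE = """\
0000:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad0 (rev a1)
0000:01:00.0 PCI bridge: PMC-Sierra Inc. Device 8534
0000:02:00.0 PCI bridge: PMC-Sierra Inc. Device 8534
0000:02:01.0 PCI bridge: PMC-Sierra Inc. Device 8534
0000:03:00.0 RAID bus controller: Marvell Technology Group Ltd. 88SE9485 SAS/SATA 6Gb/s controller (rev c3)
0000:04:00.0 Ethernet controller: Aquantia Corp. AQC107 NBase-T/IEEE 802.3bz Ethernet Controller [AQtion] (rev 02)
0001:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad2 (rev a1)
0004:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad1 (rev a1)
0004:01:00.0 Ethernet controller: Aquantia Corp. AQC107 NBase-T/IEEE 802.3bz Ethernet Controller [AQtion] (rev 02)
"""

AFTER = """\
0000:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad0 (rev a1)
0001:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad2 (rev a1)
0004:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad1 (rev a1)
0004:01:00.0 Ethernet controller: Aquantia Corp. AQC107 NBase-T/IEEE 802.3bz Ethernet Controller [AQtion] (rev 02)
"""


def parse_lspci(text):
    """Map PCI address -> description for each non-empty line of lspci output."""
    devices = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The address is everything up to the first space.
        address, _, description = line.partition(" ")
        devices[address] = description
    return devices


def missing_devices(before, after):
    """Return (address, description) pairs present in `before` but not in `after`."""
    seen_before = parse_lspci(before)
    seen_after = parse_lspci(after)
    return [(addr, seen_before[addr])
            for addr in sorted(seen_before)
            if addr not in seen_after]


if __name__ == "__main__":
    for addr, desc in missing_devices(BEFORE, AFTER):
        print(addr, desc)
```

Run against the two listings above, this reports everything behind the root port 0000:00:00.0 as missing: the three PMC-Sierra switch bridges, the Marvell controller, and the Aquantia controller at 0000:04:00.0.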

Is this a known issue? Is there a workaround such that PCI devices are properly enumerated after a software reboot when using this configuration?

Hi @raul.tambre ,

I haven’t tried flashing the configuration file for use case 7.1. Please check that your current configuration file and firmware match the requirements in ~/nvidia/nvidia_sdk/DRIVE_OS_5.2.0_SDK_Linux_OS_DDPX/DRIVEOS/drive-t186ref-foundation/firmwares/bin/common/microsemi/pcie/bins/e3550/config_files/pm8534/README. Thanks.

The firmware version is as required in the README.

0x00000000:0002>version
Firmware Version        Major 01. Minor 08. Type 0. Build D58.
Device Id               8534
Device Revision         1
XML File Version        58

Please also share the output of running "gasrd 0x2010 1" in the PCIe switch console so we can check the current configuration file. Thanks.

0x00000000:0000>gasrd 0x2010 1
gas_reg_read <0x2010> [1]
0x40007101

I can reproduce it on my side. I’ll check internally and update you. Thanks.

Hi @raul.tambre ,

Last time, I reproduced this issue by flashing the configuration and then doing a power cycle.
But to check other issues, I reflashed DRIVE OS 5.2.0 in the meantime.
When we went back to debug this issue, I could no longer reproduce it.

I flashed the use case 7.1 configuration with the command below.

nvidia@tegra-ubuntu:/lib/firmware/pcie-switch/tools/prebuilt$ sudo ./switchtec fw-update /dev/switchtec0 /lib/firmware/pcie-switch/e3550/config_files/pm8534/usecase7.1.pmc

Even after a few power cycles, I still saw the correct PCI devices listed, as shown below.

nvidia@tegra-ubuntu:~$ lspci
0000:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad0 (rev a1)
0000:01:00.0 PCI bridge: PMC-Sierra Inc. Device 8534
0000:02:00.0 PCI bridge: PMC-Sierra Inc. Device 8534
0000:02:01.0 PCI bridge: PMC-Sierra Inc. Device 8534
0000:03:00.0 RAID bus controller: Marvell Technology Group Ltd. 88SE9485 SAS/SATA 6Gb/s controller (rev c3)
0000:04:00.0 Ethernet controller: Aquantia Corp. AQC107 NBase-T/IEEE 802.3bz Ethernet Controller [AQtion] (rev 02)
0001:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad2 (rev a1)
0001:01:00.0 3D controller: NVIDIA Corporation Device 1eba (rev a1)
0004:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad1 (rev a1)
0004:01:00.0 Ethernet controller: Aquantia Corp. AQC107 NBase-T/IEEE 802.3bz Ethernet Controller [AQtion] (rev 02)

Could you try flashing back the configuration file for use case 4.0, then reflash DRIVE OS 5.2.0, and see whether that resolves the problem? Thanks.

Please ignore my previous post.
I reproduced the issue again after issuing “sudo reboot now”, and then noticed that you had stated this clearly in your first post.

Hi @raul.tambre ,

We have found the root cause and will fix the issue in usecase7.1.pmc.
I’ll update you once we confirm which release will include the fix. Thanks.

@raul.tambre ,

If this fix is urgent for you, please let us know the reason. Thanks.

We have fixed the issue and it will be in a future release (but too late for the next release). Sorry for any inconvenience.

In which future release will the fix be available?

Hi @raul.tambre ,

This fix is missing from the upcoming release because we did not receive your reply earlier. For the release plan beyond that, could you discuss this with your NVIDIA rep? Thanks.

Since you said this was too late for the next release, I assumed that meant DRIVE OS 5.2.6. Now that 5.2.6 is out, I figured it would be known whether the fix will be in 5.2.9 or 5.2.12. I asked because every time I encounter this issue I’m reminded of this thread. While annoying, it’s certainly not worth wasting our rep’s time on.

Yes, I meant that it was too late for the then-upcoming DRIVE OS 5.2.6.
Where did you hear about 5.2.9 or 5.2.12?
It’s not clear to me whether there will be further devzone releases, or whether you will get any other release beyond devzone.
That’s why I asked you to check with your NVIDIA rep.

5.2.12 and 5.2.15 are indicated in the roadmap we have from Quanta for V3NA.

👀

If you receive any release after DRIVE OS 5.2.6 (e.g. 5.2.9), it will include the fix. Thanks.

That’s exactly what I wanted to confirm, thanks. 🙂
