Unable to get CEM card to enumerate

Please provide the following info (tick the boxes after creating this topic):
Software Version
DRIVE OS 6.0.8.1
DRIVE OS 6.0.6
DRIVE OS 6.0.5
DRIVE OS 6.0.4 (rev. 1)
DRIVE OS 6.0.4 SDK
other

Target Operating System
Linux
QNX
other

Hardware Platform
DRIVE AGX Orin Developer Kit (940-63710-0010-300)
DRIVE AGX Orin Developer Kit (940-63710-0010-200)
DRIVE AGX Orin Developer Kit (940-63710-0010-100)
DRIVE AGX Orin Developer Kit (940-63710-0010-D00)
DRIVE AGX Orin Developer Kit (940-63710-0010-C00)
DRIVE AGX Orin Developer Kit (not sure its number)
other

SDK Manager Version
1.9.3.10904
other

Host Machine Version
native Ubuntu Linux 20.04 Host installed with SDK Manager
native Ubuntu Linux 20.04 Host installed with DRIVE OS Docker Containers
native Ubuntu Linux 18.04 Host installed with DRIVE OS Docker Containers
other

Our setup:
We have one board that routes MiniSAS to OCuLink and another that routes OCuLink to CEM. Both boards simply route the signals; there are no retimers on them. We have verified with a continuity tester that the correct signals go all the way from the MiniSAS connector to the CEM connector. Right now we are focused only on getting CEM cards to enumerate. We have tried an EVB-LAN7430 (https://www.microchip.com/en-us/development-tool/EVB-LAN7430) and an EVB-PCI11414 (https://www.microchip.com/en-us/development-tool/EV38E94A). We chose these two because both use the LAN743X driver, which we know is loaded because the AGX Orin has an internal LAN7431 chip.

What we have tried/verified:
- The first thing we did was tie PERST# high and CLKREQ# low on the MiniSAS → OCuLink board to rule out the sideband signals as the cause. We have kept this configuration for every test since.
- We connected a single-ended scope probe to the Orin TX/CEM RX lines (probing on the CEM side) and verified that an RX Detect pulse appears when the system starts up. With the same probe on the REFCLK lines we saw no activity. We are using a single-ended probe only to verify that the signals are present, not to characterize them.
- We verified that two Orins can talk to each other through the MiniSAS cable by following this tutorial: Chip to Chip Communication | NVIDIA Docs.
- We tried echo 1 > /sys/bus/pci/rescan and loading (modprobe) the RP driver mentioned in the tutorial to try to get the CEM card to enumerate.
- The last thing we tried was supplying our own REFCLK to the CEM cards to see whether they would then enumerate.

In none of these tests were we able to get either CEM card to enumerate.
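
For anyone repeating the rescan test, here is a minimal sketch of how the rescan and the enumeration check could be scripted. It is not part of the original test procedure. It assumes the standard Linux sysfs layout (/sys/bus/pci/rescan and /sys/bus/pci/devices/<address>/vendor), and the vendor ID 0x1055 for the LAN743X family is an assumption that should be confirmed against lspci output. The function names and the sysfs_root parameter are hypothetical helpers introduced only for illustration.

```python
from pathlib import Path

# Assumption: vendor ID reported by Microchip LAN743X-family devices.
MICROCHIP_VENDOR_ID = 0x1055


def rescan(sysfs_root="/sys"):
    """Ask the kernel to re-probe all PCI buses (needs root on a real system)."""
    (Path(sysfs_root) / "bus/pci/rescan").write_text("1")


def list_pci_devices(sysfs_root="/sys"):
    """Return (address, vendor_id, device_id) for every enumerated PCI function."""
    devices = []
    dev_dir = Path(sysfs_root) / "bus/pci/devices"
    if not dev_dir.is_dir():
        return devices
    for entry in sorted(dev_dir.iterdir()):
        try:
            # sysfs exposes the IDs as hex strings such as "0x1055"
            vendor = int((entry / "vendor").read_text().strip(), 16)
            device = int((entry / "device").read_text().strip(), 16)
        except (OSError, ValueError):
            continue  # skip entries without readable ID files
        devices.append((entry.name, vendor, device))
    return devices


def find_by_vendor(vendor_id, sysfs_root="/sys"):
    """Filter the enumerated functions down to a single vendor ID."""
    return [d for d in list_pci_devices(sysfs_root) if d[1] == vendor_id]
```

On the target, calling rescan() as root and then find_by_vendor(MICROCHIP_VENDOR_ID) before and after plugging in the card would show whether a new function appeared. Since the Orin has an internal LAN7431, one match is probably expected even without the CEM card, so the comparison of the two lists is what matters.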

Dear @christopher.brisco,
May I know which DRIVE OS release you are using and what functionality you want to test with this setup?

This is what I am running. I am not sure whether it corresponds to rev. 1 or to the SDK version in the checklist above.
[image]

Dear @christopher.brisco,

Could you clarify which functionality you want to check? I need to confirm with the core team whether it is supported by DRIVE OS.

Dear @christopher.brisco,
Could you provide any update?

Hello, I am working with @christopher.brisco on this task and it is still non-functional.

Objective: Successful link with an SRNS PCIe switch (Microchip Switchtec Gen4 PM42100) at any of x1, x2, or x4 width.

Here are the facts about our issue:

  • DRIVE AGX Orin PCIe settings/devtree are factory default

  • DRIVE AGX Orin only provides external connectivity through the proprietary MiniSAS connections (Port A - Upper, and Port B - Lower), henceforth referred to as “NVIDIA-MiniSAS”.

  • DRIVE AGX Orin MiniSAS pinouts are not documented by NVIDIA but rather by Amphenol.

  • DRIVE AGX Orin uses PCIe-CEM-like interconnections internally which are not documented anywhere we can find, therefore we cannot interface our product to DRIVE AGX Orin this way.

  • We created an NVIDIA-MiniSAS to OCuLink adapter based on the Amphenol cable specification in order to connect to a known-good OCuLink → CEM adapter, which interfaces to a Microchip Switchtec PCIe Gen4 (PM42100) switch.

  • DRIVE AGX Orin does not source any 100 MHz REFCLK through the specified NVIDIA-MiniSAS positions A1 & B1.

  • We were informed via email (from rganapathi) that DRIVE AGX Orin does not support PCIe common clock and only supports SRIS/SRNS. We conclude that to connect any basic NVMe or NIC endpoint, we must now put an SRIS/SRNS-capable PCIe switch (PM42100) between DRIVE AGX Orin and the endpoint.

  • DRIVE AGX Orin Port A does not exhibit any RX Detect signalling at all, and PERST# is driven low by DRIVE AGX Orin

  • DRIVE AGX Orin Port B DOES exhibit RX Detect signalling, and PERST# is pulled high by DRIVE AGX Orin. Port B appears to be the default functional PCIe port, which is showing some normal behavior except no common clock, and we proceed to only use Port B.

  • With our custom NVIDIA-MiniSAS to OCuLink adapter, the DRIVE AGX Orin Port B TX0p/n and RX0p/n have electrical continuity from NVIDIA-MiniSAS to our CEM slot at the correct positions B14/B15 and A16/A17 respectively.

  • With our custom NVIDIA-MiniSAS to OCuLink adapter, the DRIVE AGX Orin Port B PERST# has electrical continuity to our CEM slot at the correct position A11.

  • DRIVE AGX Orin Port B never appears to exit the RX Detect state, whether connected to a protocol analyzer termination, a common-clock endpoint, or an SRNS endpoint (PCIe switch P2P bridge), and even after manually applying 50 Ohm terminations to force link training.

  • PCIe Analyzer confirms there is zero PCIe-encoded traffic on the bus at any point in time.

Is there any information here that can help us? We have been unable to make any progress.
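
As a software-side cross-check of the observations above, the negotiated link state of a root port can be read from sysfs. The following is only a sketch under stated assumptions: it relies on the Linux sysfs attributes current_link_speed, current_link_width, max_link_speed, and max_link_width being exposed for the port, and it treats a missing attribute or a width of 0 as "no trained link", which is a heuristic rather than documented behavior for this platform. The function names and the example address are hypothetical.

```python
from pathlib import Path

# Link attributes that Linux exposes per PCI device in sysfs.
LINK_ATTRS = (
    "current_link_speed",
    "current_link_width",
    "max_link_speed",
    "max_link_width",
)


def read_link_status(bdf, sysfs_root="/sys"):
    """Read the link attributes of one PCI function; missing ones map to None."""
    dev = Path(sysfs_root) / "bus/pci/devices" / bdf
    status = {}
    for attr in LINK_ATTRS:
        try:
            status[attr] = (dev / attr).read_text().strip()
        except OSError:
            status[attr] = None
    return status


def link_trained(status):
    """Heuristic (assumption): a negotiated width above 0 means the link came up."""
    width = status.get("current_link_width")
    return width is not None and width.isdigit() and int(width) > 0
```

For example, link_trained(read_link_status("0000:00:00.0")) could be called for the root port behind Port B, where the address shown is a placeholder that would need to be replaced with the real one from lspci.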

Dear @christopher.brisco,

The Mini-SAS PCIe connectors on DRIVE AGX Orin are designed for system expansion, allowing two or four developer systems to be connected for much higher system performance.
Using the Mini-SAS connector with other PCIe devices is outside the defined scope of the DRIVE AGX Orin product.

For everyone else here: We figured out that it's possible to unplug the PCIe NIC that has the Mini-SAS connectors on it and use the spare PCIe x16 slot to plug in our CEM card.

Thanks

