M.2 E PCIe data link down

Hi,

We are designing our own carrier board for the Xavier SOM, and we are facing an issue with the M.2 Key E slot.
We duplicated the circuit from the Xavier devkit for both M.2 E and M.2 M.

The SSD plugged into the M.2 Key M slot works, but the WiFi module plugged into the M.2 Key E slot is NOT detected!

We flashed the devkit with our pinmux, kernel, and device tree, and the WiFi module is detected on the devkit.
Therefore, the issue is most likely in the hardware around M.2 E on our carrier.

This is the kernel dmesg:

[   12.460344] tegra-pcie-dw 14140000.pcie: Setting init speed to max speed
[   12.461215] OF: PCI: host bridge /pcie@14140000 ranges:
[   12.975562] tegra-pcie-dw 14140000.pcie: link is down
[   12.975789] tegra-pcie-dw 14140000.pcie: PCI host bridge to bus 0003:00
[   12.979469] tegra-pcie-dw 14140000.pcie: PCIe link is not up...!
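
When comparing boots on the carrier and on the devkit, it can help to pull the affected controller out of the log automatically. The following is only a minimal sketch: the regular expression is written against the exact "link is down" wording in the log above, and the function name `controllers_with_link_down` is made up for illustration.

```python
import re

# Matches lines such as:
#   [   12.975562] tegra-pcie-dw 14140000.pcie: link is down
# The pattern is an assumption based on the log excerpt in this post.
LINK_DOWN = re.compile(
    r"tegra-pcie-dw\s+(?P<ctrl>[0-9a-f]+\.pcie):\s+link is down"
)


def controllers_with_link_down(dmesg_text):
    """Return controller names that reported 'link is down', in log order."""
    found = []
    for line in dmesg_text.splitlines():
        match = LINK_DOWN.search(line)
        if match and match.group("ctrl") not in found:
            found.append(match.group("ctrl"))
    return found


# The dmesg excerpt from this post, used as sample input.
SAMPLE = """\
[   12.460344] tegra-pcie-dw 14140000.pcie: Setting init speed to max speed
[   12.461215] OF: PCI: host bridge /pcie@14140000 ranges:
[   12.975562] tegra-pcie-dw 14140000.pcie: link is down
[   12.975789] tegra-pcie-dw 14140000.pcie: PCI host bridge to bus 0003:00
[   12.979469] tegra-pcie-dw 14140000.pcie: PCIe link is not up...!
"""

if __name__ == "__main__":
    print(controllers_with_link_down(SAMPLE))
```

On a live system the same function could be fed the captured output of `dmesg` instead of `SAMPLE`.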

There are a few things I noticed that may be causing the issue:

  1. The AC coupling capacitors on the UPHY TX lines are 0.22 uF, whereas the devkit uses 0.1 uF. I am not sure whether this could cause any issue for GEN1 or GEN2 PCIe. The OEM guide says it can be 0.1 uF or 0.22 uF, so it seems both values should work for GEN1/GEN2, while 0.22 uF is for GEN3/GEN4.

  2. There is a 32.768 kHz oscillator. When we probe it, it seems to be oscillating faster than 32 kHz. We need to double-check its frequency. Is the 32 kHz oscillator used by the WiFi module connected to the M.2 Key E slot?

  3. The PCIe data link appears to be down. I think the NVIDIA PCIe driver gets the data link status from PCIE_X1_RC_PF0_PCIE_CAP_LINK_CONTROL_LINK_STATUS_REG_0 (offset 0x80 in configuration space), where bit 29 indicates the data link layer status. What could cause the PCIe data link to be down?

  4. Note that USB on M.2 E is not working either. The WiFi module also has a USB interface for its Bluetooth function, but it is NOT detected on the USB bus. So maybe the issue is caused by some reset or power signal on M.2 E, since USB is also not working.

  5. I probed W_DISABLE1# and W_DISABLE2#, and they are at 3.3V. I set them to output drive 1 in the pinmux and to gpio-hog output-high in the device tree. Maybe I will also probe the 3.3V power pins on M.2 E. Are there any other pins that I should check?
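
To make item 3 concrete, here is a rough sketch of how a 32-bit read of the combined Link Control / Link Status register could be decoded. It assumes the standard PCIe capability layout, with Link Status in the upper 16 bits, so that Link Status bit 13 (Data Link Layer Link Active) lands on bit 29 as described above. The function name and the example values are made up for illustration; they are not register dumps from our board.

```python
DLL_LINK_ACTIVE_BIT = 29  # Link Status bit 13, shifted up by 16


def decode_link_ctrl_status(reg32):
    """Decode the Link Status half of a combined Link Control/Status word."""
    link_status = (reg32 >> 16) & 0xFFFF  # upper 16 bits hold Link Status
    return {
        # 1 = 2.5 GT/s (Gen1), 2 = 5 GT/s (Gen2), 3 = 8 GT/s (Gen3), ...
        "current_link_speed": link_status & 0xF,
        "negotiated_link_width": (link_status >> 4) & 0x3F,
        "link_training": bool(link_status & (1 << 11)),
        "dll_link_active": bool(link_status & (1 << 13)),
    }


if __name__ == "__main__":
    # Hypothetical values, chosen only to exercise the decoder.
    for value in (0x30120000, 0x10110000):
        print(hex(value), decode_link_ctrl_status(value))
```

With the first hypothetical value the decoder reports the data link layer as active, and with the second it reports it as inactive, which corresponds to the "link is down" case.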

Regards.

  • Please use 0.22 uF unless you have a strong reason to go for 0.1 uF.
  • I’m not sure whether getting the PCIe link up depends on the 32K oscillator. That said, if its frequency is not correct, I think we should fix it anyway.
  • There could be multiple reasons for it. Issues with the routing of the REFCLK lanes, the PERST# line, or the Tx/Rx lanes could all result in link-up failure. Please make sure that all of these signals are routed correctly, and that PERST# is not stuck at ‘0’.
  • Since the USB interface is also not working, I suspect power- or reset-related issues. This needs to be checked from that perspective.
  • If W_DISABLE1# and W_DISABLE2# are at 3.3V, we are good. Make sure that they stay there all the time and at no point go to ‘0’. Ideally, these signals should only disable the radio interface of the chip and not necessarily the PCIe interface, but we have seen on many cards that asserting them also brings the PCIe interface down.

Hi,

Thanks for your reply.

For 1: We are using 0.22 uF, so we should be fine. But why is the devkit using 0.1 uF on the M.2 E TX lane?

For 3: PERST# on M.2 E is at 3.3V. We haven’t probed the REFCLK lanes yet. We probed CLKREQ# a few days ago with just a multimeter, and as I remember it was at 3.3V. I should probe it again together with the REFCLK lanes. Is REFCLK generated all the time, or only while CLKREQ# is LOW? If we probe a REFCLK lane, what frequency should we expect?

Also, who sets the link status bit (bit 29) in PCIE_X1_RC_PF0_PCIE_CAP_LINK_CONTROL_LINK_STATUS_REG_0 (offset 0x80 in configuration space)? Is it the PCIe controller on the Tegra SoC? How does the PCIe controller determine the data link status? Is it done by transmitting and receiving some packets?

For 4: Is there a reset signal for USB on M.2 E? It looks like PERST#, REFCLK, and CLKREQ# are for PCIe only and are not used for USB, correct?

I have one additional question:
The AC coupling capacitors are only needed on the PCIe TX lanes, and they should be placed close to the TX end, correct? And the reason we don’t have AC coupling capacitors on the RX lanes is that the WiFi module or SSD plugged into the M.2 connector already has AC coupling capacitors on its own TX lanes, correct?
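
As a rough sanity check on the 0.1 uF versus 0.22 uF question, the series capacitor and the line termination form a first-order high-pass filter with corner frequency f_c = 1 / (2 * pi * R * C). The sketch below is only a back-of-the-envelope model: the 100 ohm value (50 ohm driver plus 50 ohm receiver termination per single-ended leg) is my assumption, not a number taken from the design guide.

```python
import math

# Assumed resistance seen by the capacitor on one single-ended leg:
# 50 ohm source impedance + 50 ohm receiver termination (assumption).
ASSUMED_R_OHMS = 100.0


def highpass_corner_hz(c_farads, r_ohms=ASSUMED_R_OHMS):
    """Corner frequency of a first-order RC high-pass: 1 / (2*pi*R*C)."""
    return 1.0 / (2.0 * math.pi * r_ohms * c_farads)


if __name__ == "__main__":
    for c_uf in (0.1, 0.22):
        f_c = highpass_corner_hz(c_uf * 1e-6)
        print(f"{c_uf} uF -> corner at about {f_c / 1e3:.1f} kHz")
```

Under this simplified model the corner comes out at roughly 15.9 kHz for 0.1 uF and roughly 7.2 kHz for 0.22 uF, so a larger capacitor passes lower-frequency content. It says nothing about parasitics or layout.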

Regards.

Hi @vidyas

I got WiFi on M.2 E working. The xtal was causing the issue. After replacing it with a 32 kHz xtal, it works.
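
For anyone checking their own 32.768 kHz clock, the measured frequency can be expressed as an error in parts per million against the nominal value. This is just a sketch: the function names are made up, and the tolerance is left as a required argument because the acceptable limit has to come from the datasheet of the module in use.

```python
NOMINAL_HZ = 32768.0  # nominal 32.768 kHz clock


def ppm_error(measured_hz, nominal_hz=NOMINAL_HZ):
    """Signed frequency error in parts per million (positive = running fast)."""
    return (measured_hz - nominal_hz) / nominal_hz * 1e6


def within_tolerance(measured_hz, tolerance_ppm, nominal_hz=NOMINAL_HZ):
    """True if the measured frequency is within +/- tolerance_ppm of nominal."""
    return abs(ppm_error(measured_hz, nominal_hz)) <= tolerance_ppm


if __name__ == "__main__":
    # Hypothetical scope readings, for illustration only.
    for reading in (32768.5, 40000.0):
        print(reading, round(ppm_error(reading), 1))
```

A positive result means the clock is running fast, which is what we saw on the probe before replacing the part.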

I have another unrelated question.

Can one multi-lane PCIe controller on Xavier drive more than one PCIe device?

For example, PCIe controller C5 (pcie@141a0000) has 8 PCIe lanes (8 NVHS UPHY lanes).
The Intel I210 is a single-lane PCIe to single-port Ethernet adapter, so it uses only one PCIe lane.

Can I connect two (or even eight) I210s to PCIe controller C5? Will this work?
If not, do I need something like a PCIe mux?

Thanks.

I’m afraid that doesn’t work. You would have to connect a PCIe switch to C5 in this case and connect as many I210s as you want downstream of that PCIe switch.

Hi @vidyas

Thanks for your reply and information.
It seems the PEX8605 may work for us:
https://docs.broadcom.com/doc/12351821

Or the better GEN4 version PEX88064.

Do I need a specific driver for a PCIe switch (PEX8605 or another PCIe switch)?
If so, does linux-4.9.140 (the version in NVIDIA L4T) have driver support for it?

Thanks.

Usually, PCIe switches don’t need any special drivers.

Hi @vidyas

Have you tested a PCIe switch with Xavier? If so, are any PEX switches (e.g. 8718, 8605, 8606, etc.) on your tested list?

Is a PCIe switch handled by the PCI bridge driver in Linux? For example, if I have two downstream ports, will it appear as one virtual upstream PCI bridge (CPU to switch) and two virtual downstream PCI bridges (switch to PCI endpoints)?
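
One way I could check this once the hardware is available: each port of a PCIe switch is expected to enumerate as a PCI-to-PCI bridge (class code 0x0604), so listing the bridge-class devices in sysfs should show the upstream port plus one entry per downstream port. A minimal sketch, assuming the usual `/sys/bus/pci/devices/<BDF>/class` layout; the function name `list_pci_bridges` is made up.

```python
import os

# Base class 0x06 (bridge), subclass 0x04 (PCI-to-PCI bridge).
PCI_BRIDGE_CLASS = 0x0604


def list_pci_bridges(sysfs_root="/sys/bus/pci/devices"):
    """Return the sorted bus addresses of devices whose class is PCI-to-PCI bridge."""
    bridges = []
    if not os.path.isdir(sysfs_root):
        return bridges
    for bdf in sorted(os.listdir(sysfs_root)):
        class_path = os.path.join(sysfs_root, bdf, "class")
        try:
            with open(class_path) as f:
                # The file holds a 24-bit class code such as "0x060400".
                class_code = int(f.read().strip(), 16)
        except (OSError, ValueError):
            continue  # unreadable or malformed entry, skip it
        if (class_code >> 8) == PCI_BRIDGE_CLASS:  # drop the prog-if byte
            bridges.append(bdf)
    return bridges


if __name__ == "__main__":
    for bdf in list_pci_bridges():
        print(bdf)
```

`lspci -t` gives the same picture as a tree, if pciutils is installed on the target.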

Thanks.