PCIe Lane Reversal on C5 and C0

Hi Folks -

We are dealing with a PCIe lane reversal issue and would appreciate your input. This question is similar to earlier posts on this forum, but we have not had success with lane reversal.

We do not see lane reversal work cleanly. We are using a test setup with two devkits, one configured as Root Port and one configured as Endpoint. We are using JetPack 4.6.

Straight Thru
When we use a straight-through crosslink (a rigid, passive PCBA with appropriately impedance-matched routing) that maps Lane 0 to Lane 0, Lane 1 to Lane 1, etc., we close a link as described in the documentation and on this forum. The link status is reported as 16 GT/s and x8 width by lspci -vv. This appears to be working as specified.

Lane Reversal
When we use a crossover PCBA that maps Lane 0 to Lane 7, 1 → 6, 2 → 5, 3 → 4, 4 → 3, 5 → 2, 6 → 1, and 7 → 0, it does not work.
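For reference, the crossover mapping above is a full reversal: lane i of an N-lane link lands on lane N-1-i. A minimal sketch of that rule follows; the function names are made up for illustration and are not part of any driver.

```c
#include <assert.h>
#include <stdio.h>

/* Lane reached on the far side of a fully reversed crossover:
 * lane i of a width-lane link lands on lane (width - 1 - i). */
unsigned reversed_lane(unsigned lane, unsigned width)
{
    assert(lane < width);
    return width - 1 - lane;
}

/* Print the whole mapping, one lane per line. */
void print_lane_map(unsigned width)
{
    for (unsigned lane = 0; lane < width; lane++)
        printf("Lane %u -> Lane %u\n", lane, reversed_lane(lane, width));
}
```

Calling print_lane_map(8) lists the same eight pairs as the crossover PCBA; applying reversed_lane twice returns the original lane, which is why the same board works in either orientation.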

When we modify the PCIe driver to assert TX_LANE_FLIP_EN, RX_LANE_FLIP_EN, and AUTO_LANE_FLIP_CTRL_EN before releasing LTSSM_EN, it does not work.

When we modify the PCIe driver to set the PRE_DET_LANE field to 0b011 (corresponding to an x8 lane reversal according to the TRM), the link closes, but only at x1.

Configurations
We have tried several configurations; a few interesting ones are listed here:

| Description | With straight PCBA | With lane-reversed PCBA |
|---|---|---|
| Unmodified | x8, 16 GT/s | FAILS |
| PRE_DET_LANE = 0b011, enables | x1, 16 GT/s | x1, 16 GT/s |
| PRE_DET_LANE = 0b000, enables | x8, 16 GT/s | FAILS |
| only PRE_DET_LANE = 0b011 | x1, 16 GT/s | x1, 16 GT/s |

We have also tried some C0 to C5 links with similar results. We can get an x1 link to close, but we cannot get an x4 link to close.

Questions
We’ve been through the forums, online searches, and “PCI Express Technology” by Jackson and Budruk. We are clearly missing something, but have yet to figure out the magic incantation to get link training to recognize lane reversal.

  1. Has anyone successfully made a lane reversed crossover between two devkits work?
  2. Are there any more diagnostics we can run that will shed light on the problem? We’ve tried to parse lspci -vv but we do not see anything relevant to lane reversal.
  3. Are there any suggestions for procedure changes, code changes, or tests we can run?
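On question 2: the LnkSta line that lspci -vv prints comes from the PCIe Link Status register, where bits 3:0 hold the current link speed code and bits 9:4 hold the negotiated link width. As far as we know, none of its fields records whether the lanes were reversed, which would explain why nothing relevant shows up there. Below is a small sketch of the decoding; the function names are illustrative only.

```c
#include <stdint.h>

/* Current Link Speed code, bits 3:0 of the PCIe Link Status register. */
unsigned lnksta_speed_code(uint16_t lnksta)
{
    return lnksta & 0xF;
}

/* Negotiated Link Width, bits 9:4 of the Link Status register. */
unsigned lnksta_width(uint16_t lnksta)
{
    return (lnksta >> 4) & 0x3F;
}

/* Map the speed code to a printable rate; unlisted codes are "unknown". */
const char *lnksta_speed_name(unsigned code)
{
    static const char *const names[] = {
        "unknown", "2.5 GT/s", "5 GT/s", "8 GT/s", "16 GT/s", "32 GT/s"
    };
    return code < sizeof(names) / sizeof(names[0]) ? names[code] : "unknown";
}
```

For example, a raw value of 0x0084 (ignoring the other status bits) decodes to width 8 and speed code 4, i.e. the x8 16 GT/s link we see with the straight-through board.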

Thanks in advance for reading this lengthy post, and we very much appreciate your input.

Best regards,
sw

Sorry for the late response. Our team will investigate and provide suggestions soon. Thanks.

Hi @kayccc -
Any word on this topic? We would like to spin our board and this remains an issue.

Best regards,
sam

Sorry for the late response; our team was not able to provide a suggestion quickly.
Is this still an issue that needs support and more input?

Thanks

C0 and C5 both support x8 lanes. To enable lane reversal (Lane 0 to Lane 7, 1 → 6, 2 → 5, 3 → 4, 4 → 3, 5 → 2, 6 → 1, and 7 → 0), could you try setting the registers below?

AUTO_LANE_FLIP_CTRL_EN = 1
PRE_DET_LANE = 0
NUM_OF_LANES = 8
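One way to express these three settings as a single masked update is sketched below. The bit positions (NUM_OF_LANES at bits 12:8, PRE_DET_LANE at bits 15:13, AUTO_LANE_FLIP_CTRL_EN at bit 16 of PORT_LOGIC_GEN2_CTRL) are taken from the #defines in the driver patch posted in this thread; the GEN2_CTRL_* macro names and the helper gen2_ctrl_apply() are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Field layout as given by the #defines in the patch in this thread. */
#define GEN2_CTRL_NUM_OF_LANES_SHIFT  8
#define GEN2_CTRL_NUM_OF_LANES_MASK   (0x1Fu << GEN2_CTRL_NUM_OF_LANES_SHIFT)
#define GEN2_CTRL_PRE_DET_LANE_SHIFT  13
#define GEN2_CTRL_PRE_DET_LANE_MASK   (0x7u << GEN2_CTRL_PRE_DET_LANE_SHIFT)
#define GEN2_CTRL_AUTO_LANE_FLIP_EN   (1u << 16)

/* Hypothetical helper: return val with the three fields replaced and
 * every other bit left untouched. */
uint32_t gen2_ctrl_apply(uint32_t val, unsigned num_of_lanes,
                         unsigned pre_det_lane, int auto_flip)
{
    val &= ~(GEN2_CTRL_NUM_OF_LANES_MASK | GEN2_CTRL_PRE_DET_LANE_MASK |
             GEN2_CTRL_AUTO_LANE_FLIP_EN);
    val |= ((uint32_t)num_of_lanes << GEN2_CTRL_NUM_OF_LANES_SHIFT) &
           GEN2_CTRL_NUM_OF_LANES_MASK;
    val |= ((uint32_t)pre_det_lane << GEN2_CTRL_PRE_DET_LANE_SHIFT) &
           GEN2_CTRL_PRE_DET_LANE_MASK;
    if (auto_flip)
        val |= GEN2_CTRL_AUTO_LANE_FLIP_EN;
    return val;
}
```

In a driver, the input would come from a readl() of the register and the result would be written back with writel(); gen2_ctrl_apply(val, 8, 0, 1) corresponds to the settings suggested above.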

@ni6 - Thank you very much for your suggestion. Now it is my turn to apologize for the late response.

For clarity, I enclose the modified code:

git diff 59207135f kernel/nvidia/drivers/pci/dwc/pcie-tegra.c
diff --git a/kernel/nvidia/drivers/pci/dwc/pcie-tegra.c b/kernel/nvidia/drivers/pci/dwc/pcie-tegra.c
index 21269df59..ddea81fd9 100644
--- a/kernel/nvidia/drivers/pci/dwc/pcie-tegra.c
+++ b/kernel/nvidia/drivers/pci/dwc/pcie-tegra.c
@@ -67,6 +67,8 @@
 #define APPL_CTRL_HW_HOT_RST_MODE_IMDT_RST     0x1
 #define APPL_CTRL_SYS_PRE_DET_STATE            BIT(6)
 #define APPL_CTRL_LTSSM_EN                     BIT(7)
+#define APPL_CTRL_RX_LANE_FLIP_EN       BIT(16)
+#define APPL_CTRL_TX_LANE_FLIP_EN       BIT(17)
 #define APPL_CTRL_HW_HOT_RST_EN                        BIT(20)
 
 #define APPL_INTR_EN_L0_0                      0x8
@@ -274,6 +276,9 @@
 
 #define DL_FEATURE_EXCHANGE_EN         BIT(31)
 
+#define STATUS_L1LTSSM_REG_0        0x254
+#define LANE_REVERSAL               BIT(15)
+
 #define PORT_LOGIC_ACK_F_ASPM_CTRL                     0x70C
 #define ENTER_ASPM                                     BIT(30)
 #define L0S_ENTRANCE_LAT_SHIFT                         24
@@ -286,6 +291,15 @@
 
 #define PORT_LOGIC_GEN2_CTRL                           0x80C
 #define PORT_LOGIC_GEN2_CTRL_DIRECT_SPEED_CHANGE       BIT(17)
+#define AUTO_LANE_FLIP_CTRL_EN      BIT(16)
+#define PRE_DET_LANE_0              BIT(13)
+#define PRE_DET_LANE_1              BIT(14)
+#define PRE_DET_LANE_2              BIT(15)
+#define NUM_OF_LANES_0              BIT(8)
+#define NUM_OF_LANES_1              BIT(9)
+#define NUM_OF_LANES_2              BIT(10)
+#define NUM_OF_LANES_3              BIT(11)
+#define NUM_OF_LANES_4              BIT(12)
 #define FTS_MASK                                       0xFF
 #define FTS_VAL                                                52
@@ -2851,6 +2932,31 @@ static int tegra_pcie_dw_host_init(struct pcie_port *pp)
 
        clk_set_rate(pcie->core_clk, GEN4_CORE_CLK_FREQ);
 
+       if(pcie->cid == 5) {
+        /* Attempt to force Lane Reversal */
+        val = readl(pci->dbi_base + PORT_LOGIC_GEN2_CTRL);
+        dev_info(pcie->dev, "Changing GEN2_CTRL from %u \n", val);
+        val &= ~PRE_DET_LANE_0;
+        val &= ~PRE_DET_LANE_1;
+        val &= ~PRE_DET_LANE_2;   
+        val |= AUTO_LANE_FLIP_CTRL_EN;   
+        val &= ~NUM_OF_LANES_0;
+        val &= ~NUM_OF_LANES_1;
+        val &= ~NUM_OF_LANES_2;
+        val |=  NUM_OF_LANES_3;
+        val &= ~NUM_OF_LANES_4;
+        dev_info(pcie->dev, "Changing GEN2_CTRL to %u \n", val);
+        writel(val, pci->dbi_base + PORT_LOGIC_GEN2_CTRL);
+        
+        /* Enable manual lane reversal */
+        /*
+        dev_info(pci->dev, "Setting manual lane reversal");
+        val = readl(pcie->appl_base + APPL_CTRL);
+        val |= APPL_CTRL_RX_LANE_FLIP_EN;
+        val |= APPL_CTRL_TX_LANE_FLIP_EN;
+        writel(val, pcie->appl_base + APPL_CTRL);
+        */
+       }
        /* assert RST */
        val = readl(pcie->appl_base + APPL_PINMUX);
        val &= ~APPL_PINMUX_PEX_RST;

The output of dmesg on the rootport includes the lines:

dmesg | grep GEN2
[    5.403878] tegra-pcie-dw 141a0000.pcie: Changing GEN2_CTRL from 198708 
[    5.403883] tegra-pcie-dw 141a0000.pcie: Changing GEN2_CTRL to 198708 

In other words, the GEN2_CTRL register already held:
AUTO_LANE_FLIP_CTRL_EN = 1
PRE_DET_LANE = 0
NUM_OF_LANES = 0b01000 = 8

Our modified code did not actually change the GEN2 control register values; the register was the same before and after.
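To double-check that reading of the logged value, here is a small decoder for the decimal number printed by dev_info. The field positions are the ones used by the #defines in the patch (FTS_MASK 0xFF, NUM_OF_LANES bits 12:8, PRE_DET_LANE bits 15:13, AUTO_LANE_FLIP_CTRL_EN bit 16, DIRECT_SPEED_CHANGE bit 17); the struct and function names are made up for this sketch.

```c
#include <stdint.h>
#include <stdio.h>

/* Decoded view of PORT_LOGIC_GEN2_CTRL, per the patch's bit definitions. */
struct gen2_ctrl_fields {
    unsigned fts;                 /* bits 7:0   */
    unsigned num_of_lanes;        /* bits 12:8  */
    unsigned pre_det_lane;        /* bits 15:13 */
    unsigned auto_lane_flip_en;   /* bit 16     */
    unsigned direct_speed_change; /* bit 17     */
};

struct gen2_ctrl_fields gen2_ctrl_decode(uint32_t val)
{
    struct gen2_ctrl_fields f;

    f.fts = val & 0xFF;
    f.num_of_lanes = (val >> 8) & 0x1F;
    f.pre_det_lane = (val >> 13) & 0x7;
    f.auto_lane_flip_en = (val >> 16) & 0x1;
    f.direct_speed_change = (val >> 17) & 0x1;
    return f;
}

/* Print the raw value in hex next to its decoded fields. */
void gen2_ctrl_print(uint32_t val)
{
    struct gen2_ctrl_fields f = gen2_ctrl_decode(val);

    printf("GEN2_CTRL=0x%08x FTS=%u NUM_OF_LANES=%u PRE_DET_LANE=%u "
           "AUTO_LANE_FLIP_CTRL_EN=%u DIRECT_SPEED_CHANGE=%u\n",
           (unsigned)val, f.fts, f.num_of_lanes, f.pre_det_lane,
           f.auto_lane_flip_en, f.direct_speed_change);
}
```

198708 is 0x30834, which decodes to FTS = 52 (matching FTS_VAL in the driver), NUM_OF_LANES = 8, PRE_DET_LANE = 0, AUTO_LANE_FLIP_CTRL_EN = 1, and DIRECT_SPEED_CHANGE = 1. Logging the register with a hex format specifier instead of %u would make this easier to read directly from dmesg.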

THIS DID NOT WORK. The PCIe link did not close with the crossover PCBA. (The straight-through PCBA did close at x8.)

The only thing that seems to help with closing a link with reversed lanes is PRE_DET_LANE.

Do you have any other debugging steps I can try?

Thanks,
sam

Hi,

Actually, lane reversal should be supported by default. Is it not working on your side?

@waldman, please don’t write the manual lane reversal registers, as this mapping (Lane 0 to Lane 7, 1 → 6, 2 → 5, 3 → 4, 4 → 3, 5 → 2, 6 → 1, and 7 → 0) is supported by default.

@Ni6, @WayneWWW -

I know that lane reversal is supposed to work by default, but we don’t see it working.

We have tried all combinations of register writes – including the defaults – and we don’t see lane reversal working. The only thing that “works” is changing “PRE_DET_LANE” – and this only works at x1. (Please see the table in the original post.)

Can you confirm that DevKit to DevKit with a lane reversed PCB works on your end?
Can you offer any troubleshooting guidance about the lower-level PCIe registers that might indicate what is going wrong? I know the lane enumeration and polarity are handled entirely at the hardware level, but perhaps there are some messages that can shed light on our situation?

Thanks so much,

sam
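One low-level check that the patch earlier in this thread already hints at: it defines STATUS_L1LTSSM_REG_0 (0x254) and LANE_REVERSAL (BIT(15)), although the hunks shown never read them. Assuming those definitions match the TRM (not verified here), a test of that flag could look like the sketch below. The function name is hypothetical, and the thread does not say which base address the offset is relative to, so the sketch only takes the raw 32-bit value.

```c
#include <stdint.h>

#define STATUS_L1LTSSM_REG_0  0x254       /* offset, as defined in the patch */
#define LANE_REVERSAL         (1u << 15)  /* BIT(15), as defined in the patch */

/* Returns 1 if the raw status word has the LANE_REVERSAL flag set, else 0.
 * Assumption: the flag means the controller detected reversed lanes. */
int lane_reversal_reported(uint32_t status_l1ltssm)
{
    return (status_l1ltssm & LANE_REVERSAL) != 0;
}
```

In the driver, the argument would be the result of a readl() at that offset, logged once after link training with each PCBA, so that the straight-through and reversed cases can be compared.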