PCIe clock rates and power management

Hi Trumany,

We’ve already tried near-field probing to uncover the source of the desense. However, our GPS receiver tracks signals far weaker (-167dBm for NEO-M8N, https://www.u-blox.com/sites/default/files/NEO-M8-FW3_DataSheet_(UBX-15031086).pdf) than the noise floor of our near-field probe + spectrum analyzer, so the aggressor is below what we can measure. We can very clearly see the spike at 1.25GHz for the PCIe Gen1 signalling, but there is nothing at 1.575GHz above the noise floor.
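A rough sketch of why a spectrum analyzer cannot see an aggressor at the receiver's sensitivity level. The -174 dBm/Hz thermal noise density is standard physics; the 10 kHz resolution bandwidth and 20 dB combined probe and analyzer noise figure are assumed values for illustration, not measurements from this thread.

```python
import math

THERMAL_NOISE_DBM_PER_HZ = -174.0  # kT at roughly 290 K


def dbm_to_watts(dbm: float) -> float:
    """Convert a power level in dBm to watts."""
    return 10 ** (dbm / 10) / 1000


def displayed_noise_floor_dbm(rbw_hz: float, noise_figure_db: float) -> float:
    """Approximate analyzer noise floor for a given RBW and noise figure."""
    return THERMAL_NOISE_DBM_PER_HZ + 10 * math.log10(rbw_hz) + noise_figure_db


receiver_sensitivity_dbm = -167.0  # NEO-M8N figure quoted above
floor_dbm = displayed_noise_floor_dbm(rbw_hz=10_000, noise_figure_db=20.0)  # assumed setup

print(f"analyzer floor: {floor_dbm:.1f} dBm")
print(f"gap to receiver sensitivity: {floor_dbm - receiver_sensitivity_dbm:.1f} dB")
print(f"-167 dBm is {dbm_to_watts(receiver_sensitivity_dbm):.2e} W")
```

Under these assumptions the analyzer floor sits tens of dB above the level the receiver can still track, so interference strong enough to cause desense can remain invisible to the probe.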

I agree that moving the GPS antenna further from the aggressor would be a good solution, but while we may end up there eventually, it’s not acceptable for the overall product requirements today.

Regarding tuning the PCIe PHY values, I’d like to note that we aren’t suggesting this would be a production solution for the GPS issue, especially since the signal integrity impact on cards we may wish to use in the future is unknown. Rather, tuning would help us see if the PCIe_TX or PCIe_RX lines are the source (as is our current hypothesis).

Here are some other knowns based on recent experiments:
-There is no interference whatsoever when using GLONASS instead of GPS (around 1.598 to 1.606GHz). [Signal generator set to broadcast GLONASS only, receiver set to track GLONASS only]
-On the NVIDIA Jetson carrier board, we can install our card in the M2 slot (using a mPCIE to M2 adapter) and can reproduce the desense. Adding copper tape to completely enclose the card and adapter, all the way back to and including the M2 connector, eliminates the desense.
-On the NVIDIA carrier board, changing from TX1 to TX2 has no change in desense (this experiment used the x4 PCIe connector with an adapter to mPCIe since at the time we didn’t have the commands to change the TX2’s MUX for PEX1 vs USB_SS0 to get this to work in the M2 slot)
-Desense reproduces across 3 tested systems, suggesting this isn’t just a problem with one unit.
-Disabling unused PCIe slots or clocks has no impact on desense
-Independently enabling PCIe clocks but without a card in the slot has zero desense (suggesting that there must be traffic on TX/RX lanes for there to be desense). We also tried the same experiment with a card present but prevented enumeration of the card. Again, there is zero desense.

All of this is leading us to believe the desense is either because of the traffic or other noise on PCIE_TX/RX lines, or because of the state of the TX1 or card after enumeration occurs.

Finally, we’re expecting the PCIe Gen2 cards we’ve ordered to arrive Monday so will have more updates soon. We’ve also ordered a half-size minicard to better test the hypothesis about the size of the ground plane in relation to the wavelength of GPS.

Thanks for your continued support and ideas. Please let us know if you have more ideas for us to try. We are continuing to work through our own list of experiments too.
Best,
MikeB

1.575GHz - 1.25GHz = 325MHz. I am curious if there is a 325MHz spike?
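The mixing idea can be checked numerically. This is a minimal sketch that lists low-order products m*f1 + n*f2 landing near GPS L1 (1575.42 MHz). The ±1.023 MHz window (roughly the C/A main lobe) and the order limit of 3 are assumptions chosen for illustration; the 325 MHz tone is the hypothetical one asked about above, not something that has been measured.

```python
GPS_L1_MHZ = 1575.42


def products_near(f1_mhz, f2_mhz, target_mhz, tol_mhz, max_order=3):
    """Return (m, n, freq) for products m*f1 + n*f2 within tol of target."""
    hits = []
    for m in range(-max_order, max_order + 1):
        for n in range(-max_order, max_order + 1):
            order = abs(m) + abs(n)
            if order == 0 or order > max_order:
                continue
            freq = m * f1_mhz + n * f2_mhz
            if abs(freq - target_mhz) <= tol_mhz:
                hits.append((m, n, freq))
    return hits


# 1250 MHz is the observed Gen1 spike; 325 MHz is the hypothetical second tone.
for m, n, freq in products_near(1250, 325, GPS_L1_MHZ, tol_mhz=1.023):
    print(f"{m}*1250 + {n}*325 = {freq} MHz")
```

The same function can be pointed at any other pair of frequencies found with the probe to see whether their products fall in the GPS band.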

Hi Mike,

Do you have results for the Gen2 card and the half-size card yet?

Hi Trumany,

We tried a few cards this week:

  1. Mini PCIe half-size card: a similar level of GPS desense is still present.

  2. M2 WLAN card. This connects directly to the M2 slot on the Jetson TX1 devkit. There was some desense in this mode, but it was not as significant as the mini PCIe cards we have tried.

  3. Gen2 PCIe card. This one’s a little interesting. There was significant GPS desense while the system is still in u-boot during startup. I cannot yet confirm if the card was running in Gen1 or Gen2 mode from u-boot.

After the linux-kernel is fully loaded the desense goes away in both 5 GT/s and 2.5 GT/s mode.

The Gen2 card is a USB-3.0 to PCIe adapter, however the USB ports are currently non-functional. I believe it needs an external power supply that I’m still in the process of setting up. I suspect linux may be disabling some functionality (a power-saving mode, maybe?) if no devices are connected after initialization. I will update you after resolving this problem.

We did have another WLAN card intended to test PCIe 2.0 functionality. Qualcomm’s website indicates that the WLAN module supports PCIe 2.1, but the integrated card (from another supplier) didn’t seem to support it. The PCIe link capabilities register only indicated support for Gen 1 speeds.
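For reference, a small sketch of how the Link Capabilities register can be decoded. The field layout (bits 3:0 maximum link speed, bits 9:4 maximum link width) follows the PCIe specification; the register values in the example are hypothetical and are not readings from the cards in this thread.

```python
# Max Link Speed encodings from the PCIe Link Capabilities register.
LINK_SPEEDS_GT_S = {1: 2.5, 2: 5.0, 3: 8.0}


def decode_link_capabilities(lnkcap: int) -> dict:
    """Extract max link speed (GT/s) and max link width from LnkCap."""
    speed_code = lnkcap & 0xF          # bits 3:0
    width = (lnkcap >> 4) & 0x3F       # bits 9:4
    return {
        "max_speed_gt_s": LINK_SPEEDS_GT_S.get(speed_code),  # None if unknown
        "max_width": width,
    }


# Hypothetical raw values: a Gen1 x1 device and a Gen2 x1 device.
print(decode_link_capabilities(0x11))
print(decode_link_capabilities(0x12))
```

On Linux the same information is printed as the LnkCap line of `lspci -vv`, which is a quick way to confirm whether a card advertises 5 GT/s at all.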

So you can neither move the GPS module nor enclose the mPCIe card. It seems the final solution is to use an appropriate Gen2 card, right?

Hi Trumany,

Unfortunately the solution isn’t as simple as choosing an off-the-shelf Gen2 card. Our product uses a semi-custom card, which we could redesign, but we need conclusive evidence that Gen2 solves the issue. So far we do not have that (e.g., we can’t yet reproduce desense in Linux at Gen1 speeds, so there is no reason yet to conclude that Gen2 speeds reduce or solve it). We believe there is a software configuration solution at Gen1 speeds, and recent evidence points in that direction (see below).

The card from Experiment 3 above is a USB to PCIe card, which plugs into the full-size PCIe slot on the Jetson board. On the other hand, our product uses a mini PCIe form factor wireless card. The card itself is semi custom. We’ve also tried several other mPCIe wireless cards with similar desense results. So we have yet to find a mini PCIe card which does not exhibit GPS desense.

We also have some more interesting results to report. Continuing with the experiments using the USB to PCIe (experiment 3 above), we probed the PCIe_TX and REFCLK lines with a high speed oscilloscope to see if there were any differences between U-boot and Linux. We discovered that there were no measurable differences on the TX lines (eye diagrams looked identical). However, for REFCLK, we observed:
-U-boot REFCLK is 99.80MHz with a standard deviation of 291kHz over approximately 23 thousand triggers. Peak-to-peak differential voltage is ~400mV.
-Linux REFCLK is 99.80MHz with a standard deviation of 148kHz over approximately 23 thousand triggers. Peak-to-peak differential voltage is ~1000mV.

It looks like the spread spectrum parameters are different between the two states. Again in Linux, we observed no desense, while in U-boot we do. We have observed differences in the configuration of PLL-E between Linux and U-Boot (CLK_RST_CONTROLLER_PLLE_SS_CNTL_0) and are looking at changing those in Linux to see if we can cause the desense there. However it’s not clear why this card has no desense but others do under Linux. One theory is that the measuring antenna is further from the device under test, so it’s not necessarily a case of “no desense” as much as it is “less desense.” We’ll test that theory.

That said, we are still very interested in the originally requested data on this thread. Can you please assist us with documentation for changing the amplitude of the PCIe TX and RX signals? We would also be interested in any feedback on Robb’s post from 03/30/2017. (eg: is this the correct way to enable/disable spread spectrum and modify the TX amplitude)

One thing you might want to verify is that spread spectrum occupies only the lower side-band, and never pushes the signal to a faster clock rate than the default non-SS rate (faster would imply upper side-band). I could easily see spreading the clock both above and below the center frequency as causing the nearly doubled standard deviation. Should the larger deviation be entirely in the lower side-band I don’t think it would be an error (as a rule of thumb I think any SS within 1 to 3% of primary frequency is considered valid).
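One way to relate the reported standard deviations to a spread width is sketched below. It assumes an ideal triangular modulation profile, under which the instantaneous frequency is uniformly distributed and the standard deviation equals span/sqrt(12). It also assumes the scope's own measurement noise is negligible, which is unlikely to hold exactly, so the results are upper bounds rather than proof of up-spreading.

```python
import math


def implied_span_khz(std_khz: float) -> float:
    """Full spread width implied by a std dev, for a uniform distribution."""
    return std_khz * math.sqrt(12)


def implied_range_mhz(mean_mhz: float, std_khz: float):
    """Lowest and highest frequency implied by the mean and std dev."""
    half_span_mhz = implied_span_khz(std_khz) / 2 / 1000
    return mean_mhz - half_span_mhz, mean_mhz + half_span_mhz


# Mean and std dev values as reported for REFCLK in U-boot and Linux.
for label, std_khz in (("U-boot", 291.0), ("Linux", 148.0)):
    low, high = implied_range_mhz(99.80, std_khz)
    print(f"{label}: span ~{implied_span_khz(std_khz):.0f} kHz, "
          f"range ~{low:.3f} to {high:.3f} MHz")
```

With the reported numbers, the implied spans are roughly 1 MHz and 0.5 MHz, which would correspond to about 1% and 0.5% of 100 MHz if the assumptions hold.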

We will try to reproduce this with USB to PCIe card on TX1 devkit on r24.2.1. Is your card uPD720201 or uPD720202?

Our card is a uPD720201. We are using SD-PEX20139.

We have done a few more experiments with u-boot and linux (a follow-up from MikeB226’s last post):

The configuration differences noted earlier for PLLE are incorrect. I was looking at the wrong source file. Instead, I have confirmed that the following PLLE registers match exactly between u-boot and linux. This was done by directly dumping the following registers:

  • CLK_RST_CONTROLLER_PLLE_SS_CNTL_0
  • CLK_RST_CONTROLLER_PLLE_MISC1_0
  • CLK_RST_CONTROLLER_PLLE_AUX_0
  • CLK_RST_CONTROLLER_PLLREFE_BASE_0
  • CLK_RST_CONTROLLER_PLLREFE_MISC_0
  • CLK_RST_CONTROLLER_PLLREFE_OUT_0

Given that both linux and u-boot seemingly have the same configuration but there are differences in the level of GPS desense, it could provide an avenue to mitigation.

Hi Robb/Mike, so the clock of Gen2 in u-boot and kernel is identical but the level of GPS desense is different?

We have figured out why the desense in u-boot and linux is so different. It turns out that our Gen2 PCIe card was auto-switching into the ASPM L1 state in linux but not in u-boot. Our WLAN cards did not show similar behaviour. After disabling ASPM by setting the kernel parameter pcie_aspm=off, we can easily observe GPS desense in linux on the Jetson carrier board with the TX1.
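When repeating this kind of test it helps to confirm the ASPM state before measuring. A minimal sketch follows, assuming the usual Linux locations `/proc/cmdline` and `/sys/module/pcie_aspm/parameters/policy` (the latter may not exist on every kernel build). The strings in the example are made-up samples, not output captured from the TX1.

```python
import re


def aspm_disabled_on_cmdline(cmdline: str) -> bool:
    """True if the kernel command line contains the pcie_aspm=off token."""
    return "pcie_aspm=off" in cmdline.split()


def active_aspm_policy(policy_text: str):
    """Return the bracketed (active) policy name, or None if none is marked."""
    match = re.search(r"\[(\w+)\]", policy_text)
    return match.group(1) if match else None


# Sample strings standing in for the contents of the two files named above.
sample_cmdline = "console=ttyS0,115200n8 root=/dev/mmcblk0p1 pcie_aspm=off"
sample_policy = "default performance [powersave]"

print(aspm_disabled_on_cmdline(sample_cmdline))
print(active_aspm_policy(sample_policy))
```

In practice the two strings would be read from the files with `open(path).read()`, and the per-device state cross-checked against the LnkCtl line of `lspci -vv`.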

The level of GPS desense appears unchanged when the Gen2 card is toggled between Gen1 and Gen2 link speeds.

We’ve also decided to re-investigate the spread spectrum configuration. Our previous attempt (comment #7) configured the SS registers only on kernel initialization. This time, we decided to simply write to the SS register after everything has booted. This allowed us to toggle SS on and off dynamically when testing.

To disable SS we wrote 0x23011c21 to CLK_RST_CONTROLLER_PLLE_SS_CNTL_0.

To re-enable SS we wrote 0x23010021 to CLK_RST_CONTROLLER_PLLE_SS_CNTL_0.
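The two register values differ in only a few bits, which a quick XOR makes visible. This sketch uses only the two values quoted above. The suggestion that these bits are the spread-spectrum bypass and interpolator-reset controls is an assumption based on how the mainline Linux Tegra PLL driver appears to name them, and should be checked against the TRM.

```python
SS_DISABLED = 0x23011C21  # value written to disable spread spectrum
SS_ENABLED = 0x23010021   # value written to re-enable spread spectrum


def differing_bits(a: int, b: int, width: int = 32) -> list:
    """Return the bit positions at which two register values differ."""
    diff = a ^ b
    return [bit for bit in range(width) if (diff >> bit) & 1]


print(hex(SS_DISABLED ^ SS_ENABLED))             # 0x1c00
print(differing_bits(SS_DISABLED, SS_ENABLED))   # [10, 11, 12]
```

All other fields, including the spread-control fields in the upper bits, are identical between the two writes, so the toggle only gates the modulator rather than reconfiguring it.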

We decided to test with the USB-PCIe Gen2 card (configured in the ASPM L0 mode) in the Jetson devkit, and with our half-size WLAN Gen1 card in our custom carrier board. The desense measurements are:

  • USB3-PCIe, baseline (all devices are off): 50 dB

  • USB3-PCIe, Gen1 link speed, SS on: 28 dB

  • USB3-PCIe, Gen1 link speed, SS off: 34 dB

  • USB3-PCIe, Gen2 link speed, SS on: 28 dB

  • USB3-PCIe, Gen2 link speed, SS off: 34 dB

  • Half-size WLAN, baseline (all devices are off): 40 dB

  • Half-size WLAN, Gen1 link speed, SS on: 34 dB

  • Half-size WLAN, Gen1 link speed, SS off: 35 dB

Since the USB3-PCIe and Half-size WLAN cards are connected to different hardware, we can’t directly compare the signal strength. However, we are able to obtain some repeatable results:

  1. There is no change in desense between Gen1 and Gen2 link speeds
  2. There is improvement in the signal quality when spread spectrum is disabled.
  3. GPS desense is still present whether SS is enabled or disabled.
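To make the comparison explicit, the readings above can be expressed as a drop relative to each card's own baseline. This is a small sketch using only the numbers listed for the L0 test; the dictionary layout and names are illustrative.

```python
# Signal readings in dB, copied from the L0-state measurements above.
measurements_db = {
    "USB3-PCIe": {
        "baseline": 50,
        "Gen1 SS on": 28, "Gen1 SS off": 34,
        "Gen2 SS on": 28, "Gen2 SS off": 34,
    },
    "Half-size WLAN": {
        "baseline": 40,
        "Gen1 SS on": 34, "Gen1 SS off": 35,
    },
}


def desense_db(card: str) -> dict:
    """Drop below the card's own baseline for each test mode, in dB."""
    readings = measurements_db[card]
    baseline = readings["baseline"]
    return {mode: baseline - level
            for mode, level in readings.items() if mode != "baseline"}


for card in measurements_db:
    print(card, desense_db(card))
```

Expressed this way, the USB3-PCIe card loses 22 dB with SS on and 16 dB with SS off at either link speed, while the half-size WLAN card loses 6 dB and 5 dB respectively.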

We also decided to reconfigure the Gen2 card back to the ASPM L1 mode and run the same test:

  • USB3-PCIe, baseline (all devices are off): 47 dB
  • USB3-PCIe, Gen1 link speed, SS on: 46 dB
  • USB3-PCIe, Gen1 link speed, SS off: 45 dB
  • USB3-PCIe, Gen2 link speed, SS on: 46 dB
  • USB3-PCIe, Gen2 link speed, SS off: 45 dB

The results this time seem to contradict the previous findings. There is still no change between Gen1 and Gen2 link speeds, but disabling SS appears to make the GPS desense marginally worse.

We were able to use a near field probe with the Gen2 card in the Jetson devkit and SS disabled. The probing showed modulated spikes placed at 210 kHz in the GPS band when running at Gen1 link speeds. The spikes were at 420 kHz when running at Gen2 link speeds.

The amplitude of the spikes is reduced when SS is enabled, but it looks like they are still present. We didn’t spot this before because the spikes are close to the noise floor when SS is enabled. The noise floor is also higher when SS is enabled.

Our WLAN card also shows modulated spikes at 70 kHz within the GPS band.

We don’t see these spikes when the PCIe card is removed.

On the devkit we were able to verify that the center of the REFCLK spread was below 100 MHz. However, it did extend slightly above 100 MHz.

CLK_RST_CONTROLLER_PLLE_SS_CNTL_0 is configured for down-spread (PLLE_SSCINVERT and PLLE_SSCCENTER are both 0).

Follow-up questions:

  1. Is there any component in the TX1 operating between 70 kHz and 420 kHz?

  2. I attempted to modify the spread control fields in CLK_RST_CONTROLLER_PLLE_SS_CNTL_0 (specifically PLLE_SSCINCINTRV and PLLE_SSCMAX), but encountered some unexpected errors on the bus. Are there other “safe” SS settings that can be used to change the triangle generator?

70 kHz and 420 kHz look like DC/DC switching frequencies. You can try changing the value of the filter capacitors on the PCIe input power rails. For example, change C14 & C11 (on the Jetson carrier board) to 2.2uF or another value to better filter low frequencies.

Hi folks,

A couple notes to expand upon Robb’s notes:
-We observe NO traffic with the oscilloscope when probing PCIE_TX, PCIE_RX in the L1 state. REFCLK is still running.

Here are the spectrum analyzer captures where we’re near-field probing our custom carrier board, near the PCIe connector, with a half-size card installed:
Spread spectrum OFF:

Spread spectrum ON:

Note: The spikes are still present with spread spectrum on, they’re just lower in amplitude than in the off case. When we power down the Jetson board, the spikes disappear. (sorry, didn’t save the screenshot of this one.)

-Again when near-field probing near the PCIe connector, we observe a consistent peak at 2.5GHz regardless of being in Gen1 or Gen2 mode. A peak at 1.25GHz exists in Gen1 mode (as expected), but disappears in Gen2 mode. I suspect the 2.5GHz peak is PLL “E”. Is it possible to adjust the output frequency to confirm this isn’t involved in the desense? If that is not possible, is it possible to adjust the multiplier and divisor to achieve the same output frequency? We couldn’t find registers for this in the TRM.

Trumany, we will add the changed filter cap experiment to our test queue. Our carrier board has very similar decoupling, but our switching power supplies are significantly different.

Thanks folks for the ideas,
Mike

Earlier I had suggested that because GPS frequency of 1.575GHz differed from a prominent spike at 1.25GHz (a difference of 325MHz) that you might search for a 325MHz spike. I am still curious about this (the 1.25GHz and 325MHz could create 1.575GHz if there is a suitable mixer…and the spectrum analyzer might not see the resulting 1.575GHz if the mixer location is something odd or unexpected…but if there is a 325MHz spike then eliminating that from reaching the mixer would eliminate any 1.575GHz harmonic produced by mixing).

We ran a few more tests yesterday and today:

  1. REFCLK was updated to run at non 100 MHz clock rates.

The multipliers and dividers in CLK_RST_CONTROLLER_PLLE_BASE_0 were updated to shift the REFCLK frequency. We varied the frequency from 89.59 MHz to 100.81 MHz.

There was no change in desense.

I understand this is invalid as far as the PCIe specification is concerned, but we’re simply trying to find the source of the problem.

(When implementing this, we discovered an error in the TX1 TRM v1.2 on how to configure the PLDIV parameter for the clocks. Table 18 (Divider Control for CLKOUT) indicates that a PLDIV of 0b1110 results in a CLKOUT of vcoclock/32. In reality, it produces a CLKOUT of vcoclock/24. Our observation seems to align with the L4T source in drivers/platform/tegra/tegra21_clocks.c.)
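The REFCLK shift can be sanity-checked with the usual PLL relation f_out = f_ref * N / (M * P). In this sketch the 38.4 MHz reference and the divide-by-24 post divider come from this thread; the feedback and input divider values (N around 125, M = 2) are assumptions chosen because they reproduce 100 MHz, and are not confirmed register contents.

```python
def pll_output_mhz(ref_mhz: float, n: int, m: int, post_div: int) -> float:
    """Output frequency of a PLL: ref * N / M, then the post divider."""
    return ref_mhz * n / m / post_div


REF_MHZ = 38.4    # main oscillator, as stated in the thread
POST_DIV = 24     # observed behaviour of PLDIV 0b1110
ASSUMED_M = 2     # assumption, not read from hardware

# Sweep the assumed feedback divider N and print the resulting REFCLK.
for n in (112, 120, 125, 126):
    freq = pll_output_mhz(REF_MHZ, n, ASSUMED_M, POST_DIV)
    print(f"N={n}: {freq:.2f} MHz")
```

Under these assumptions each step of N moves REFCLK by 0.8 MHz, and N from 112 to 126 spans about 89.6 to 100.8 MHz, close to the range reported above.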

  2. We updated the clock input reference for PLLE.

CLK_RST_CONTROLLER_PLLE_AUX_0 was updated to set PLLE_REF_SRC. This moved the clock input from the main oscillator (38.4 MHz) to PLLP (408 MHz).

CLK_RST_CONTROLLER_PLLE_BASE_0 was also updated to set the multipliers/dividers to produce a 100 MHz output with the new input clock.

No change in desense was observed by switching to PLLP.

  3. Changing the amplitude of the PCIe TX lanes.

We updated the TX_DRV_AMP in the T_PCIE2_RP_ECTL_1_R1 register and observed a linear improvement in desense: the lower the amplitude, the less interference with GPS. It does not, however, completely eliminate the desense.

The decrease in TX amplitude was verified in an eye-diagram.

The TX_DRV_AMP was modified from the default value of 0x1f down to 0x06 before encountering problems with the PCIe interface.

Since changing REFCLK and PLLE doesn’t affect desense, but modifying T_PCIE2_RP_ECTL_1_R1 does, I suspect that the source may be the PEX/USB3 Brick. This aligns with our previous observations where disabling the TX/RX lanes and leaving REFCLK up eliminated desense.

Most of the registers that may configure these clocks are not fully documented. Can you provide some advice on which registers we can experiment with?

I suspect there may even be modes where we can change the clock multipliers and dividers to different values but still produce the correct output. This may shift the interference with GPS.

We’re also open to testing slightly out of sync clock rates to see if the level of desense changed. This may not be a solution that is shippable, but it is valuable to understand and it could help us create a feasible solution/workaround outside of the TX1.

Hi Robb, as noted in previous comments, these private registers are not public for a reason, and we do not recommend solving this by changing register settings, since it is essentially an EMI/RF compliance problem. Finding the source and then relocating, shielding, or grounding the interference are the best methods.

How do I change a register’s value on TX2?

You’ll want to ask with a “new” question in the TX2 forum (this is TX1):
https://forums.developer.nvidia.com/c/agx-autonomous-machines/jetson-embedded-systems/jetson-tx2/81

You’ll also need to give some context, such as which register and what you are trying to accomplish (if the register is part of some other system, e.g., GPIO, it might have a framework to follow; conversely, maybe it is something for inline assembler, or a tool if just testing).