Hello
I have a problem with Orin on our custom baseboard. The baseboard connects an FPGA to the Jetson via PCIe. The same system works fine with AGX Xavier (JetPack 4.6.1).
With Orin, the UEFI fails with an ASSERT.
Please find attached the full log (re-compiled UEFI in Debug mode).
The FPGA device exposes 2 BARs (2GB and 128B, both 32-bit non-prefetchable). As soon as I erase the FPGA (meaning no PCIe endpoint is present), the system boots fine.
I have JetPack 5.0.1_DP. While I had applied several changes to the device-tree, I have reverted them all to confirm that the problem is independent of those.
Are there any known (new) limitations with Jetson Orin (vs. Xavier)?
Thank you!
Marc
uefi_failure.log (77.0 KB)
I checked the UEFI source, and there is a limitation at the point where you hit the ASSERT. Please check whether it applies to your case: the code requires (Translation & Alignment) == 0.
Translation = GetTranslationByResourceType (RootBridge, Index);
if ((Translation & Alignment) != 0) {
  DEBUG ((
    DEBUG_ERROR,
    "[%a:%d] Translation %lx is not aligned to %lx!\n",
    __FUNCTION__,
    DEBUG_LINE_NUMBER,
    Translation,
    Alignment
    ));
  ASSERT ((Translation & Alignment) == 0);
  //
  // This may be caused by too large alignment or too small
  // Translation; pick the 1st possibility and return out of resource,
  // which can also go thru the same process for out of resource
  // outside the loop.
  //
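To see why a large BAR can trip this check while a small one does not, the condition can be reproduced outside of UEFI. The following is only a minimal sketch, not the EDK2 implementation: the helper names are hypothetical, it assumes that Alignment is the bit mask "BAR size - 1" for a power-of-two BAR, and it assumes that Translation is the offset between the PCI bus address and the CPU address given in the ranges property. The sign of that offset does not matter for the check, because the low bits of (pci - cpu) are zero exactly when the low bits of (cpu - pci) are zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: alignment mask for a power-of-two BAR size,
 * e.g. 2 GB (0x80000000) -> 0x7FFFFFFF. Assumed to correspond to the
 * "Alignment" value in the quoted UEFI check. */
uint64_t bar_alignment_mask(uint64_t bar_size)
{
    assert(bar_size != 0 && (bar_size & (bar_size - 1)) == 0);
    return bar_size - 1;
}

/* Hypothetical helper: offset between the PCI bus address and the CPU
 * address of an aperture. Unsigned wrap-around is intentional, since
 * only the low bits are inspected afterwards. */
uint64_t aperture_translation(uint64_t cpu_base, uint64_t pci_base)
{
    return pci_base - cpu_base;
}

/* Same condition as the quoted code: the aperture passes when
 * (Translation & Alignment) == 0. */
bool translation_is_aligned(uint64_t cpu_base, uint64_t pci_base,
                            uint64_t bar_size)
{
    uint64_t translation = aperture_translation(cpu_base, pci_base);
    return (translation & bar_alignment_mask(bar_size)) == 0;
}
```

With illustrative values (not taken from any real device tree), a PCI address of 0x40000000 mapped at CPU address 0x2728000000 passes the check for a 128 MB BAR but fails it for a 2 GB BAR, whereas the same PCI address mapped at CPU address 0x2740000000 passes for 2 GB as well.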
I can now boot and the device enumerates. I could not test the functionality yet, as I am debugging other issues.
Can you help me confirm that the address translation from Xavier is ok or suggest different translations?
Wayne,
It would be great if you could help on the correct address translation for the PCIe devices. I don’t have the full overview of the address space.
With my modifications as above, the system can boot and I can enumerate the PCIe device. As soon as I try to communicate with the PCIe device (write transaction), the kernel locks up with:
When I (temporarily) reduce the BAR size on the FPGA and revert the ranges definition in the device tree, the PCIe transactions work fine. Thus, the address translation needs to be fixed.
Can you please help me on this? I don’t have the full overview of the address space for the Orin.
Thank you, Marc
I now can talk to the card and basic transactions don’t seem to cause problems. When I then start my DMA transfers, I get CBB errors (either immediately or after a few seconds of full operation).
I have now realized that I get the same stability issues even with the reduced BAR size.
I am separating the topics. I have started a separate topic for the stability issues / CBB errors:
In this topic, I would appreciate it if you could suggest or confirm the adjustment of the address translation to work with larger BARs (2GB).
We can’t use Xavier’s ranges for Orin as is.
Instead, Orin’s ranges can be adjusted to have higher apertures for non-prefetchable BARs.
Please use the following adjusted ranges property and see if it works for your FPGA-based endpoint device.
BTW, it is rare to see devices with huge 32-bit non-prefetchable BARs. Is there any specific reason why this particular device needs such a huge 32-bit NP BAR? Can’t the same BAR be exposed as a prefetchable BAR?
Vidyas,
Thank you. The 32-bit BAR is larger than really needed, but we need more than 128MB. I am aware of other devices that map a memory area this large. As this is access to the on-chip bus, we can’t use prefetchable memory.
I will try your ranges when I am back from my vacation.
Marc
Vidyas,
Your configuration still fails with the alignment error.
I experimented a bit, and it seems that the CPU address for the non-prefetchable memory must be aligned to 0x40000000.
I now see that my previous ranges had a problem (I missed that it should start at 0x27 40000000). As I don’t need much prefetchable memory, I have further lowered the size of the prefetchable memory to make everything aligned:
I’m wondering where this requirement comes from.
FWIW, your modification looks fine to me. Do you mean to say that it doesn’t work even with that? What exactly is the alignment error you are observing? Could you please paste the log?