We have a Jetson AGX Xavier on a 3U VPX card (Wolf 12TP) connected to an FPGA on another 3U VPX card. On the FPGA we have instantiated a Xilinx XDMA endpoint, and we have also tried an example design from the FPGA vendor. The device enumerates on the bus, and lspci reports a Gen3 x4 link. However, when we attempt to read any address in any BAR, we get back all f’s. We sent the binary of the vendor example to the FPGA manufacturer, and they can get it to work when they use a different SBC as the host platform. We are currently running JetPack 4.4.
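For anyone trying to reproduce this, one thing worth ruling out when BAR reads return all f’s is whether memory decoding is enabled and BAR0 is actually assigned. The following is a minimal sketch that decodes the standard PCI header fields (Command register at offset 0x04, BAR0/BAR1 at 0x10/0x14) from raw config-space bytes. The function name `summarize_config` is hypothetical and not part of any tool.

```python
import struct

def summarize_config(cfg: bytes) -> dict:
    """Decode a few standard fields from the first 64 bytes of PCI config space."""
    # Vendor ID (0x00), Device ID (0x02), Command (0x04), all little-endian.
    vendor, device, command = struct.unpack_from("<HHH", cfg, 0x00)
    # BAR0 (0x10) and BAR1 (0x14).
    bar0, bar1 = struct.unpack_from("<II", cfg, 0x10)

    is_io = bool(bar0 & 0x1)                       # bit 0: 1 = I/O BAR
    is_64bit = (not is_io) and ((bar0 >> 1) & 0x3) == 0b10

    address = None
    if not is_io:
        address = bar0 & 0xFFFFFFF0                # strip the flag bits
        if is_64bit:
            address |= bar1 << 32                  # upper half lives in BAR1

    return {
        "vendor_id": vendor,
        "device_id": device,
        "io_space_enabled": bool(command & 0x1),
        "memory_space_enabled": bool(command & 0x2),
        "bus_master_enabled": bool(command & 0x4),
        "bar0_is_64bit": is_64bit,
        "bar0_address": address,
    }
```

On Linux the bytes can be read from the device's sysfs `config` file, for example `open("/sys/bus/pci/devices/0000:01:00.0/config", "rb").read(64)`, where the bus/device/function shown is only a placeholder. If `memory_space_enabled` is false or `bar0_address` is 0, the endpoint would not be expected to claim memory reads.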
We have tried to enable PCIe AER, but nothing shows up in the system log (dmesg). However, if we use PCIe configuration accesses to unmask the UEMsk and CEMsk registers, we do see what we think is an Unsupported Request error, and from the first four words of the TLP we believe we are looking at a valid TLP access attempt. Since this build works with a different SBC, we believe the firmware is good and that something on the Xavier side is causing the issue.
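To make the AER reading less of a guess, here is a minimal sketch that decodes the AER Uncorrectable Error Status register (AER capability offset 0x04) and the first words of the Header Log (offset 0x1C onward) using the bit layouts from the PCIe specification. The function names are hypothetical, and only a handful of common TLP types are listed.

```python
# Bit positions in the AER Uncorrectable Error Status register.
UNCOR_BITS = {
    4: "Data Link Protocol Error",
    5: "Surprise Down",
    12: "Poisoned TLP",
    13: "Flow Control Protocol Error",
    14: "Completion Timeout",
    15: "Completer Abort",
    16: "Unexpected Completion",
    17: "Receiver Overflow",
    18: "Malformed TLP",
    19: "ECRC Error",
    20: "Unsupported Request",
    21: "ACS Violation",
}

# (Fmt, Type) pairs for a few common TLPs; anything else is reported as "other".
TLP_TYPES = {
    (0b000, 0b00000): "MRd (32-bit address)",
    (0b001, 0b00000): "MRd (64-bit address)",
    (0b010, 0b00000): "MWr (32-bit address)",
    (0b011, 0b00000): "MWr (64-bit address)",
    (0b000, 0b00100): "CfgRd0",
    (0b010, 0b00100): "CfgWr0",
    (0b000, 0b01010): "Cpl",
    (0b010, 0b01010): "CplD",
}

def decode_uncor_status(value: int) -> list:
    """Return the names of all error bits set in the status register value."""
    return [name for bit, name in sorted(UNCOR_BITS.items()) if value & (1 << bit)]

def decode_header_log(dw0: int, dw1: int, dw2: int, dw3: int = 0) -> dict:
    """Decode the logged TLP header; address fields apply to memory requests only."""
    fmt = (dw0 >> 29) & 0x7
    typ = (dw0 >> 24) & 0x1F
    rid = (dw1 >> 16) & 0xFFFF
    info = {
        "type": TLP_TYPES.get((fmt, typ), "other"),
        "length_dw": dw0 & 0x3FF,  # raw field; a value of 0 encodes 1024 DW
        "requester": "%02x:%02x.%x" % (rid >> 8, (rid >> 3) & 0x1F, rid & 0x7),
        "address": None,
    }
    if typ == 0b00000:  # memory read/write request
        if fmt & 0x1:   # 4-DW header: 64-bit address in DW2 (high) and DW3 (low)
            info["address"] = (dw2 << 32) | (dw3 & 0xFFFFFFFC)
        else:           # 3-DW header: 32-bit address in DW2
            info["address"] = dw2 & 0xFFFFFFFC
    return info
```

If the decoded address falls inside the BAR range that lspci shows for the endpoint and the type is a memory read, that would support the reading that the host is issuing a valid request and the endpoint (or something in between) is rejecting it.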
There is also a red LED on the Xavier board that comes on during boot. We are told that this LED indicates either an over-temperature or a PCIe failure condition, and that it is driven by the FATAL_ERR# signal of the PLX 8718 switch. We have ruled out over-temperature, but we have not determined what the PCIe failure could be. The LED comes on about 18 seconds after power is applied. Yet even with this error, the PCIe endpoint enumerates and supposedly trains to Gen3 x4. If we remove the FPGA from the chassis and power on the AGX by itself, the LED does not come on.
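Since the link only "supposedly" trains to Gen3 x4, it may be worth decoding the raw Link Status register (PCI Express capability offset 0x12) on each port along the path, including the switch ports, rather than relying on a single lspci line. A minimal sketch, with a hypothetical function name and the field layout from the PCIe specification:

```python
# Current Link Speed encodings in the Link Status register.
LINK_SPEEDS = {
    1: "2.5 GT/s (Gen1)",
    2: "5 GT/s (Gen2)",
    3: "8 GT/s (Gen3)",
    4: "16 GT/s (Gen4)",
}

def decode_link_status(lnksta: int) -> dict:
    """Decode the 16-bit PCIe Link Status register value."""
    speed_code = lnksta & 0xF
    return {
        "speed": LINK_SPEEDS.get(speed_code, "unknown (%d)" % speed_code),
        "width": (lnksta >> 4) & 0x3F,              # negotiated lane count
        "link_training": bool(lnksta & (1 << 11)),  # still training if set
        "dll_link_active": bool(lnksta & (1 << 13)),
    }
```

A port that reports the expected speed and width but has `link_training` set or `dll_link_active` clear (where the port supports reporting it) would point at a link that is not as stable as the enumeration suggests.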
Does anyone have insight into how we can determine what is causing the error condition that lights this LED? At this point we are assuming it is related to our inability to access the BAR regions.