I have a custom carrier board for the Orin AGX running JetPack 6.2. In my topology the Orin is the root complex, connected to a PCIe switch with two FPGAs behind it. I need the FPGAs to be able to send data to each other through the switch freely, without involving the Orin (PCIe peer-to-peer).
The question:
Is it possible to achieve this without disabling the SMMU?
What is happening:
Read/write from Orin to the FPGA BAR is working.
Read/write from the FPGA to the Orin (using the FPGA DMA controller) is working (i.e. the Orin sets up a DMA buffer to receive/send FPGA data, then initializes the DMA controller located on the FPGA and triggers the transaction).
When I try an FPGA write to the other FPGA's BAR address (the one reported by lspci), the transaction never reaches the FPGA. Instead it is redirected to the Orin, which reports an SMMU error for an unexpected transaction (the IOVA is indeed the destination FPGA's BAR). I think I understand why: with the IOMMU enabled, ACS on the PCIe switch is set up to redirect the transaction through the host.
I also tried to disable the SMMU on my PCIe controller (C5, pcie@141a0000) by commenting out “iommus”, “dma-coherent”, “iommu-map-mask” and “iommu-map” in the DTS. I also recompiled the kernel with CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT=y.
What is happening:
lspci no longer lists an IOMMU group for the FPGAs (so it seems to work?), BUT I still see ACS enabled on the switch, and this is worrying (ACSCtl: SrcValid+ TransBlk- ReqRedir+ CmpltRedir+ UpstreamFwd+ EgressCtrl- DirectTrans-)
Orin to FPGA BAR read/write working.
I can no longer do an FPGA DMA read/write from/to the Orin (PCIe bus error CmpltTO).
(Is it possible to edit a post to add new insights? Making a new post for a small addition seems wrong.)
In my last post I tried with CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT=y. Today I also tried with CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT=n, which maybe makes more sense, but it changes nothing.
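A minimal Python sketch for checking the same thing on another switch: it parses an ACSCtl line as printed by lspci -vvv and lists which peer-to-peer redirect flags are set. The function names are hypothetical, and treating ReqRedir and CmpltRedir as the flags that matter for P2P redirection is an assumption of this sketch.

```python
import re

# Assumption: these two flags are the ones that force peer-to-peer
# requests/completions upstream instead of routing them inside the switch.
P2P_REDIRECT_FLAGS = ("ReqRedir", "CmpltRedir")


def parse_acsctl(line):
    """Turn an lspci 'ACSCtl:' line into a {flag_name: enabled} dict."""
    _, _, flags = line.partition("ACSCtl:")
    return {name: sign == "+" for name, sign in re.findall(r"(\w+)([+-])", flags)}


def active_redirects(line):
    """Return the P2P redirect flags that are enabled on this port."""
    state = parse_acsctl(line)
    return [flag for flag in P2P_REDIRECT_FLAGS if state.get(flag)]


if __name__ == "__main__":
    # The line quoted in this post.
    sample = ("ACSCtl: SrcValid+ TransBlk- ReqRedir+ CmpltRedir+ "
              "UpstreamFwd+ EgressCtrl- DirectTrans-")
    print(active_redirects(sample))
```

If the list is not empty for the switch downstream ports, traffic between the two FPGAs is probably still being sent up to the root port.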
Hi Wayne, thank you for your hint.
I managed to make it work, even with the upstream kernel and the SMMU enabled. I wrote these steps to help others in my situation.
1. It really is possible to do a PCIe peer-to-peer transaction without Orin involvement even with the SMMU enabled (at first I thought it was impossible).
2. There is no need to change anything in the DTS or in the kernel configuration (CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT=y is perfectly fine!).
3. What you need to do is disable ACS redirection on your switch. There is a useful kernel command line parameter for this:
pci=disable_acs_redir=pci:<VENDOR_ID>:<DEV_ID>
In my case, for a Microsemi PM8562 switch: pci=disable_acs_redir=pci:11f8:8562
4. You need the real PCIe bus address of the target BAR(s), not the one shown by lspci and not the one returned by pci_resource_start(). Your friend is pci_bus_address()! This is what gave me a lot of headaches, because even with ACS redirection disabled I kept seeing the transactions go to the host. The address I was using was not the real PCIe bus address but a translated one, so the switch did not recognize it as belonging to the other FPGA and always forwarded the TLP to the host.
I verified with the PCIe switch packet counters that peer-to-peer is working without passing through the host, which is just what I need.
Thank you all for the support. I hope my findings help someone else too.
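As a small illustration of the kernel parameter above, here is a hedged Python sketch that builds the pci=disable_acs_redir=... argument from vendor/device IDs and checks whether a given kernel command line already contains it. The helper names are hypothetical. The sketch assumes that several devices are separated by semicolons and that other pci= options are separated by commas.

```python
def device_spec(vendor, device):
    """Format one device as pci:<vendor>:<device> with lowercase hex IDs."""
    return "pci:{:04x}:{:04x}".format(vendor, device)


def acs_redir_param(devices):
    """Build the full kernel argument from a list of (vendor, device) pairs."""
    return "pci=disable_acs_redir=" + ";".join(device_spec(v, d) for v, d in devices)


def cmdline_disables_redir(cmdline, vendor, device):
    """Check whether a kernel command line lists this device in disable_acs_redir."""
    spec = device_spec(vendor, device)
    for token in cmdline.split():
        if token.startswith("pci=") and "disable_acs_redir=" in token:
            # Keep only the value of this option, up to the next pci= option.
            value = token.split("disable_acs_redir=", 1)[1].split(",")[0]
            if spec in value.lower().split(";"):
                return True
    return False


if __name__ == "__main__":
    # IDs of the PM8562 switch mentioned in this post.
    print(acs_redir_param([(0x11F8, 0x8562)]))
```

On a running system the text of /proc/cmdline can be passed to cmdline_disables_redir() to confirm that the parameter actually reached the kernel. On Jetson the argument typically goes on the APPEND line of /boot/extlinux/extlinux.conf, but check that against your own boot setup.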
Just a question. Will your step (4) still be needed if the SMMU is disabled? I mean, you said you always got the translated address. So if the SMMU is not there, will you still get that?
And here is the proof that bus address != resource_start address even with the IOMMU disabled (there are some debug prints, the MATTIA ones; I just wanted to be sure that the IOMMU was disabled for this controller).
To be fair, I can’t make it work at all with the MMU disabled; I can’t even do a DMA transfer between the FPGA DMA controller and the Orin. So I never tried a peer-to-peer FPGA-to-FPGA transaction with the MMU disabled.
EDIT: on second thought, I think it is normal that pci_resource_start() doesn’t return the same address as pci_bus_address(): pci_resource_start() returns a CPU physical address that can be mapped with ioremap(), whereas pci_bus_address() returns a PCIe bus address. I think these two can differ even if no MMU is involved.
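A similar comparison can be made from userspace without a driver. This is a sketch under assumptions: it takes the value programmed into the BAR register in config space as the bus address, and takes the start column of the sysfs resource file as the CPU-side address. The function names and the device path in the comment are hypothetical.

```python
import struct


def bar_bus_address(config, index):
    """Decode memory BAR `index` (0-5) from raw type-0 config-space bytes."""
    offset = 0x10 + 4 * index
    (low,) = struct.unpack_from("<I", config, offset)
    if low & 0x1:
        raise ValueError("BAR %d is an I/O BAR, not a memory BAR" % index)
    address = low & 0xFFFFFFF0  # low 4 bits are type/prefetch flags
    if (low >> 1) & 0x3 == 0x2:  # 64-bit BAR: upper half is in the next register
        (high,) = struct.unpack_from("<I", config, offset + 4)
        address |= high << 32
    return address


def resource_start(resource_text, index):
    """Start address of entry `index` in the text of a sysfs 'resource' file."""
    return int(resource_text.splitlines()[index].split()[0], 16)


def compare(device_dir, index=0):
    """Return (cpu_address, bus_address) for one BAR of a device directory
    such as /sys/bus/pci/devices/<domain:bus:dev.fn> (path is an example)."""
    with open(device_dir + "/config", "rb") as f:
        config = f.read(64)
    with open(device_dir + "/resource") as f:
        text = f.read()
    return resource_start(text, index), bar_bus_address(config, index)
```

lspci -b (bus-centric view) should show the same bus-side addresses, which is a quick way to cross-check the result.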
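To illustrate why the two addresses can differ with no IOMMU in the picture, here is a small Python model of a host bridge window that maps a CPU physical range onto a different PCIe bus range. This is only a conceptual sketch: the Window type, the function name and all the addresses are made up for the example and are not taken from the Orin device tree.

```python
from dataclasses import dataclass


@dataclass
class Window:
    """One host bridge aperture: a CPU range and the bus range it maps to."""
    cpu_base: int
    bus_base: int
    size: int


def cpu_to_bus(windows, cpu_addr):
    """Translate a CPU physical address to the address seen on the PCIe bus."""
    for w in windows:
        if w.cpu_base <= cpu_addr < w.cpu_base + w.size:
            return cpu_addr - w.cpu_base + w.bus_base
    raise ValueError("address 0x%x is not inside any host bridge window" % cpu_addr)


if __name__ == "__main__":
    # Hypothetical windows: one with an offset, one mapped 1:1.
    windows = [
        Window(cpu_base=0x2B28000000, bus_base=0x40000000, size=0x08000000),
        Window(cpu_base=0x2740000000, bus_base=0x2740000000, size=0x40000000),
    ]
    cpu_addr = 0x2B28100000  # what a resource-start style lookup would report
    print(hex(cpu_to_bus(windows, cpu_addr)))  # what the peer must target
```

In a driver this translation is what pci_bus_address(pdev, bar) provides, so the FPGA DMA engine should be programmed with that value. A BAR that sits in a 1:1 window gives the same number from both functions, which may be why the difference is easy to miss.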