FPGA based DMA on TX1 - IOVA mapping issue


We are developing custom logic on an FPGA that supports both video input and output. The FPGA is interfaced over PCIe to a TX1 running R24.2, and the video input and output are exported as V4L2 and FB devices in Linux.

The FPGA can DMA in either direction between FPGA DDR and TX1 DDR, using buffers allocated via dma_alloc_coherent() and the V4L2 subsystem. The following errors are reported when DMA is enabled:

smmu_dump_pagetable(): fault_address=0x00000000fe000000 pa=0xffffffffffffffff bytes=ffffffffffffffff #pte=0 in L2
[ 40.569545] mc-err: (0) csr_afir: EMEM decode error on PDE or PTE entry
[ 40.576211] mc-err: status = 0x6000000e; addr = 0xfe000000
[ 40.581885] mc-err: secure: no, access-type: read, SMMU fault: nr-nw-s

Obviously we are getting this because there is no SMMU mapping from PCIe to the allocated DDR location. What is the right way to achieve the necessary functionality?

  1. Should we allocate as above and then create a mapping in the SMMU? If so, what are the relevant APIs on TX1?
  2. Is it possible to create an IOVA mapping at the time the memory is allocated? If so, what are the relevant APIs/flags on TX1?

Thanks in advance.

Is your FPGA PCIe end point device sitting behind a PCIe bridge?

No, there are no bridges in between. The FPGA (as PCIe endpoint) is connected directly to the PEX4 (J2) connector of the Jetson TX1 Developer Kit carrier board.

Do you see this error after some time, or immediately after starting the DMA operation?
The reason for asking is that, generally, if the SMMU is enabled for PCIe, dma_alloc_coherent() returns addresses starting from 0x80000000 (the maximum address being 0xFFF00000), and it would only reach 0xFE000000 (the address reported in the SMMU error) after all earlier memory regions are exhausted. If the issue is seen immediately, it is very likely that this address is not a legitimate one and that the SMMU page tables have no entry for it.
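
The range check described above can be sketched in a few lines of C. This is only an illustrative user-space helper, not part of any driver. The window bounds are the ones quoted in this reply, the function names are hypothetical, and treating the upper bound as inclusive is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* IOVA window for PCIe as quoted in the reply above.
 * Assumption: the upper bound is treated as inclusive. */
#define PCIE_IOVA_START 0x80000000ULL
#define PCIE_IOVA_END   0xFFF00000ULL

/* Returns true if addr lies inside the quoted PCIe IOVA window. */
static bool iova_in_pcie_window(uint64_t addr)
{
    return addr >= PCIE_IOVA_START && addr <= PCIE_IOVA_END;
}

/* Distance of addr from the start of the window, in bytes.
 * Only meaningful when iova_in_pcie_window(addr) is true. */
static uint64_t iova_offset_from_start(uint64_t addr)
{
    return addr - PCIE_IOVA_START;
}
```

For the faulting address 0xFE000000 the check passes, and the offset from the window start is 0x7E000000 bytes (2016 MiB). That shows how much would have to be allocated first if addresses were handed out upward from 0x80000000.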

We are getting this as soon as the DMA is enabled. In fact, the address 0xFE000000 is where the DMA descriptors are stored.

Regarding dma_alloc_coherent(), we are allocating memory with the (GFP_KERNEL | GFP_DMA32) flags. The total allocation is around 40 MB.

Are any other flags needed apart from these? And is the IOVA<->PA mapping one-to-one?

Any specific reason why GFP_DMA32 is passed? GFP_KERNEL alone should be enough. Can you please try with only GFP_KERNEL?

Using only GFP_KERNEL does not make any difference. Same behavior.
BTW, will this automatically create an IOVA mapping for PCIe?

Yes. Calling dma_alloc_coherent() creates the IOVA mapping for PCIe.
Would it be possible to send us the driver privately so that we can take a look at it?
Since this works for all other upstreamed PCIe endpoint drivers, I don’t see any reason why it shouldn’t work in your case.

Hi vidyas,

I sent the code to you privately two days ago. I hope you can have a look at it. Kindly comment.

Commented offline.

It was our mistake. We had passed the platform device pointer to dma_alloc_coherent() rather than the PCI device pointer. After changing it, the mapping is created automatically.
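
For anyone hitting the same fault, here is a minimal kernel-side sketch of the fix. It is not the actual driver from this thread. The struct and function names (fpga_dev, fpga_alloc_descriptors, fpga_free_descriptors) are hypothetical, and the fragment builds only as part of a kernel module. The one point it illustrates is that the struct device passed to dma_alloc_coherent() is the one embedded in the struct pci_dev.

```c
#include <linux/pci.h>
#include <linux/dma-mapping.h>

/* Hypothetical per-device state; field names are illustrative only. */
struct fpga_dev {
	struct pci_dev *pdev;   /* PCI device handed to the driver's probe() */
	void *desc_cpu;         /* CPU virtual address of the descriptor area */
	dma_addr_t desc_dma;    /* bus address (IOVA) to program into the FPGA */
	size_t desc_size;
};

static int fpga_alloc_descriptors(struct fpga_dev *fpga, size_t size)
{
	/* Use the PCI device's struct device, not a platform device, so the
	 * allocation is mapped for the PCIe client. */
	struct device *dev = &fpga->pdev->dev;

	fpga->desc_cpu = dma_alloc_coherent(dev, size, &fpga->desc_dma,
					    GFP_KERNEL);
	if (!fpga->desc_cpu)
		return -ENOMEM;

	fpga->desc_size = size;
	/* fpga->desc_dma is the address the FPGA DMA engine should use. */
	return 0;
}

static void fpga_free_descriptors(struct fpga_dev *fpga)
{
	/* Free with the same struct device that was used for the allocation. */
	dma_free_coherent(&fpga->pdev->dev, fpga->desc_size,
			  fpga->desc_cpu, fpga->desc_dma);
}
```

The address returned in desc_dma, not a physical address, is what should be written into the FPGA's DMA registers or descriptors.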

Thanks to @vidyas for the solution.

Good to hear that.