DMA transfer to FPGA hangs kernel

Hi All,

We are trying to get our Jetson AGX Orin to talk to an Arria10 FPGA over the PCIe bus. We are using an Arria10 SOM and a PCI carrier board designed by Reflex. The FPGA project is simply the reference design made by Altera/Intel, and the Linux kernel module that talks to the board is the reference code written by Intel (it is supplied with the PCIe HIP for the FPGA design). Since this kernel module code is relatively old, we had to make some minor modifications to compile it for the most recent kernels, but we now have it working very well on several different Intel-based platforms we have in house.

However, when we try to run this on the Orin, we run into trouble. Straight read/write operations to the PCI bus work fine, but DMA transfers lock up the kernel.

The driver works as follows. First, a chunk of kernel memory is reserved for the DMA transfer with the function:

dev_bk->kmem_info.virt_addr = dma_alloc_coherent(&dev_bk->dev->dev, size, &dev_bk->kmem_info.bus_addr, GFP_KERNEL);

After that is done, the kernel memory is mapped to user space with the following code:

pfn = __pa(dev_bk->kmem_info.virt_addr + (pgoff<<PAGE_SHIFT))>>PAGE_SHIFT;
ret = remap_pfn_range(vma, vma->vm_start, pfn, len, vma->vm_page_prot);
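
To make the pfn arithmetic in the first line concrete, here is a small user-space sketch of the same calculation. It is an illustration only: pfn_for_offset is a hypothetical helper, EXAMPLE_PAGE_SHIFT assumes 4 KiB pages, and the helper takes a physical base address directly, standing in for whatever __pa() returns inside the driver.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages, i.e. a page shift of 12. */
#define EXAMPLE_PAGE_SHIFT 12

/*
 * Hypothetical helper mirroring the driver's expression:
 * add the page offset (converted to bytes) to the physical base,
 * then drop the in-page bits to get a page frame number.
 */
uint64_t pfn_for_offset(uint64_t phys_base, uint64_t pgoff)
{
	return (phys_base + (pgoff << EXAMPLE_PAGE_SHIFT)) >> EXAMPLE_PAGE_SHIFT;
}
```

With a made-up base of 0x80000000 and pgoff = 2, this yields pfn 0x80002, which is the value that would then be handed to remap_pfn_range.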

When we start writing to the mapped memory, the kernel hangs with the following messages:

[ 1168.028592] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 1168.028786] rcu: 	8-...0: (2 GPs behind) idle=b0a/1/0x4000000000000000 softirq=4204/4205 fqs=423 
[ 1168.029023] rcu: 	11-...0: (2 ticks this GP) idle=61e/1/0x4000000000000000 softirq=3300/3301 fqs=423 
[ 1168.029285] 	(detected by 6, t=5258 jiffies, g=20305, q=67)
[ 1168.029287] Task dump for CPU 8:
[ 1168.029290] task:sugov:8         state:R  running task     stack:    0 pid:  576 ppid:     2 flags:0x0000002a
[ 1168.029297] Call trace:
[ 1168.029308]  __switch_to+0x104/0x160
[ 1168.029310] Task dump for CPU 11:
[ 1168.029312] task:intel_fpga_pcie state:R  running task     stack:    0 pid: 2471 ppid:  2470 flags:0x00000002
[ 1168.029317] Call trace:
[ 1168.029319]  __switch_to+0x104/0x160
[ 1168.029323]  0x0

The sugov process is also using 100% CPU.

Based on suggestions in other forums/posts, I also tried replacing the call to remap_pfn_range with a call to dma_mmap_coherent, but that call locks the system up immediately, even before anything is written to the mapping (just mmapping kills it). This makes it quite difficult to post any logs, by the way ;-) The dma_mmap_coherent code also works fine on Intel-based platforms, so the code itself appears to be correct.
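
For anyone who wants to compare, this is roughly the shape of the dma_mmap_coherent variant. It is a minimal sketch rather than our exact code: the struct layout, the size field, and the fpga_mmap name are hypothetical, and it assumes filp->private_data was set to the bookkeeping struct in open().

```c
#include <linux/dma-mapping.h>
#include <linux/fs.h>
#include <linux/mm.h>
#include <linux/pci.h>

/* Hypothetical bookkeeping structs, reduced to the fields used in this post. */
struct kmem_info {
	void *virt_addr;      /* CPU address returned by dma_alloc_coherent() */
	dma_addr_t bus_addr;  /* DMA handle returned by dma_alloc_coherent() */
	size_t size;          /* assumed field: size passed to dma_alloc_coherent() */
};

struct dev_bookkeep {
	struct pci_dev *dev;
	struct kmem_info kmem_info;
};

/* Hypothetical mmap handler; assumes private_data was set in open(). */
static int fpga_mmap(struct file *filp, struct vm_area_struct *vma)
{
	struct dev_bookkeep *dev_bk = filp->private_data;
	size_t len = vma->vm_end - vma->vm_start;

	/* Reject mappings that would run past the end of the buffer. */
	if (vma->vm_pgoff > (dev_bk->kmem_info.size >> PAGE_SHIFT) ||
	    len > dev_bk->kmem_info.size - (vma->vm_pgoff << PAGE_SHIFT))
		return -EINVAL;

	/*
	 * Pass the same device, CPU address, and DMA handle that were used
	 * with dma_alloc_coherent(). The call takes vma->vm_pgoff into
	 * account and picks the page protection for the mapping itself.
	 */
	return dma_mmap_coherent(&dev_bk->dev->dev, vma,
				 dev_bk->kmem_info.virt_addr,
				 dev_bk->kmem_info.bus_addr,
				 dev_bk->kmem_info.size);
}
```

The handler would be hooked up through the .mmap member of the driver's struct file_operations.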

I know memory management on ARM platforms is different and I am no expert on that. I am fresh out of ideas to try and I hope you can help me further.

I know it’s generally frowned upon to bump your own topic, but is there really nobody who can give me any pointers on this? Maybe some links to relevant websites (I’ve tried StackOverflow already) or any good books? I’ll take anything.