Porting TSI721 device driver over from x86_64 to aarch64

I am attempting to use a PCIe2-to-SRIO2 bridge adapter (TSI721) on two Jetson TX2s and test communication between them over a QSFP cable. The TX2s can detect these devices according to lspci. I have the driver for this device built for x86_64, and it is compatible with Ubuntu 16.04, but the driver does not compile on the TX2 because the function dma_cache_sync() is missing. I have read that the aarch64 architecture does not provide this DMA cache functionality to the user, and in fact ARM does DMA differently. I verified that it's not in the header files on the TX2, as expected, so I assumed it is managed automatically on aarch64 and simply removed this sync call, but I was not sure if this was the right move. After that, the driver compiled and loaded successfully. However, when I run my test program to communicate between the two Jetsons via these devices, it hangs at my ioctl() call, which leads me to think I am not going about this correctly or that I've got a lot more to do. This is uncharted territory for me, but as far as I understand, aarch64 uses a different implementation of DMA than x86_64. I was comparing the two dma-mapping.h headers between ARM and my Ubuntu host PC, thinking about what the correct thing to do is and whether there's more to it than just refactoring DMA. If I want to get this driver to work on the Jetson TX2, what differences should I look out for and modify? Thank you!

Have you tried replacing dma_cache_sync() with dma_sync_single_for_cpu() or dma_sync_single_for_device()?

Solved my own issue. Actually it seems like the sync function is not needed. This sync call was being made during descriptor setup. The issues I was facing after removing the function call were due to mapping. The main things I had to do to get the device to work:

  1. Remove dma_cache_sync() from the driver code (replacing it with dma_sync_single_for_cpu() also works)
  2. Increase the DMA coherent pool size
  3. Disable the SMMU and change the Tegra PCIe driver's GFP_DMA32 zone to the GFP_DMA zone

In the future I hope to be able to use the SMMU by replacing virt_to_phys() calls with dma_map_single().
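For anyone attempting the same change, here is a minimal sketch of what swapping virt_to_phys() for dma_map_single() might look like. This is not the actual TSI721 code: the helper names example_map_tx_buf() and example_unmap_tx_buf() are hypothetical, and I am assuming a buffer that the CPU fills and the device only reads (DMA_TO_DEVICE). The #else branch holds userspace stand-ins so the snippet compiles on its own; in a real driver only the kernel header applies.

```c
#ifdef __KERNEL__
#include <linux/dma-mapping.h>
#else
/* Userspace stand-ins (NOT the kernel implementation) so this sketch
 * compiles outside the kernel tree. A real driver uses the header above. */
#include <stddef.h>
#include <stdint.h>
#include <errno.h>
typedef uint64_t dma_addr_t;
struct device { int unused; };
enum dma_data_direction { DMA_BIDIRECTIONAL, DMA_TO_DEVICE, DMA_FROM_DEVICE };
#define EXAMPLE_BAD_HANDLE (~(dma_addr_t)0) /* hypothetical error marker */
static dma_addr_t dma_map_single(struct device *dev, void *ptr, size_t size,
				 enum dma_data_direction dir)
{
	(void)dev; (void)size; (void)dir;
	/* Pretend the bus address equals the CPU address; fail on NULL. */
	return ptr ? (dma_addr_t)(uintptr_t)ptr : EXAMPLE_BAD_HANDLE;
}
static int dma_mapping_error(struct device *dev, dma_addr_t addr)
{
	(void)dev;
	return addr == EXAMPLE_BAD_HANDLE;
}
static void dma_unmap_single(struct device *dev, dma_addr_t addr, size_t size,
			     enum dma_data_direction dir)
{
	(void)dev; (void)addr; (void)size; (void)dir;
}
#endif

/*
 * Hypothetical helper: instead of programming the device with
 * virt_to_phys(buf), ask the DMA API for a bus address. With an IOMMU
 * (SMMU) enabled, the returned handle is an I/O virtual address, which
 * is why a raw physical address stops working there.
 */
static int example_map_tx_buf(struct device *dev, void *buf, size_t len,
			      dma_addr_t *handle_out)
{
	dma_addr_t handle = dma_map_single(dev, buf, len, DMA_TO_DEVICE);

	/* The mapping can fail, so always check before using the handle. */
	if (dma_mapping_error(dev, handle))
		return -ENOMEM;

	*handle_out = handle; /* write this value into the descriptor */
	return 0;
}

/* Hypothetical helper: release the mapping once the transfer completes. */
static void example_unmap_tx_buf(struct device *dev, dma_addr_t handle,
				 size_t len)
{
	dma_unmap_single(dev, handle, len, DMA_TO_DEVICE);
}
```

The struct device passed in should be the one belonging to the PCI device (for example &pdev->dev), and the buffer has to come from something like kmalloc(), since dma_map_single() is not meant for vmalloc or stack memory.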