Allocating DMA memory on L4T 28.2

On the Jetson TX1 I am trying to allocate DMA memory for a PCIe network card and map it into user space.
My procedure works on x86/amd64 and even on L4T 24.1 (kernel 3.10.96), but not on the latest L4T 28.2 (kernel 4.4.38).

#Situation
In my mmap function everything seems to work:

pVa = dma_alloc_coherent(&pDevDesc->pPcidev->dev, len, &dmaAddr, GFP_KERNEL);
memset(pVa, 0, len);
vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
remap_pfn_range(vma, vma->vm_start, dmaAddr >> PAGE_SHIFT, len, vma->vm_page_prot);

*((u32 *) pVa) = (u32) dmaAddr;
PRINTK("mmap: mapped DMA memory, Phys:0x%p KVirt:0x%p UVirt:0x%p Size:0x%x\n",(void *)(unsigned long)dmaAddr, (void *)pVa, (void *)vma->vm_start, len); 
// => mmap: mapped DMA memory, Phys:0x0000000080000000 KVirt:0xffffff800cf7d000 UVirt:0x0000007fb79a0000 Size:0x81000

In the application I get memory back from mmap, but its contents are not the same as what the kernel sees.
It is not zeroed, and the physical address I wrote into the first 4 bytes is missing.
When I continue and start using the memory, Ubuntu freezes and a strange pattern appears on the screen.

#Questions
Is this a bug, or is it caused by the SMMU?
Is there a way to handle this without customizing the kernel?
Do you have any suggestions on how I should proceed?

I can’t tell you how to solve this (I haven’t looked at DMA in a long time), but you are probably correct about SMMU involvement. R24.1 was an initial release with a 32-bit user space but a 64-bit kernel space. The DMA address space was 32-bit by default and used physical addresses directly. After the move to 64-bit, some PCI DMA devices did not function correctly because they expected a physical address, when it was really the SMMU they needed to talk to. There are some minor changes to how the address is requested and used when the SMMU is involved. Once that is accounted for it should work (but I don’t recall the specific changes for that).

Please apply the patch below to disable the SMMU for PCIe:

diff --git a/kernel-dts/tegra210-soc/tegra210-soc-base.dtsi b/kernel-dts/tegra210-soc/tegra210-soc-base.dtsi
index dacc13a19cb9..524aff2dbda3 100644
--- a/kernel-dts/tegra210-soc/tegra210-soc-base.dtsi
+++ b/kernel-dts/tegra210-soc/tegra210-soc-base.dtsi
@@ -1629,8 +1629,6 @@
                pinctrl-4 = <&pex_io_dpd_disable_state>;
                pinctrl-5 = <&pex_io_dpd_enable_state>;
  
-               iommus = <&smmu TEGRA_SWGROUP_AFI>;
-
                bus-range = <0x00 0xff>;
                #address-cells = <3>;
                #size-cells = <2>;

Thank you, vdyas, for your reply.
I tried this, but it didn’t change anything.

Are you sure that my problem is caused by the SMMU?

How did you make sure that the system booted with the updated DTB? Can you please verify it by checking the respective PCIe entry under /proc/device-tree/?
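
One way to do that check is sketched below. It looks for an "iommus" property under any PCIe-looking node of the running device tree. The function names are made up for illustration, and the "pcie*" node-name pattern is an assumption, since the exact node name depends on the release.

```shell
#!/bin/sh
# List every "iommus" property that sits below a node whose name starts
# with "pcie". The root defaults to the live tree; the "pcie*" pattern
# is an assumption about how the PCIe controller node is named.
list_pcie_iommus() {
    root="${1:-/proc/device-tree}"
    # The trailing slash makes find descend even when root is a symlink.
    find "$root/" -path '*/pcie*/iommus' 2>/dev/null
}

# Report whether the SMMU binding is still present for PCIe.
pcie_smmu_status() {
    if [ -n "$(list_pcie_iommus "$1")" ]; then
        echo "iommus present"
    else
        echo "iommus absent"
    fi
}
```

Calling pcie_smmu_status with no argument on the Jetson after a reboot should report "iommus absent" if the patched DTB is really the one in use. If dtc is installed, "dtc -I fs -O dts /proc/device-tree" dumps the whole running tree for a closer look.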

The SMMU entry is still in there.
I replaced the dtb files in /boot/ and /boot/dtb/,
but they are not used.

Where is the device tree that is actually used? Or how can I change it?

The device tree used to be that file, but parts of it began to be used in early boot stages (and early boot stages can’t read ext4), so it is now flashed as a partition. Details change depending on the release, but see this (if you have questions about a specific release just ask…most of this will be relevant):
https://devtalk.nvidia.com/default/topic/1031374/jetson-tx1/how-to-update-only-dtb-in-r28-2/post/5247510/#5247510

Basic example (after putting the Jetson in recovery mode and the new dtb in the correct location):

sudo ./flash.sh -r -k DTB jetson-tx1 mmcblk0p1

FYI, it is a good idea to have a clone backup of the rootfs until you know whether this does what you want without flashing the root partition.

Thanks, with the SMMU turned off it works.

But the command

sudo ./flash.sh -r -k DTB jetson-tx1 mmcblk0p1

and also the variant without the “-r”:

sudo ./flash.sh -k DTB jetson-tx1 mmcblk0p1

doesn’t flash the device-tree files.
Judging by its output it seems to work, but nothing changes.

I finally flashed everything:

sudo ./flash.sh jetson-tx1 mmcblk0p1

then it worked.
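
Since the flash.sh output alone did not show whether the DTB was really replaced, one possible check is to fingerprint the running device tree before flashing and again after the reboot. This is only a sketch: dt_fingerprint is a hypothetical helper name, and it assumes sha256sum is available on the target.

```shell
#!/bin/sh
# Print one hash that covers every property file in a device tree.
# The root defaults to the live tree exposed by the kernel.
dt_fingerprint() {
    root="${1:-/proc/device-tree}"
    [ -d "$root" ] || return 1
    (
        # Hash from inside the tree so the listed paths are relative.
        cd "$root" || exit 1
        find . -type f -exec sha256sum {} + | LC_ALL=C sort
    ) | sha256sum | cut -d' ' -f1
}
```

Saving the value before flashing (for example "dt_fingerprint > dt.before") and comparing it with a fresh value after the reboot shows whether the booted tree changed at all. An identical fingerprint means the new DTB was not picked up.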