How to disable SMMU on the TX2 platform?

wlinzhen · June 25, 2018, 10:32am

I encountered two problems while trying to use mmap and the pci_alloc_consistent function to map memory to user space. It seems that the physical address I acquired is wrong.
The reported errors are shown below:
Unhandled fault: level 3 address size fault (0x92000003) at 0x0000007f7bd7a000
and
arm-smmu 12000000.iommu: Unhandled context fault: iova=0x80206000, fsynr=0x13, cb=22, sid=17(0x11 - AFI), pgd=25f705003, pud=25f705003, pmd=fb2fd003, pte=0

I’ve learned from the forum that I may need to disable the SMMU.
I’m new to the Jetson TX2, so I really need help solving this problem.
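
For reference, the arm-smmu line quoted above is essentially a list of key=value fields. Here is a minimal Python sketch for pulling them apart; the helper name parse_smmu_fault and the regular expression are illustrative assumptions, not part of any NVIDIA tool. The sid field (17, i.e. 0x11, labeled AFI) is the stream ID of the faulting master, and pte=0 suggests that no page-table entry existed for the faulting iova.

```python
import re

# Log line copied from the post above.
LINE = ("arm-smmu 12000000.iommu: Unhandled context fault: iova=0x80206000, "
        "fsynr=0x13, cb=22, sid=17(0x11 - AFI), pgd=25f705003, pud=25f705003, "
        "pmd=fb2fd003, pte=0")

def parse_smmu_fault(line):
    """Return the key=value fields of a context fault line as a dict of strings."""
    return dict(re.findall(r"(\w+)=(\w+)", line))

fields = parse_smmu_fault(LINE)
iova = int(fields["iova"], 16)  # faulting I/O virtual address
sid = int(fields["sid"])        # stream ID, printed in decimal before the parenthesis
print(hex(iova), sid, fields["pte"])
```

The values stay as strings in the dict because the log mixes decimal fields (cb, sid) with hexadecimal ones (iova, pgd, pud, pmd).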

linuxdev · June 25, 2018, 3:33pm

It may depend on which version of L4T you are using. See:
https://devtalk.nvidia.com/default/topic/1014290/jetson-tx2/how-to-map-kernel-memory-to-user-memory-/post/5172167/#5172167
https://devtalk.nvidia.com/default/topic/1032474/jetson-tx2/10g-ethernet-for-jetson-tx2-using-pci-e-x4/post/5252746/#5252746

Actual instructions for putting a device tree in place have changed over time.

wlinzhen · July 5, 2018, 2:58am

Thanks for your answer. My L4T version is R28.1, so I think I just need to remove the following:

“#stream-id-cells = <1>;” from the “tegra_pcie” node
“<&{/pcie-controller@10003000} TEGRA_SID_AFI>,” from the “smmu” node
Is that right?

I found these two lines in tegra186-soc-base.dtsi. Does that mean I just need to modify this file and rebuild the dtb in the kernel 4.4 source with this command:
make O=$TEGRA_KERNEL_OUT dtbs

Then I replace the original in Linux_for_Tegra/dtb and flash the Jetson TX2. Would that work?

linuxdev · July 5, 2018, 5:30am

Rebuilding through the kernel source is one way. You’d probably download first via the driver package source_sync.sh app:

./source_sync.sh -k tegra-l4t-r28.1

If you need your existing config, start with the running Jetson’s “/proc/config.gz”. I usually do those builds right on the Jetson. The source_sync.sh script can be copied anywhere you like to get the source.

On the devel kit TX2, the dtb normally flashed is found at the location below within the driver package. This is where you’d place the dtb from your kernel build before running the dtb flash command. If your hardware differs, e.g., a different carrier board, then the dtb would differ:

Linux_for_Tegra_64_tx2/kernel/dtb/tegra186-quill-p3310-1000-c03-00-base.dtb

There have been some patches and bugs so just ask if you run into an issue, but the dtb flash command should be this while the Jetson is in recovery mode:

sudo ./flash.sh -r -k kernel-dtb jetson-tx2 mmcblk0p1

Incidentally, you could decompile tegra186-quill-p3310-1000-c03-00-base.dtb, edit the result, and then recompile it. This avoids building from the kernel source, and the flash step is still the same. E.g.:

# Create a dts:
dtc -I dtb -O dts -o modified_tegra186-quill-p3310-1000-c03-00-base.dts tegra186-quill-p3310-1000-c03-00-base.dtb
# Edit modified_tegra186-quill-p3310-1000-c03-00-base.dts
# Recompile:
dtc -I dts -O dtb -o new_tegra186-quill-p3310-1000-c03-00-base.dtb modified_tegra186-quill-p3310-1000-c03-00-base.dts
# Copy it...perhaps save a backup first of the original tegra186-quill-p3310-1000-c03-00-base.dtb
cp new_tegra186-quill-p3310-1000-c03-00-base.dtb /where/ever/it/is/Linux_for_Tegra/kernel/dtb/tegra186-quill-p3310-1000-c03-00-base.dtb
# Flash as in the previous note...

Consider cloning any time you do something risky. I don’t consider replacing the device tree risky since it can just be repeated until right. On the other hand, if you forgot to use the right arguments it might also flash the rootfs. See:
https://devtalk.nvidia.com/default/topic/1000105/jetson-tx2/tx2-cloning/

wlinzhen · July 5, 2018, 10:00am

Hi linuxdev,
I followed your suggestion, and I think I have now succeeded in disabling the SMMU for PCIe.
I listed the SMMU masters, and here are the results:
ls /sys/kernel/debug/12000000.iommu/masters/
13e10000.host1x 150c0000.nvcsi 15700000.vi 3460000.sdhci
13e10000.host1x:ctx0 15100000.tsecb 15810000.se 3510000.hda
13e10000.host1x:ctx1 15210000.nvdisplay 15820000.se 3530000.xhci
13e10000.host1x:ctx2 15340000.vic 15830000.se 3550000.xudc
13e10000.host1x:ctx3 15380000.nvjpg 15840000.se adsp_audio
13e10000.host1x:ctx4 15480000.nvdec 17000000.gp10b smmu_test
13e10000.host1x:ctx5 154c0000.nvenc 2490000.ether_qos sound_ref
13e10000.host1x:ctx6 15500000.tsec 2993000.adsp
13e10000.host1x:ctx7 15600000.isp 3400000.sdhci
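
A check like the listing above can also be scripted. Here is a minimal Python sketch, assuming a PCIe master would show up with "pcie" somewhere in its entry name. That naming is a guess based on the /pcie-controller@10003000 node and is not confirmed in this thread; pcie_masters is a hypothetical helper.

```python
def pcie_masters(names):
    """Return the entries whose name mentions PCIe (assumed naming: contains 'pcie')."""
    return [n for n in names if "pcie" in n.lower()]

# A few entries copied from the listing above.
listed = ["13e10000.host1x", "3530000.xhci", "2490000.ether_qos", "smmu_test"]
print(pcie_masters(listed))
```

On the Jetson itself the input could come from os.listdir("/sys/kernel/debug/12000000.iommu/masters"), which likely needs root. An empty result is consistent with PCIe no longer being attached to the SMMU.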

Although I think the SMMU is disabled, the problem still seems to exist. I used the pci_alloc_consistent() and kmalloc() functions to allocate memory, and __pa() to convert each virt_addr to a phys_addr, but the results look odd to me.

pci_alloc_consistent()
virt_addr is ffffff8000e83000, phys_addr is ffffffc080e83000, bus_addr is fc002000
virt_addr is ffffff8000e84000, phys_addr is ffffffc080e84000, bus_addr is fc003000
virt_addr is ffffff8000e85000, phys_addr is ffffffc080e85000, bus_addr is fc004000
virt_addr is ffffff8000e86000, phys_addr is ffffffc080e86000, bus_addr is fc005000
virt_addr is ffffff8000e87000, phys_addr is ffffffc080e87000, bus_addr is fc006000
virt_addr is ffffff8000e88000, phys_addr is ffffffc080e88000, bus_addr is fc007000
virt_addr is ffffff8000e89000, phys_addr is ffffffc080e89000, bus_addr is fc008000
virt_addr is ffffff8000e8a000, phys_addr is ffffffc080e8a000, bus_addr is fc009000
virt_addr is ffffff8000e8b000, phys_addr is ffffffc080e8b000, bus_addr is fc00a000
virt_addr is ffffff8000e8c000, phys_addr is ffffffc080e8c000, bus_addr is fc00b000

kmalloc()
virt_addr buffer is ffffffc1e1ebe000, phys_addr buffer is 261ebe000
virt_addr buffer is ffffffc1e1ebd000, phys_addr buffer is 261ebd000
virt_addr buffer is ffffffc1e1eb8000, phys_addr buffer is 261eb8000
virt_addr buffer is ffffffc1e1ebc000, phys_addr buffer is 261ebc000
virt_addr buffer is ffffffc1e1eb9000, phys_addr buffer is 261eb9000
virt_addr buffer is ffffffc1e61a5000, phys_addr buffer is 2661a5000
virt_addr buffer is ffffffc1e61a6000, phys_addr buffer is 2661a6000
virt_addr buffer is ffffffc1e61a7000, phys_addr buffer is 2661a7000
virt_addr buffer is ffffffc1e61a3000, phys_addr buffer is 2661a3000
virt_addr buffer is ffffffc1e61a2000, phys_addr buffer is 2661a2000

wlinzhen · July 5, 2018, 10:08am

As you can see, the phys_addr values from the two functions are quite different. I think the phys_addr from pci_alloc_consistent() is wrong because it looks similar to the virt_addr, and it did not work in the mmap() function.
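
The printed values are consistent with __pa() doing plain linear-map arithmetic (virt - PAGE_OFFSET + PHYS_OFFSET), which only gives a meaningful result for addresses inside the kernel linear map. Here is a minimal Python sketch that reproduces the numbers. The two constants are assumptions inferred from the output above, not read from the running kernel, and pa and in_linear_map are hypothetical helper names.

```python
# Assumed constants, inferred from the addresses printed above.
PAGE_OFFSET = 0xFFFFFFC000000000  # assumed start of the kernel linear map
PHYS_OFFSET = 0x80000000          # assumed physical address where DRAM starts
MASK64 = (1 << 64) - 1

def pa(virt):
    """Mimic linear-map arithmetic, wrapping at 64 bits like an unsigned long."""
    return (virt - PAGE_OFFSET + PHYS_OFFSET) & MASK64

def in_linear_map(virt):
    """True when virt is at or above the assumed linear-map base."""
    return virt >= PAGE_OFFSET

kmalloc_virt = 0xFFFFFFC1E1EBE000   # first kmalloc() line
coherent_virt = 0xFFFFFF8000E83000  # first pci_alloc_consistent() line
print(hex(pa(kmalloc_virt)), in_linear_map(kmalloc_virt))
print(hex(pa(coherent_virt)), in_linear_map(coherent_virt))
```

Under these assumptions the kmalloc() address converts to 261ebe000, matching the log. The pci_alloc_consistent() address lies below the linear-map base, so the subtraction wraps around and yields ffffffc080e83000, which is not a physical address. For that buffer, the bus_addr returned by the allocator is the value to work with.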

linuxdev · July 5, 2018, 8:22pm

Someone else will need to comment, but it would be expected that an address the PCIe device is told about for DMA changes each transfer. If you have reallocated physical memory in kernel space on each transfer, then this too would probably change each transfer (e.g., a kmalloc each transfer instead of re-using a buffer). If the PCIe itself does not complain about the DMA transfer, then I’d think it is the mmap failing.

I do not know enough to debug this in detail, but I found something which is perhaps related…especially if it is an FPGA with user space software:
[url]https://forums.xilinx.com/t5/Embedded-Linux/AXI-DMA-transfer-from-user-space/td-p/739954[/url]

In that case mmap was not being performed as expected since user space never has physical address access and there was a crossing from user to kernel space. I would check out the mmap, and if that isn’t the issue, then someone knowing more about the PCIe DMA driver might need to comment.

wlinzhen · July 6, 2018, 4:29am

Thanks very much! This FPGA issue is the problem I am facing. I will read that and try your suggestion first.

wlinzhen · July 6, 2018, 6:06am

……I misunderstood why the SMMU needs to be disabled. With the SMMU disabled, the phys_addr and the bus_addr are the same. I just need to use pci_alloc_consistent() to get a bus_addr and use it as the phys_addr as well. Now the problem is solved and I can use the mmap function too. Thanks again for your help ^^

linuxdev · July 6, 2018, 4:32pm

Correct. The usual reason for disabling SMMU is that an older (binary) driver cannot be changed to use the virtual-address-to-physical-address translation. Disabling SMMU means there is no need to do that. On a driver you are writing yourself, you’d want to account for translation, since disabling SMMU effectively reduces the DMA address space from 64-bit to 32-bit (which won’t be a problem for a lot of people).
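
The addresses posted earlier in the thread illustrate the 32-bit point. Here is a minimal Python sketch, taking the 32-bit limit mentioned above at face value; fits_dma_mask is a hypothetical helper, not a kernel API.

```python
def fits_dma_mask(addr, bits):
    """True when addr is representable within a DMA mask of the given width."""
    return addr <= (1 << bits) - 1

bus_addr = 0xFC002000       # from the pci_alloc_consistent() output
kmalloc_phys = 0x261EBE000  # from the kmalloc() output
print(fits_dma_mask(bus_addr, 32), fits_dma_mask(kmalloc_phys, 32))
```

The pci_alloc_consistent() bus addresses all sit below 4 GiB, while the kmalloc() buffers shown landed above it, so under this assumption only the former would be directly usable by a device limited to 32-bit addressing.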