How to disable IOMMU in Jetson Orin

Hi

I’m working on transferring data between two Orin modules through a PCIe switch. After using dma_map_xxx to map the source address (allocated with kmalloc) and the destination address (a BAR physical address), I submit the DMA transfer and get errors like:

d@d-desktop:/sys/kernel/debug/ntb_perf/0005:01:00.1$ sudo echo 0 > run
[  238.539299] irq 14: nobody cared (try booting with the "irqpoll" option)
[  238.539758] handlers:
[  238.539832] [<0000000074e49222>] tegra_mcerr_hard_irq threaded [<00000000b6412363>] tegra_mcerr_thread
[  238.540103] Disabling IRQ #14
[  238.559003] mc-err: (255) csw_pcie5w: EMEM address decode error
[  238.559423] mc-err:   status = 0x200100e3; hi_addr_reg = 0x00000027 addr = 0x2740080200
[  238.559897] mc-err:   secure: no, access-type: write
[  238.646719] mc-err: (255) csw_pcie5w: EMEM address decode error
[  238.647208] mc-err:   status = 0x200100e3; hi_addr_reg = 0x00000027 addr = 0x27400b3c00
[  238.647736] mc-err:   secure: no, access-type: write
[  238.754684] mc-err: (255) csw_pcie5w: EMEM address decode error
[  238.754872] mc-err:   status = 0x200100e3; hi_addr_reg = 0x00000027 addr = 0x27400b2e00
[  238.755290] mc-err:   secure: no, access-type: write
[  238.858871] mc-err: (255) csw_pcie5w: EMEM address decode error
[  238.859160] mc-err:   status = 0x200100e3; hi_addr_reg = 0x00000027 addr = 0x2740080000
[  238.859506] mc-err:   secure: no, access-type: write
[  238.963471] mc-err: Too many MC errors; throttling prints
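
One way to narrow this down is to pull the faulting addresses out of the mc-err lines and compare them with the BAR window listed in /proc/iomem. If they fall inside that window, that would suggest the DMA engine was given the CPU physical address of the BAR as its destination. The following is only a minimal userspace sketch: the function names are hypothetical, and the window base and size must come from your own /proc/iomem, not from this thread.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Extract the "addr = 0x..." field from one mc-err status line.
 * Returns 0 on success, -1 if the line has no such field.
 * (Hypothetical helper, not part of any NVIDIA tool.) */
int mcerr_parse_addr(const char *line, uint64_t *addr)
{
    /* The leading space keeps "hi_addr_reg = " from matching. */
    const char *p = strstr(line, " addr = ");
    unsigned long long v;

    if (!p || sscanf(p, " addr = %llx", &v) != 1)
        return -1;
    *addr = (uint64_t)v;
    return 0;
}

/* True if addr lies inside the half-open window [base, base + size).
 * base and size are placeholders for a BAR range read from /proc/iomem. */
int in_window(uint64_t addr, uint64_t base, uint64_t size)
{
    return addr >= base && addr - base < size;
}
```

Feeding each status line from the log to mcerr_parse_addr() and then calling in_window() with the BAR range gives a quick yes/no answer for every fault.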

However, I can see the data flow on my switch monitor.

But the data does not seem to reach the other Orin. I am quite sure the switch is not the problem: if I use memcpy_toio() instead of DMA with exactly the same source and destination addresses, it works and the other Orin receives the data.

I suspect something is wrong with Orin’s IOMMU: it may be blocking the switch DMA engine’s access to host memory.

Is this caused by the IOMMU? How can I fix it?

My BSP version is JetPack 5.0.1.

Thanks.

Hi,

Please try these steps to disable the IOMMU on Orin:

  1. Remove the iommus property from the client device tree node. Take nvjpg as an example.

     nvjpg@15380000 {
                iommus = <&smmu TEGRA_SID_NVJPG>; <= Remove or comment this line
    
                dma-coherent; <= Remove or comment this line if present
    
     };
    
  2. If your kernel has CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT set, apply this patch to the kernel config and replace the kernel image.

diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index 80fa387..64aac59 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -1228,6 +1228,7 @@
 CONFIG_TEGRA_IOMMU_SMMU=y
 CONFIG_ARM_SMMU=y
 CONFIG_ARM_SMMU_DEBUG=y
+# CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT is not set
 CONFIG_ARM_SMMU_V3=y
 CONFIG_SOUNDWIRE=m
 CONFIG_ARCH_TEGRA_210_SOC=y

Hi Wayne,

Thanks for the reply.

  1. Are you sure only these two properties (iommus and dma-coherent) need to be removed? I think two other properties, “iommu-map” and “iommu-map-mask”, may also need to be removed.

  2. There is another question about DMA mapping:

According to the Dynamic DMA mapping Guide in the Linux kernel documentation:

During the enumeration process, the kernel learns about I/O devices and their MMIO space and the host bridges that connect them to the system. For example, if a PCI device has a BAR, the kernel reads the bus address (A) from the BAR and converts it to a CPU physical address (B). The address B is stored in a struct resource and usually exposed via /proc/iomem. When a driver claims a device, it typically uses ioremap() to map physical address B at a virtual address (C). It can then use, e.g., ioread32(C), to access the device registers at bus address A.

If the device supports DMA, the driver sets up a buffer using kmalloc() or a similar interface, which returns a virtual address (X). The virtual memory system maps X to a physical address (Y) in system RAM. The driver can use virtual address X to access the buffer, but the device itself cannot because DMA doesn’t go through the CPU virtual memory system.

In some simple systems, the device can do DMA directly to physical address Y. But in many others, there is IOMMU hardware that translates DMA addresses to physical addresses, e.g., it translates Z to Y. This is part of the reason for the DMA API: the driver can give a virtual address X to an interface like dma_map_single(), which sets up any required IOMMU mapping and returns the DMA address Z. The driver then tells the device to do DMA to Z, and the IOMMU maps it to the buffer at address Y in system RAM.

Now I have a BAR physical address (B in the description above) and a buffer address I allocated with kmalloc (X in the description above).
What should I do if I want to run a DMA transfer between B and X? I know X is easily mapped to Z (dma_addr) with dma_map_single(). But what about B? How do I map B to an address the DMA engine can use?
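
For reference, the kernel DMA API has dma_map_resource() for mapping an MMIO physical address such as B, alongside dma_map_single() for a kmalloc buffer such as X. Whether that resolves the errors in this thread is not established here. The toy userspace model below only illustrates the translation described in the quoted paragraphs: a device address either hits a mapping and is translated, or it faults. All names (toy_iommu, toy_iommu_map, toy_iommu_translate) are hypothetical and this is not how the SMMU driver is implemented.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_MAPPINGS 8

/* One device-address -> physical-address mapping. */
struct toy_mapping {
    uint64_t iova;  /* address the device uses (Z) */
    uint64_t phys;  /* address the IOMMU emits (Y, or a BAR address B) */
    uint64_t size;  /* length of the mapped range in bytes */
};

struct toy_iommu {
    struct toy_mapping map[TOY_MAX_MAPPINGS];
    size_t count;
};

/* Install a mapping; returns 0 on success, -1 if the table is full. */
int toy_iommu_map(struct toy_iommu *mmu, uint64_t iova,
                  uint64_t phys, uint64_t size)
{
    if (mmu->count == TOY_MAX_MAPPINGS)
        return -1;
    mmu->map[mmu->count++] = (struct toy_mapping){ iova, phys, size };
    return 0;
}

/* Translate a device address.
 * Returns 0 and fills *phys on a hit, -1 when nothing covers iova. */
int toy_iommu_translate(const struct toy_iommu *mmu, uint64_t iova,
                        uint64_t *phys)
{
    for (size_t i = 0; i < mmu->count; i++) {
        const struct toy_mapping *m = &mmu->map[i];

        if (iova >= m->iova && iova - m->iova < m->size) {
            *phys = m->phys + (iova - m->iova);
            return 0;
        }
    }
    return -1; /* unmapped: a real IOMMU would report a fault here */
}
```

In this model, mapping only the RAM buffer leaves any access to the BAR address unmapped. Once a second entry covers the BAR range, the same access translates successfully. That second entry is the role a resource mapping plays for B.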

Regards,
ezra

Hi Wayne,

I have already disabled the IOMMU following your steps.

However, I found that with the IOMMU disabled for the device, transfers to its memory are really slow, both by DMA and by memcpy. Before disabling the IOMMU, memcpy speed was about 2.5 GB/s; now it’s only 980 MB/s.

Is there any way to avoid this slow performance?
