iommu unhandled context fault on PCI device DMA

I am having a problem with a driver for a PCI device located behind three transparent P2P bridges.
Unfortunately it’s not possible to connect the device directly to the TX2 PCIe slot.

The driver allocates a buffer for DMA descriptors which it then passes to the device.
When the device attempts to fetch a descriptor, the following message gets logged on the console:

arm-smmu 12000000.iommu: Unhandled context fault: iova=0x269ea1000, fsynr=0x10001, cb=21, sid=17(0x11 - AFI), pgd=0, pud=0, pmd=0, pte=0
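A side note on reading this fault line: assuming the fsynr value printed here is the SMMU's FSYNR0 register, as in the mainline arm-smmu driver, bit 4 (WNR, "write not read") tells whether the faulting access was a write. A minimal sketch of that check follows; the helper name is made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* WNR ("write not read") bit of FSYNR0; mainline arm-smmu.c defines it as (1 << 4).
 * Assumption: the fsynr field in the log line is FSYNR0. */
#define FSYNR0_WNR (1u << 4)

/* Hypothetical helper: true if the faulting transaction was a write. */
static inline bool fsynr_is_write(uint32_t fsynr)
{
    return (fsynr & FSYNR0_WNR) != 0;
}
```

For the fault above, fsynr=0x10001 has bit 4 clear, so assert(!fsynr_is_write(0x10001)) holds. That is a read, which is consistent with the device fetching a descriptor.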

The iova is the correct bus address returned from dma_map_single(). Here is the code fragment:

buf->virt = kmalloc(buf->size, GFP_KERNEL);
buf->phys = dma_map_single(&dev->dev, buf->virt, buf->size, DMA_BIDIRECTIONAL);

I also tried dma_alloc_coherent(); it resulted in the very same iommu error.

Does anyone know how to fix or work around this issue?

IOVA 0x269ea1000 seems to be wrong.
For PCIe, all IOVAs must fall within the 0x8000_0000 ~ 0xFFF0_0000 range.
Are you saying the return value of dma_map_single() is 0x269ea1000?
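A quick way to sanity-check a handle against that window is sketched below. The macro and function names are made up for this sketch, and treating the upper bound as exclusive is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* PCIe IOVA window quoted above. Assumption: start inclusive, end exclusive. */
#define PCIE_IOVA_START 0x80000000ULL
#define PCIE_IOVA_END   0xFFF00000ULL

/* Hypothetical helper: true if the bus address lies inside the PCIe IOVA window. */
static inline bool pcie_iova_in_window(uint64_t iova)
{
    return iova >= PCIE_IOVA_START && iova < PCIE_IOVA_END;
}
```

The reported address fails this check, i.e. assert(!pcie_iova_in_window(0x269ea1000ULL)) holds, because it does not even fit in 32 bits.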

I’ve been getting a similar message (on a TX2, r27.1):

arm-smmu 12000000.iommu: Unhandled context fault: iova=0x68cf4000, fsynr=0x80011, cb=21, sid=17(0x11 - AFI), pgd=0, pud=0, pmd=0, pte=0

This seems to occur when my PCI device sends an MSI interrupt (the interrupts don’t get through to my ISRs).

The iova address looks very similar to the MSI address given in lspci -vv, but it is only the bottom 32 bits.


Capabilities: [50] MSI: Enable+ Count=1/4 Maskable- 64bit+
Address: 0000000268cf4000 Data: 0000

Any ideas? Could it be a 64-bit/32-bit addressing problem?
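The relationship between the two addresses can be checked directly: masking the 64-bit MSI address from lspci down to 32 bits gives exactly the faulting iova. A small sketch, with a helper name that is illustrative only:

```c
#include <assert.h>
#include <stdint.h>

/* Keep only the low 32 bits of a 64-bit bus address, as a 32-bit-only path would. */
static inline uint32_t low32(uint64_t addr)
{
    return (uint32_t)(addr & 0xFFFFFFFFULL);
}
```

With the values above, assert(low32(0x268cf4000ULL) == 0x68cf4000u) holds: the MSI address 0x268cf4000 loses its upper bits and becomes the iova 0x68cf4000 seen in the fault.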

The following patch should solve the issue:

diff --git a/drivers/pci/host/pci-tegra.c b/drivers/pci/host/pci-tegra.c
index 85a1bbe..79d5ab6 100644
--- a/drivers/pci/host/pci-tegra.c
+++ b/drivers/pci/host/pci-tegra.c
@@ -3044,7 +3044,7 @@ static int tegra_pcie_enable_msi(struct tegra_pcie *pcie, bool no_init)
        }

        /* setup AFI/FPCI range */
-       msi->pages = __get_free_pages(GFP_DMA32, 0);
+       msi->pages = __get_free_pages(GFP_DMA, 0);
    }
    base = virt_to_phys((void *)msi->pages);

Let us know the result.

Could it be that we have more of these 32/64 bit issues?

On TX2, virt_to_page() gives back 3ffffbcc203a040, which seems odd and leads to an unavoidable “Unable to handle kernel paging request at virtual address” when you try to use it.

The addressing does not seem to be that far off, but virt_to_page() seems to do something weird.

TX2 (with the patch, at least the driver loads):
[ 63.525223] : Allocate DMA buffer [ffffff8000e81000] bus: [ffffffc1ec9418c0] , size 0x00400000.
[ 63.535384] : Allocate DMA buffer [ffffff8001281000] bus: [ffffffc1ec9418c8] , size 0x00400000.

TX1:
[ 10.835979] : Allocate DMA buffer [ffffffc057800000] bus: [ffffffc0f85624c0] , size 0x00400000.
[ 10.835986] : Allocate DMA buffer [ffffffc057c00000] bus: [ffffffc0f85624c8] , size 0x00400000.

Thanks vidyas. That patch fixed the MSI interrupt issue.

On memory that is allocated with dma_alloc_coherent(), my code works well on i386/x86_64 and on TX1 (both 32- and 64-bit), but not on TX2.

TX2:

[ 1052.721167] virt -> FFFFFF8000E81000
[ 1052.721170] phys -> FFFFFFC080E81000
[ 1052.721172] pfn -> FFFFFFC080E81

TX1:

[ 1691.026382] virt -> FFFFFFC057800000
[ 1691.026388] phys -> D7800000
[ 1691.026392] pfn -> D7800

The virt_to_phys() result seems wrong on TX2.

@MartijnBerger,
I think there is some issue with the dev pointer in this case. Bus addresses must fall within the 0x8000_0000 ~ 0xFFF0_0000 range.

Hi vidyas,

I am attempting to fix this. Can you reproduce it based on the information given, or do you need something else?

Calling virt_to_phys() on the result of dma_alloc_coherent() is the shortest path to having this go wrong that I can find.

Edit:
You may mail me directly on this if there is anything I can help you with. This is a time-critical issue for me.

BTW, what’s the context in which virt_to_phys() is used?
Given that dma_alloc_coherent() already gives a CPU virtual address (to facilitate any access to this region from the CPU), may I know the reason for using virt_to_phys()?

In the minimal case there is no reason. But in the control flow of my driver it is useful.

But in general I am surprised that it does not work. On all platforms I have tried my code on so far, including TX1, the kernel has no problem and virt_to_phys() works as it should. For some reason, on TX2 it gives back what could be a pointer in the kernel’s address space.

I can probably work around it by getting the dma_addr_t and relying on physical space and DMA space being the same, but I am of the opinion that I should not have to, and that virt_to_phys() should work.

This happens because the SMMU is enabled for PCIe.
Even on TX1, if the 24.2 release is used (where the SMMU is enabled for PCIe), the virt_to_phys() macro does not give the actual physical address. TX2 has the SMMU enabled for PCIe by default, so virt_to_phys() doesn’t work as expected there.
The reason is that the virt_to_phys() macro has no knowledge of the SMMU page tables, where the actual mapping from bus address to physical address is held.
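The numbers posted earlier are also consistent with virt_to_phys() being plain linear-map arithmetic that is only meaningful for addresses at or above PAGE_OFFSET. The sketch below is a user-space model, not kernel code. The PAGE_OFFSET and PHYS_OFFSET values are assumptions, chosen because they reproduce the logged TX1 and TX2 results, and the function names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values (39-bit VA arm64 kernel, DRAM starting at 0x80000000).
 * They are not stated in this thread; they are picked to match the logs. */
#define MODEL_PAGE_OFFSET 0xFFFFFFC000000000ULL
#define MODEL_PHYS_OFFSET 0x0000000080000000ULL

/* True if the address lies in the modeled kernel linear map. */
static inline bool model_in_linear_map(uint64_t virt)
{
    return virt >= MODEL_PAGE_OFFSET;
}

/* Linear-map translation: no page-table walk, just an offset.
 * For addresses below MODEL_PAGE_OFFSET the subtraction wraps around. */
static inline uint64_t model_virt_to_phys(uint64_t virt)
{
    return virt - MODEL_PAGE_OFFSET + MODEL_PHYS_OFFSET;
}
```

With these values the TX1 buffer 0xFFFFFFC057800000 is inside the linear map and translates to 0xD7800000, matching the log. The TX2 buffer 0xFFFFFF8000E81000 is below PAGE_OFFSET, so the subtraction wraps and yields 0xFFFFFFC080E81000, the same value that was logged. If this model holds, the TX2 buffer is simply not a linear-map address, and the dma_addr_t returned by dma_alloc_coherent() is the value to hand to the device.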

dma_mmap_attrs() is also not working for me on TX2.

What API should I be using to get this memory into user space?

Also, dma_alloc_attrs() gives 0x8000000 for the first 4M mapping, and the next allocation lands exactly 5M from that allocation, judging by the dma_addr_t.

I have the same problem with a V4L2 driver for a PCI device that supports three video channels through the TX2 PCIe slot.

The first channel is OK. But the others have errors.

I used pci_alloc_coherent to allocate the memory for DMA; it returns the virtual and physical addresses correctly.

[ 40.350211] Virt addr is 1281000, Phys addr is 80500000

But when receiving data, there are errors:

[ 48.704536] arm-smmu 12000000.iommu: Unhandled context fault: iova=0x804e0600, fsynr=0x40013, cb=21, sid=17(0x11 - AFI), pgd=26ff63003, pud=26ff63003, pmd=24d572003, pte=0
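One thing that can be checked from these two log lines is whether the faulting iova lies inside the buffer whose address was printed. A sketch of that range check follows; the function name is invented, and the buffer size is a placeholder because the real size is not given here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if iova falls inside [base, base + size). */
static inline bool iova_in_buffer(uint64_t iova, uint64_t base, uint64_t size)
{
    assert(size > 0);
    return iova >= base && iova - base < size;
}
```

For iova=0x804e0600 and a buffer starting at 0x80500000 the check fails for any size, because the access is 0x1FA00 bytes below the buffer start. Whether that address belongs to another channel's buffer cannot be told from the log alone.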

Does anyone know how to fix or work around this issue?

Hello Vidyas

On a TX2 board with r28.1, using a PCIe device with MSI interrupts, the kernel reports these errors:

arm-smmu 12000000.iommu: Unhandled context fault: iova=0xebe40000, fsynr=0x60001, cb=22, sid=17(0x11 - AFI), pgd=0, pud=0, pmd=0, pte=0
arm-smmu 12000000.iommu: Unhandled context fault: iova=0x00000000, fsynr=0x60011, cb=22, sid=17(0x11 - AFI), pgd=0, pud=0, pmd=0, pte=0

After I apply the patch to drivers/pci/host/pci-tegra.c:

-       msi->pages = __get_free_pages(GFP_DMA32, 0);
+       msi->pages = __get_free_pages(GFP_DMA, 0);

I rebuilt the kernel and the driver, rebooted the TX2, and ran our PCIe device driver with MSI interrupts coming in,
but the arm-smmu error is still there.

Is there another patch we have to apply for our driver case?

We are also getting “arm-smmu 12000000.iommu: Unhandled context fault: iova” errors when trying to access GPU memory above the 4GB limit on TX2 with the 28.1 L4T kernel. I believe there might be other bugs in the kernel with improper 64-bit addressing. The problem can easily be replicated by running MemtestG80 (https://github.com/ihaque/memtestG80.git) on TX2.

An attempt to allocate 4100 MB of RAM works fine, but 4200 fails with the errors below:

#./memtestG80 4100 1

Final error count after 1 iterations over 4100 MiB of GPU memory: 4294967181 errors:

[396634.006720] arm-smmu 12000000.iommu: Unhandled context fault: iova=0x84209100, fsynr=0x13, cb=19, sid=16(0x10 - GPU), pgd=1dc2c9003, pud=1dc2c9003, pmd=1dbd28003, pte=0
[396634.021507] arm-smmu 12000000.iommu: Unhandled context fault: iova=0x87e8ce40, fsynr=0x13, cb=19, sid=16(0x10 - GPU), pgd=1dc2c9003, pud=1dc2c9003, pmd=180276003, pte=0
[396634.036266] arm-smmu 12000000.iommu: Unhandled context fault: iova=0x8bb43440, fsynr=0x13, cb=19, sid=16(0x10 - GPU), pgd=1dc2c9003, pud=1dc2c9003, pmd=180258003, pte=0
[396634.051014] arm-smmu 12000000.iommu: Unhandled context fault: iova=0x0f621780, fsynr=0x3, cb=19, sid=16(0x10 - GPU), pgd=faf57003, pud=faf57003, pmd=226781003, pte=0
[396634.065512] arm-smmu 12000000.iommu: Unhandled context fault: iova=0x9334d880, fsynr=0x13, cb=19, sid=16(0x10 - GPU), pgd=1dc2c9003, pud=1dc2c9003, pmd=18021c003, pte=0
[396634.080276] arm-smmu 12000000.iommu: Unhandled context fault: iova=0x96ff3d40, fsynr=0x13, cb=19, sid=16(0x10 - GPU), pgd=1dc2c9003, pud=1dc2c9003, pmd=1801fe003, pte=0
[396634.095038] arm-smmu 12000000.iommu: Unhandled context fault: iova=0x1aaec5c0, fsynr=0x3, cb=19, sid=16(0x10 - GPU), pgd=faf57003, pud=faf57003, pmd=226727003, pte=0
[396634.109566] arm-smmu 12000000.iommu: Unhandled context fault: iova=0x9e837bc0, fsynr=0x13, cb=19, sid=16(0x10 - GPU), pgd=1dc2c9003, pud=1dc2c9003, pmd=1802b8003, pte=0
[396634.124361] arm-smmu 12000000.iommu: Unhandled context fault: iova=0x22360340, fsynr=0x3, cb=19, sid=16(0x10 - GPU), pgd=faf57003, pud=faf57003, pmd=2266eb003, pte=0
[396634.138900] arm-smmu 12000000.iommu: Unhandled context fault: iova=0x25efa5c0, fsynr=0x3, cb=19, sid=16(0x10 - GPU), pgd=faf57003, pud=faf57003, pmd=2267ae003, pte=0

It’s hard to believe that NVidia didn’t test GPU memory allocations above 4GB on TX2 which is equipped with 8GB… dmesg snippet with errors is attached.

Any suggestions on how it can be solved? Do we need to apply another kernel patch to address it?

-albertr

dmesg.txt (34.7 KB)

vidyas,

Can you help with this issue? It can easily be reproduced by following the steps below:

  1. Clone MemtestG80 from https://github.com/ihaque/memtestG80.git

  2. Compile it for sm_62 (TX2) GPU type, i.e.:

root@tx2:/usr/src/memtestG80/Makefiles# cat Makefile.linux64
ZLIB_DIR=zlib/linux64
CUDA_DIR=/usr/local/cuda/lib64
CFLAGS=-DLINUX -O -Wall -m64
NVCCFLAGS=-DLINUX -L$(ZLIB_DIR) -L$(CUDA_DIR) -O -Xptxas -v -Xcompiler -Wall -m64 -gencode arch=compute_62,code=sm_62
NVCC=nvcc
CXX=g++

all: memtestG80

clean:
	rm -f *.o
	rm -f memtestG80

memtestG80_core.o: memtestG80_core.cu memtestG80_core.h
	$(NVCC) -c $(NVCCFLAGS) -o memtestG80_core.o memtestG80_core.cu

memtestG80: memtestG80_core.o memtestG80_cli.cu
	$(NVCC) $(NVCCFLAGS) -o memtestG80 memtestG80_core.o memtestG80_cli.cu

  3. Run MemtestG80 with > 4GB, i.e.:

root@tx2:/usr/src/memtestG80/Makefiles# ./memtestG80 4200 1

  4. Observe memory allocation failures in MemtestG80 and errors in kernel logs.

We really need to get this issue resolved. We are using TX2 and L4T 28.1 kernel. Please let us know.

-albertr

I’ll check and update in some time

Any news? I also encountered the problem albertr mentioned, and jetson_clocks.sh failed to work, ending with these errors:
/home/nvidia/jetson_clocks.sh: line 299: /sys/kernel/debug/bpmp/debug/clk/emc/rate: No such file or directory
/home/nvidia/jetson_clocks.sh: line 307: /sys/kernel/debug/bpmp/debug/clk/emc/mrq_rate_locked: No such file or directory

Finally, the problem is solved. Please refer to this post:
https://devtalk.nvidia.com/default/topic/1035808/altera-fpga-dma-to-tx2-via-pcie-problem/?offset=10#5269075