iommu unhandled context fault on PCI device DMA

gera_k · April 4, 2017, 7:29pm

I am having a problem with a driver for a PCI device located behind three transparent P2P bridges.
Unfortunately it’s not possible to connect the device directly to the TX2 PCIe slot.

The driver allocates a buffer for DMA descriptors which it then passes to the device.
When the device attempts to fetch a descriptor, the following message gets logged on the console:

arm-smmu 12000000.iommu: Unhandled context fault: iova=0x269ea1000, fsynr=0x10001, cb=21, sid=17(0x11 - AFI), pgd=0, pud=0, pmd=0, pte=0

The iova is the correct bus address returned from dma_map_single(). Here is the code fragment:

buf->virt = kmalloc(buf->size, GFP_KERNEL);
buf->phys = dma_map_single(&dev->dev, buf->virt, buf->size, DMA_BIDIRECTIONAL);

I also tried dma_alloc_coherent(); it resulted in the very same IOMMU error.

Does anyone know how to fix or work around this issue?
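
A small sketch for reading the fsynr field of the fault line above. It assumes the printed value is the raw SMMU context-bank FSYNR0 register and that bit 4 is WNR ("write not read"), as FSYNR0_WNR is defined in the mainline Linux arm-smmu driver. The function name is hypothetical and is not part of any kernel API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 4 of FSYNR0: set for a write, clear for a read.
 * Assumption: the fsynr value in the log is the raw register. */
#define FSYNR0_WNR (1u << 4)

/* Hypothetical helper: true if the faulting transaction was a write. */
bool smmu_fault_is_write(uint32_t fsynr)
{
    return (fsynr & FSYNR0_WNR) != 0;
}
```

Under those assumptions, smmu_fault_is_write(0x10001) is false, i.e. the fault was raised by a read, which fits a device fetching a descriptor.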

vidyas · April 8, 2017, 11:04am

IOVA 0x269ea1000 seems to be wrong.
For PCIe, all IOVAs must fall within the 0x8000_0000 ~ 0xFFF0_0000 range.
Are you saying the return value of dma_map_single() is 0x269ea1000?
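
The range rule above can be written as a quick check. This is only a sketch: the macro and function names are made up for illustration, and treating the upper bound 0xFFF0_0000 as exclusive is an assumption, since the post does not say whether the end point is included.

```c
#include <stdbool.h>
#include <stdint.h>

/* PCIe IOVA window quoted in the post above. */
#define PCIE_IOVA_START 0x80000000ULL
#define PCIE_IOVA_END   0xFFF00000ULL /* assumed exclusive */

/* Hypothetical helper: true if iova lies inside the PCIe IOVA window. */
bool iova_in_pcie_window(uint64_t iova)
{
    return iova >= PCIE_IOVA_START && iova < PCIE_IOVA_END;
}
```

For the reported value, iova_in_pcie_window(0x269ea1000ULL) is false, because the address is above 4 GiB and therefore outside the window.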

MonkeyInMyPocket · April 12, 2017, 3:32pm

I’ve been getting a similar message (on a TX2, r27.1):

arm-smmu 12000000.iommu: Unhandled context fault: iova=0x68cf4000, fsynr=0x80011, cb=21, sid=17(0x11 - AFI), pgd=0, pud=0, pmd=0, pte=0

This seems to occur when my PCI device sends an MSI interrupt (they don’t get through to my ISRs).

The iova address looks very similar to the MSI address given in lspci -vv, but it is only the bottom 32 bits.

…
Capabilities: [50] MSI: Enable+ Count=1/4 Maskable- 64bit+
Address: 0000000268cf4000 Data: 0000
…

Any ideas? Could it be a 64bit/32bit addressing problem?
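
The relationship between the two numbers in this post can be checked with a few lines of C. This is a sketch of the arithmetic only; the function name is hypothetical, and it says nothing about where in the hardware or driver the upper bits are dropped.

```c
#include <stdint.h>

/* Hypothetical helper: keep only the low 32 bits of a 64-bit bus address. */
uint32_t low32(uint64_t addr)
{
    return (uint32_t)(addr & 0xFFFFFFFFULL);
}
```

Applied to the MSI address from lspci, low32(0x0000000268cf4000ULL) gives 0x68cf4000, which is exactly the iova in the fault message.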

vidyas · April 13, 2017, 3:49pm

The following patch should solve the issue:

diff --git a/drivers/pci/host/pci-tegra.c b/drivers/pci/host/pci-tegra.c
index 85a1bbe..79d5ab6 100644
--- a/drivers/pci/host/pci-tegra.c
+++ b/drivers/pci/host/pci-tegra.c
@@ -3044,7 +3044,7 @@ static int tegra_pcie_enable_msi(struct tegra_pcie *pcie, bool no_init)
        }

        /* setup AFI/FPCI range */
-       msi->pages = __get_free_pages(GFP_DMA32, 0);
+       msi->pages = __get_free_pages(GFP_DMA, 0);
    }
    base = virt_to_phys((void *)msi->pages);

Let us know the result.

MartijnBerger · April 21, 2017, 2:42pm

Could it be that we have more of these 32/64 bit issues?

On TX2, virt_to_page() returns 3ffffbcc203a040, which seems odd and leads to an unavoidable “Unable to handle kernel paging request at virtual address” when you try to use it.

The addressing does not seem to be that far off, but virt_to_page() seems to do something weird.

TX2 (with the patch, at least the driver loads):
[ 63.525223] : Allocate DMA buffer [ffffff8000e81000] bus: [ffffffc1ec9418c0] , size 0x00400000.
[ 63.535384] : Allocate DMA buffer [ffffff8001281000] bus: [ffffffc1ec9418c8] , size 0x00400000.

TX1:
[ 10.835979] : Allocate DMA buffer [ffffffc057800000] bus: [ffffffc0f85624c0] , size 0x00400000.
[ 10.835986] : Allocate DMA buffer [ffffffc057c00000] bus: [ffffffc0f85624c8] , size 0x00400000.

MonkeyInMyPocket · April 21, 2017, 4:18pm

Thanks vidyas. That patch fixed the MSI interrupt issue.

MartijnBerger · April 22, 2017, 7:19pm

On memory created with dma_alloc_coherent(), my code works well on i386 / x86_64 and TX1, both 32 and 64 bit, but not on TX2.

TX2:

[ 1052.721167] virt → FFFFFF8000E81000
[ 1052.721170] phys → FFFFFFC080E81000
[ 1052.721172] pfn → FFFFFFC080E81

TX1:

[ 1691.026382] virt → FFFFFFC057800000
[ 1691.026388] phys → D7800000
[ 1691.026392] pfn → D7800

The virt_to_phys() result seems wrong.
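
One possible reading of these numbers, offered as a sketch rather than a diagnosis: if virt_to_phys() is plain linear-map arithmetic (virt - PAGE_OFFSET + PHYS_OFFSET), then both logs can be reproduced in user space. The two offset constants below are assumptions inferred from the TX1 log lines, not values read from the kernel configuration, and all names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, inferred from the TX1 log (virt FFFFFFC057800000 ->
 * phys D7800000). They are not taken from the actual kernel config. */
#define ASSUMED_PAGE_OFFSET 0xFFFFFFC000000000ULL
#define ASSUMED_PHYS_OFFSET 0x0000000080000000ULL
#define ASSUMED_PAGE_SHIFT  12

/* Linear-map arithmetic; unsigned wrap-around is intentional here. */
uint64_t linear_virt_to_phys(uint64_t virt)
{
    return virt - ASSUMED_PAGE_OFFSET + ASSUMED_PHYS_OFFSET;
}

/* Page frame number for a 4 KiB page size. */
uint64_t phys_to_pfn(uint64_t phys)
{
    return phys >> ASSUMED_PAGE_SHIFT;
}

/* True if virt is at or above the assumed start of the linear map. */
bool in_assumed_linear_map(uint64_t virt)
{
    return virt >= ASSUMED_PAGE_OFFSET;
}
```

With these assumptions the TX1 address maps to D7800000 as logged, while the TX2 address FFFFFF8000E81000 lies below the assumed linear map, so the subtraction wraps around and yields FFFFFFC080E81000, the same value as in the TX2 log. That would be consistent with the TX2 buffer not living in the linear map at all, in which case virt_to_phys() is simply not applicable to it.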

vidyas · April 24, 2017, 5:54am

@MartijnBerger,
I think there is some issue with the dev pointer in this case. Bus addresses must fall within the 0x8000_0000 ~ 0xFFF0_0000 range.

MartijnBerger · April 24, 2017, 6:20am

Hi vidyas,

I am attempting to fix this. Can you reproduce it based on the information given, or do you need something else?

Calling virt_to_phys() on the result of dma_alloc_coherent() is the shortest path to having this go wrong that I can find.

Edit:
You may mail me directly if there is anything I can help you with. This is a time-critical issue for me.

vidyas · April 24, 2017, 11:51am

BTW, what’s the context in which virt_to_phys() is used?
Given that dma_alloc_coherent() already gives a CPU virtual address (to facilitate any access to this region from the CPU), may I know what the reason is for using virt_to_phys()?

MartijnBerger · April 24, 2017, 2:02pm

In the minimal case there is no reason. But in the control flow of my driver it is useful.

In general, though, I am surprised that it does not work. On every other platform I have tried my code on, including TX1, the kernel has no problem and virt_to_phys() works as it should. For some reason, on TX2 it gives back what could be a pointer in the kernel’s address space.

I can probably work around it by getting the dma_addr_t and relying on physical space and DMA space being the same, but I am of the opinion that I should not have to and that virt_to_phys() should work.

vidyas · April 28, 2017, 7:39am

This is because the SMMU is enabled for PCIe.
Even on TX1, if the 24.2 release is used (where the SMMU is enabled for PCIe), the virt_to_phys() macro would not give the actual physical address. TX2 has the SMMU enabled for PCIe by default, so virt_to_phys() doesn’t work as expected.
The reason is that the virt_to_phys() macro has no knowledge of the SMMU page tables, where the actual mapping from bus address to physical address is kept.
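
To make the point about page tables concrete, here is a toy model in user-space C. It is not the SMMU driver and does not mirror its data structures: the flat table, the names, and any physical page numbers used with it are hypothetical. It only illustrates that a bus address can be turned into a physical address by looking it up in a table, not by arithmetic on a CPU virtual address.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12
#define TOY_FAULT      UINT64_MAX /* stands in for "pte=0" in the logs */

/* One entry of a made-up, flat IOMMU table: IOVA page -> physical page. */
struct toy_mapping {
    uint64_t iova_page;
    uint64_t phys_page;
};

/* Translate a bus address by table lookup; TOY_FAULT if nothing is mapped. */
uint64_t toy_iommu_translate(const struct toy_mapping *table, size_t n,
                             uint64_t iova)
{
    uint64_t page = iova >> TOY_PAGE_SHIFT;
    uint64_t offset = iova & ((1ULL << TOY_PAGE_SHIFT) - 1);

    for (size_t i = 0; i < n; i++) {
        if (table[i].iova_page == page)
            return (table[i].phys_page << TOY_PAGE_SHIFT) | offset;
    }
    return TOY_FAULT;
}
```

In this model, an address whose page has an entry translates cleanly, and any other address produces a fault, which is the situation the "Unhandled context fault" messages report.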

MartijnBerger · May 11, 2017, 8:45am

dma_mmap_attrs() is also not working for me on TX2.

What API should I be using to get this memory into user space?

Also, dma_alloc_attrs() is giving 0x8000000 on the first mapping of 4M, and the next allocation lands exactly 5M from that allocation, judging by the dma_addr_t.

yyu · July 27, 2017, 8:34am

I have the same problem with a V4L2 driver for a PCI device that supports three video channels through the TX2 PCIe slot.

The first channel is OK, but the others have errors.

I used pci_alloc_coherent() to allocate the memory for DMA; it returns the virtual and physical addresses correctly.

[ 40.350211] Virt addr is 1281000, Phys addr is 80500000

But when receiving data, there are errors:

[ 48.704536] arm-smmu 12000000.iommu: Unhandled context fault: iova=0x804e0600, fsynr=0x40013, cb=21, sid=17(0x11 - AFI), pgd=26ff63003, pud=26ff63003, pmd=24d572003, pte=0
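
A quick way to compare the faulting iova with the buffer address logged above is to compute how far apart they are. This is a sketch with hypothetical function names; the buffer size is not given in the post, so any size passed to the second helper is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Signed distance in bytes from the buffer base to the faulting IOVA.
 * A negative result means the access landed below the buffer. */
int64_t iova_offset_from_base(uint64_t fault_iova, uint64_t dma_base)
{
    if (fault_iova >= dma_base)
        return (int64_t)(fault_iova - dma_base);
    return -(int64_t)(dma_base - fault_iova);
}

/* True if fault_iova lies inside [dma_base, dma_base + size). */
bool iova_in_buffer(uint64_t fault_iova, uint64_t dma_base, uint64_t size)
{
    return fault_iova >= dma_base && fault_iova - dma_base < size;
}
```

With the values from this post, iova_offset_from_base(0x804e0600ULL, 0x80500000ULL) is -0x1FA00, so the faulting access is about 126 KiB below the address reported for the buffer, and it is outside the buffer whatever its size.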

Does anyone know how to fix or work around this issue?

zhao69623 · February 13, 2018, 4:57pm

Hello Vidyas,

On a TX2 board with r28.1, our PCIe device uses MSI interrupts, and the kernel reports these errors:

arm-smmu 12000000.iommu: Unhandled context fault: iova=0xebe40000, fsynr=0x60001, cb=22, sid=17(0x11 - AFI), pgd=0, pud=0, pmd=0, pte=0
arm-smmu 12000000.iommu: Unhandled context fault: iova=0x00000000, fsynr=0x60011, cb=22, sid=17(0x11 - AFI), pgd=0, pud=0, pmd=0, pte=0

After I applied the patch to drivers/pci/host/pci-tegra.c:

-  msi->pages = __get_free_pages(GFP_DMA32, 0);
+  msi->pages = __get_free_pages(GFP_DMA, 0);

I rebuilt the kernel and driver, rebooted the TX2, and ran our PCIe device driver with MSI interrupts coming in, but the arm-smmu error is still there.

Is there another patch we have to apply for our driver’s case?

albertr · May 31, 2018, 5:13pm

We are also getting “arm-smmu 12000000.iommu: Unhandled context fault: iova” errors when trying to access GPU memory above the 4GB limit on TX2 with the 28.1 L4T kernel. I believe there might be other bugs in the kernel with improper 64-bit addressing. The problem can be easily replicated by running MemtestG80 (https://github.com/ihaque/memtestG80.git) on TX2, for instance.

An attempt to allocate 4100 MB of RAM works fine, but 4200 fails with the errors below:

#./memtestG80 4100 1

Final error count after 1 iterations over 4100 MiB of GPU memory: 4294967181 errors:

[396634.006720] arm-smmu 12000000.iommu: Unhandled context fault: iova=0x84209100, fsynr=0x13, cb=19, sid=16(0x10 - GPU), pgd=1dc2c9003, pud=1dc2c9003, pmd=1dbd28003, pte=0
[396634.021507] arm-smmu 12000000.iommu: Unhandled context fault: iova=0x87e8ce40, fsynr=0x13, cb=19, sid=16(0x10 - GPU), pgd=1dc2c9003, pud=1dc2c9003, pmd=180276003, pte=0
[396634.036266] arm-smmu 12000000.iommu: Unhandled context fault: iova=0x8bb43440, fsynr=0x13, cb=19, sid=16(0x10 - GPU), pgd=1dc2c9003, pud=1dc2c9003, pmd=180258003, pte=0
[396634.051014] arm-smmu 12000000.iommu: Unhandled context fault: iova=0x0f621780, fsynr=0x3, cb=19, sid=16(0x10 - GPU), pgd=faf57003, pud=faf57003, pmd=226781003, pte=0
[396634.065512] arm-smmu 12000000.iommu: Unhandled context fault: iova=0x9334d880, fsynr=0x13, cb=19, sid=16(0x10 - GPU), pgd=1dc2c9003, pud=1dc2c9003, pmd=18021c003, pte=0
[396634.080276] arm-smmu 12000000.iommu: Unhandled context fault: iova=0x96ff3d40, fsynr=0x13, cb=19, sid=16(0x10 - GPU), pgd=1dc2c9003, pud=1dc2c9003, pmd=1801fe003, pte=0
[396634.095038] arm-smmu 12000000.iommu: Unhandled context fault: iova=0x1aaec5c0, fsynr=0x3, cb=19, sid=16(0x10 - GPU), pgd=faf57003, pud=faf57003, pmd=226727003, pte=0
[396634.109566] arm-smmu 12000000.iommu: Unhandled context fault: iova=0x9e837bc0, fsynr=0x13, cb=19, sid=16(0x10 - GPU), pgd=1dc2c9003, pud=1dc2c9003, pmd=1802b8003, pte=0
[396634.124361] arm-smmu 12000000.iommu: Unhandled context fault: iova=0x22360340, fsynr=0x3, cb=19, sid=16(0x10 - GPU), pgd=faf57003, pud=faf57003, pmd=2266eb003, pte=0
[396634.138900] arm-smmu 12000000.iommu: Unhandled context fault: iova=0x25efa5c0, fsynr=0x3, cb=19, sid=16(0x10 - GPU), pgd=faf57003, pud=faf57003, pmd=2267ae003, pte=0

It’s hard to believe that NVidia didn’t test GPU memory allocations above 4GB on TX2 which is equipped with 8GB… dmesg snippet with errors is attached.
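
For reference, the sizes and the error count in this post can be placed relative to the 32-bit boundary with a little arithmetic. This sketch only shows where that boundary sits; it does not establish that 32-bit truncation is the cause of the faults. The function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Convert a size in MiB to bytes. */
uint64_t mib_to_bytes(uint64_t mib)
{
    return mib << 20;
}

/* True if the value can be held in an unsigned 32-bit integer. */
bool fits_in_32_bits(uint64_t value)
{
    return value <= UINT32_MAX;
}

/* Reinterpret an unsigned 32-bit count as a signed 32-bit value. */
int64_t as_signed32(uint32_t value)
{
    if (value > (uint32_t)INT32_MAX)
        return (int64_t)value - 4294967296LL;
    return (int64_t)value;
}
```

mib_to_bytes(4100) is 0x100400000, which no longer fits in 32 bits, and the reported error count 4294967181 equals 2^32 - 115, so as_signed32(4294967181u) is -115. The latter may hint at a wrapped or negative counter in the test tool, but that is a guess.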

Any suggestions on how it can be solved? Do we need to apply another kernel patch to address it?

-albertr

dmesg.txt (34.7 KB)

albertr · June 1, 2018, 4:45pm

vidyas,

Can you help with this issue? It can easily be reproduced by following the steps below:

Clone MemtestG80 from https://github.com/ihaque/memtestG80.git
Compile it for sm_62 (TX2) GPU type, i.e.:

root@tx2:/usr/src/memtestG80/Makefiles# cat Makefile.linux64
ZLIB_DIR=zlib/linux64
CUDA_DIR=/usr/local/cuda/lib64
CFLAGS=-DLINUX -O -Wall -m64
NVCCFLAGS=-DLINUX -L$(ZLIB_DIR) -L$(CUDA_DIR) -O -Xptxas -v -Xcompiler -Wall -m64 -gencode arch=compute_62,code=sm_62
NVCC=nvcc
CXX=g++

all: memtestG80

clean:
	rm -f *.o
	rm -f memtestG80

memtestG80_core.o: memtestG80_core.cu memtestG80_core.h
	$(NVCC) -c $(NVCCFLAGS) -o memtestG80_core.o memtestG80_core.cu

memtestG80: memtestG80_core.o memtestG80_cli.cu
	$(NVCC) $(NVCCFLAGS) -o memtestG80 memtestG80_core.o memtestG80_cli.cu

Run MemtestG80 with > 4GB, i.e.:

root@tx2:/usr/src/memtestG80/Makefiles# ./memtestG80 4200 1

Observe memory allocation failures in MemtestG80 and errors in kernel logs.

We really need to get this issue resolved. We are using TX2 and L4T 28.1 kernel. Please let us know.

-albertr

vidyas · June 21, 2018, 6:08am

I’ll check and update in some time

ioezhanggang · July 3, 2018, 2:54am

Any news? I also encountered the problem albertr mentioned, and jetson_clocks.sh failed to work, ending with these errors:
/home/nvidia/jetson_clocks.sh: line 299: /sys/kernel/debug/bpmp/debug/clk/emc/rate: No such file or directory
/home/nvidia/jetson_clocks.sh: line 307: /sys/kernel/debug/bpmp/debug/clk/emc/mrq_rate_locked: No such file or directory

ioezhanggang · July 3, 2018, 1:00pm

Finally, the problem is solved. Please refer to this post:
