Jetson Orin PCIe NTB function fails with DMA

Hi,

I’m using two Jetson Orin Developer Kits and a Microchip Switchtec PCIe switch to test the PCIe NTB function with ntb_perf.c and ntb_hw_switchtec.c (the Switchtec PCIe driver), both provided by the kernel.

My BSP version is JetPack 5.0.1.

The NTB test works without DMA.
However, the problem appears when I test NTB with DMA:

sudo modprobe ntb_perf chunk_order=20 total_order=25 use_dma=1

Then, when I start a transfer, I get EMEM address decode errors like the ones below:

d@d-desktop:/sys/kernel/debug/ntb_perf/0005:01:00.0$ sudo echo 0 > run
[  150.540694] mc-err: (255) csw_axisw: EMEM address decode error
[  150.540991] mc-err:   status = 0x2001008d; hi_addr_reg = 0x00000027 addr = 0x2740400400
[  150.541396] mc-err:   secure: no, access-type: write
[  150.617862] mc-err: (255) csw_axisw: EMEM address decode error
[  150.619921] mc-err:   status = 0x2001008d; hi_addr_reg = 0x00000027 addr = 0x2740400600
[  150.620362] mc-err:   secure: no, access-type: write
[  150.626044] irq 14: nobody cared (try booting with the "irqpoll" option)
[  150.626288] handlers:
[  150.626355] [<0000000068f128fa>] tegra_mcerr_hard_irq threaded [<00000000123ec76b>] tegra_mcerr_thread
[  150.626598] Disabling IRQ #14
[  150.626706] mc-err: (255) csw_axisw: EMEM address decode error
[  150.626862] mc-err:   status = 0x2001008d; hi_addr_reg = 0x00000027 addr = 0x2740400000
[  150.627074] mc-err:   secure: no, access-type: write
[  150.627216] mc-err: (255) csw_axisw: EMEM address decode error
[  150.627372] mc-err:   status = 0x2001008d; hi_addr_reg = 0x00000027 addr = 0x2740400200
[  150.627579] mc-err:   secure: no, access-type: write
[  150.728606] mc-err: Too many MC errors; throttling prints
^C[  179.420942] WARNING: CPU: 1 PID: 125 at drivers/iommu/io-pgtable-arm.c:593 __arm_lpae_unmap+0x380/0x490
[  179.421486] ---[ end trace 4ee791a0b2d23248 ]---
[  179.421668] WARNING: CPU: 1 PID: 125 at drivers/iommu/dma-iommu.c:507 __iommu_dma_unmap+0xf8/0x110
[  179.422028] ---[ end trace 4ee791a0b2d23249 ]---
[  179.422514] WARNING: CPU: 3 PID: 7 at drivers/iommu/io-pgtable-arm.c:593 __arm_lpae_unmap+0x380/0x490
[  179.423004] ---[ end trace 4ee791a0b2d2324a ]---
[  179.423145] WARNING: CPU: 3 PID: 7 at drivers/iommu/dma-iommu.c:507 __iommu_dma_unmap+0xf8/0x110
[  179.423515] ---[ end trace 4ee791a0b2d2324b ]---
[  179.423786] WARNING: CPU: 10 PID: 561 at drivers/iommu/io-pgtable-arm.c:593 __arm_lpae_unmap+0x380/0x490
[  179.424216] ---[ end trace 4ee791a0b2d2324c ]---
[  179.424374] WARNING: CPU: 10 PID: 561 at drivers/iommu/dma-iommu.c:507 __iommu_dma_unmap+0xf8/0x110
[  179.424715] ---[ end trace 4ee791a0b2d2324d ]---
[  179.424896] WARNING: CPU: 10 PID: 614 at drivers/iommu/io-pgtable-arm.c:593 __arm_lpae_unmap+0x380/0x490
[  179.426872] ---[ end trace 4ee791a0b2d2324e ]---
[  179.427078] WARNING: CPU: 10 PID: 614 at drivers/iommu/dma-iommu.c:507 __iommu_dma_unmap+0xf8/0x110
[  179.436111] ---[ end trace 4ee791a0b2d2324f ]---
[  179.441455] WARNING: CPU: 5 PID: 1834 at drivers/iommu/io-pgtable-arm.c:593 __arm_lpae_unmap+0x380/0x490
[  179.450438] ---[ end trace 4ee791a0b2d23250 ]---
[  179.454958] WARNING: CPU: 5 PID: 1834 at drivers/iommu/dma-iommu.c:507 __iommu_dma_unmap+0xf8/0x110
[  179.464068] ---[ end trace 4ee791a0b2d23251 ]---
[  179.468845] WARNING: CPU: 6 PID: 1835 at drivers/iommu/io-pgtable-arm.c:593 __arm_lpae_unmap+0x380/0x490
[  179.478324] ---[ end trace 4ee791a0b2d23252 ]---
[  179.482872] WARNING: CPU: 6 PID: 1835 at drivers/iommu/dma-iommu.c:507 __iommu_dma_unmap+0xf8/0x110
[  179.491944] ---[ end trace 4ee791a0b2d23253 ]---
[  179.497313] WARNING: CPU: 4 PID: 1836 at drivers/iommu/io-pgtable-arm.c:593 __arm_lpae_unmap+0x380/0x490
[  179.506245] ---[ end trace 4ee791a0b2d23254 ]---
[  179.510794] WARNING: CPU: 4 PID: 1836 at drivers/iommu/dma-iommu.c:507 __iommu_dma_unmap+0xf8/0x110
[  179.519860] ---[ end trace 4ee791a0b2d23255 ]---

It seems the initialization of my NTB drivers with DMA worked fine on both sides:

d@d-desktop:/sys/kernel/debug/ntb_perf/0005:01:00.0$ sudo cat info
    Performance measuring tool info:

Local port 0, Global index 0
Test status: idle
Port 0 (0), Global index 0:
        Link status: up
        Out buffer addr 0xffff800022400000
        Out buff phys addr 0x0000002740400000[p]
        Out buffer size 0x0000000000400000
        Out buffer xlat 0x00000000ff800000[p]
        In buffer addr 0xffff8000209ff000
        In buffer size 0x0000000000400000
        In buffer xlat 0x00000000ff800000[p]

All of the source code (including JetPack 5.0.1 and all drivers) is exactly the same on both sides.

Sorry for the late response. Our team will investigate and provide suggestions soon. Thanks.

Hi,

For the SMMU issue, please check the following.

Tegra RP → Endpoint: the PCIe client driver (such as NVMe) initiates the EP’s DMA read from Tegra system memory to the Endpoint BAR.
So please check that SRC is a system memory IOVA and DST is the EP’s BAR. The EP’s BAR address can be checked with lspci -vvv |grep region

Endpoint → Tegra RP: the Endpoint firmware initiates the EP’s DMA write towards Tegra system memory. So DST should be a Tegra system memory IOVA.

Please make sure the addresses are correct first.

Thanks for your reply.

I am using NTB, so there is no EP, only two hosts, both of which are Jetson Orin. They run the same kernel code and PCIe driver.

lspci -vvv |grep region:

My BAR0 and BAR2 are set by the PCIe switch to 4 MB and 64 MB, which correspond to the memory addresses 0x2744000000 and 0x2740000000.

The first Orin:

0005:01:00.0 Bridge: PMC-Sierra Inc. Device 4100
        Subsystem: PMC-Sierra Inc. Device 4100
        Flags: bus master, fast devsel, latency 0
        Memory at 2744000000 (64-bit, prefetchable) [size=4M]
        Memory at 2740000000 (64-bit, prefetchable) [size=64M]
        Capabilities: [40] MSI: Enable- Count=1/4 Maskable- 64bit+
        Capabilities: [50] MSI-X: Enable+ Count=4 Masked-
        Capabilities: [5c] Power Management version 3
        Capabilities: [64] Express Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [148] Power Budgeting <?>
        Capabilities: [158] Multicast
        Capabilities: [188] Secondary PCI Express
        Capabilities: [1b4] Device Serial Number 50-0e-00-4a-00-00-00-01
        Capabilities: [1c0] Data Link Feature <?>
        Capabilities: [1cc] Physical Layer 16.0 GT/s <?>
        Capabilities: [20c] Lane Margining at the Receiver <?>
        Capabilities: [7f8] Vendor Specific Information: ID=ffff Rev=1 Len=808 <?>
        Kernel driver in use: switchtec
        Kernel modules: switchtec

The second Orin:

0005:01:00.1 Bridge: PMC-Sierra Inc. Device 4100
        Subsystem: PMC-Sierra Inc. Device 4100
        Flags: bus master, fast devsel, latency 0
        Memory at 2744000000 (64-bit, prefetchable) [size=4M]
        Memory at 2740000000 (64-bit, prefetchable) [size=64M]
        Capabilities: [40] MSI: Enable- Count=1/4 Maskable- 64bit+
        Capabilities: [50] MSI-X: Enable+ Count=4 Masked-
        Capabilities: [5c] Power Management version 3
        Capabilities: [64] Express Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [148] Multicast
        Capabilities: [178] Device Serial Number 50-0e-00-4a-00-00-00-01
        Capabilities: [7f8] Vendor Specific Information: ID=ffff Rev=1 Len=808 <?>
        Kernel driver in use: switchtec
        Kernel modules: switchtec

In ntb_perf.c, I can find the dma_map_resource call:

dma_map_resource(pthr->dma_chan->device->dev,
				 peer->out_phys_addr, peer->outbuf_size,
				 DMA_FROM_DEVICE, 0);

According to my debugging, out_phys_addr is 0x2740400000 and outbuf_size is 0x400000.
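As a quick sanity check on these numbers, the sketch below tests whether the out-buffer window lies entirely inside one of the BARs reported by lspci. It is plain userspace C, not driver code: the struct and function names are hypothetical, and the constants are simply copied from the output above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type: one PCI BAR as reported by lspci. */
struct bar_window {
	uint64_t base;	/* start address of the BAR */
	uint64_t size;	/* length in bytes */
};

/* Return true if [addr, addr + len) lies entirely inside the BAR. */
static bool bar_contains(const struct bar_window *bar, uint64_t addr,
			 uint64_t len)
{
	assert(bar);
	if (len == 0 || addr < bar->base)
		return false;
	/* Compare offsets instead of end addresses to avoid overflow. */
	uint64_t off = addr - bar->base;
	return off < bar->size && len <= bar->size - off;
}

/* Values from this thread: BAR0 is 4 MB, BAR2 is 64 MB. */
static const struct bar_window orin_bar0 = { 0x2744000000ULL, 4ULL << 20 };
static const struct bar_window orin_bar2 = { 0x2740000000ULL, 64ULL << 20 };
```

With these values, the 4 MB out buffer at 0x2740400000 sits inside BAR2 at offset 0x400000, and the address 0x2740400400 from the mc-err lines falls inside that same window.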

BTW, how can I check the actual source and destination address of DMA?

You should be able to get the IOVA by using dma_alloc_coherent(), dma_map_single(), etc.
What do you mean by the actual address? Do you mean the address actually used by the DMA engine rather than the one seen by the upper-level API?

In ntb_perf.c,

peer->inbuf = dma_alloc_coherent(&perf->ntb->pdev->dev,
					 peer->inbuf_size, &peer->inbuf_xlat,
					 GFP_KERNEL);

and I get an inbuf address like 0xffff8000209ff000.
The ntb_perf info is shown below:

d@d-desktop:/sys/kernel/debug/ntb_perf/0005:01:00.0$ sudo cat info
    Performance measuring tool info:

Local port 0, Global index 0
Test status: idle
Port 0 (0), Global index 0:
        Link status: up
        Out buffer addr 0xffff800022400000
        Out buff phys addr 0x0000002740400000[p]
        Out buffer size 0x0000000000400000
        Out buffer xlat 0x0000000000000000[p]
        In buffer addr 0xffff8000209ff000
        In buffer size 0x0000000000400000
        In buffer xlat 0x00000000ff800000[p]

Addition:
I get the debug message shown below after this call:

	peer->dma_dst_addr =
		dma_map_resource(pthr->dma_chan->device->dev,
				 peer->out_phys_addr, peer->outbuf_size,
				 DMA_FROM_DEVICE, 0);
tegra-gpcdma 2600000.gpcdma: 0: Map MMIO 0x0000002740400000 to DMA addr 0x00000000ff800000

Does this mean the DMA mapping has already been set up correctly?

Hi,

According to the mc-err, the issue is that your switch is writing to the wrong address.

Out buff phys addr is the BAR MMIO address (0x2740400000).
In buffer xlat is the input buffer owned by Tegra, at IOVA 0x00000000ff800000.

In this case, Tegra should get 0x00000000ff800000 from the switch, not 0x2740400400.

Please check this on the switch first.

Regarding your suggestion, I’m not sure whether the address mapping is right, but I would like to show what I observe below.

This is my code in ntb_perf.c:

	printk("Before dma_map_page src=%pK\n", src);
	unmap->addr[0] = dma_map_page(dma_dev, virt_to_page(src),
		offset_in_page(src), len, DMA_TO_DEVICE);
	printk("Before dma_mapping_error\n");
	if (dma_mapping_error(dma_dev, unmap->addr[0])) {
		ret = -EIO;
		goto err_free_resource;
	}
	unmap->to_cnt = 1;

	printk("Before dmaengine_prep_dma_memcpy: dst_dma_addr=%llx, unmap->addr[0]=%llx\n", 
		dst_dma_addr, unmap->addr[0]);
	do {
		tx = dmaengine_prep_dma_memcpy(pthr->dma_chan, dst_dma_addr,
			unmap->addr[0], len, DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
		if (!tx)
			msleep(DMA_MDELAY);
	} while (!tx && (try++ < DMA_TRIES));

And below is the dmesg info:

[  163.672435] tegra-gpcdma 2600000.gpcdma: 0: Map MMIO 0x0000002740400000 to DMA addr 0x00000000ff800000
[  163.672441] Before perf_copy_chunk: flt_dst=ffff800022400000
[  163.672445] flt_src=ffff7c853a400000
[  163.672449] chunk_size=80000
[  163.672458] Before dma_map_page src=ffff7c853a400000
[  163.672474] Before dma_mapping_error
[  163.672476] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ff800000, unmap->addr[0]=ff780000
[  163.672491] Before dma_map_page src=ffff7c853a480000
[  163.672505] Before dma_mapping_error
[  163.672507] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ff880000, unmap->addr[0]=ff700000
[  163.672510] Before dma_map_page src=ffff7c853a500000
[  163.672522] Before dma_mapping_error
[  163.672524] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ff900000, unmap->addr[0]=ff680000
[  163.672528] Before dma_map_page src=ffff7c853a580000
[  163.672542] Before dma_mapping_error
[  163.672543] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ff980000, unmap->addr[0]=ff600000
[  163.672547] Before dma_map_page src=ffff7c853a600000
[  163.672562] Before dma_mapping_error
[  163.672564] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ffa00000, unmap->addr[0]=ff580000
[  163.672566] Before dma_map_page src=ffff7c853a680000
[  163.672581] Before dma_mapping_error
[  163.672582] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ffa80000, unmap->addr[0]=ff500000
[  163.672586] Before dma_map_page src=ffff7c853a700000
[  163.672599] Before dma_mapping_error
[  163.672600] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ffb00000, unmap->addr[0]=ff480000
[  163.672608] Before dma_map_page src=ffff7c853a780000
[  163.672622] Before dma_mapping_error
[  163.672623] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ffb80000, unmap->addr[0]=ff400000
[  163.672628] Before dma_map_page src=ffff7c853a400000
[  163.672643] Before dma_mapping_error
[  163.672645] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ff800000, unmap->addr[0]=ff380000
[  163.672648] Before dma_map_page src=ffff7c853a480000
[  163.672661] Before dma_mapping_error
[  163.672663] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ff880000, unmap->addr[0]=ff300000
[  163.672666] Before dma_map_page src=ffff7c853a500000
[  163.672679] Before dma_mapping_error
[  163.672681] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ff900000, unmap->addr[0]=ff280000
[  163.672684] Before dma_map_page src=ffff7c853a580000
[  163.672698] Before dma_mapping_error
[  163.672700] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ff980000, unmap->addr[0]=ff200000
[  163.672702] Before dma_map_page src=ffff7c853a600000
[  163.672716] Before dma_mapping_error
[  163.672717] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ffa00000, unmap->addr[0]=ff180000
[  163.672720] Before dma_map_page src=ffff7c853a680000
[  163.672734] Before dma_mapping_error
[  163.672735] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ffa80000, unmap->addr[0]=ff100000
[  163.672737] Before dma_map_page src=ffff7c853a700000
[  163.672751] Before dma_mapping_error
[  163.672752] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ffb00000, unmap->addr[0]=ff080000
[  163.672756] Before dma_map_page src=ffff7c853a780000
[  163.672769] Before dma_mapping_error
[  163.672770] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ffb80000, unmap->addr[0]=ff000000
[  163.672773] Before dma_map_page src=ffff7c853a400000
[  163.672788] Before dma_mapping_error
[  163.672789] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ff800000, unmap->addr[0]=fef80000
[  163.672791] Before dma_map_page src=ffff7c853a480000
[  163.672805] Before dma_mapping_error
[  163.672806] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ff880000, unmap->addr[0]=fef00000
[  163.672808] Before dma_map_page src=ffff7c853a500000
[  163.672821] Before dma_mapping_error
[  163.672822] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ff900000, unmap->addr[0]=fee80000
[  163.672826] Before dma_map_page src=ffff7c853a580000
[  163.672839] Before dma_mapping_error
[  163.672840] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ff980000, unmap->addr[0]=fee00000
[  163.672842] Before dma_map_page src=ffff7c853a600000
[  163.672857] Before dma_mapping_error
[  163.672858] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ffa00000, unmap->addr[0]=fed80000
[  163.672860] Before dma_map_page src=ffff7c853a680000
[  163.672873] Before dma_mapping_error
[  163.672875] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ffa80000, unmap->addr[0]=fed00000
[  163.672877] Before dma_map_page src=ffff7c853a700000
[  163.672890] Before dma_mapping_error
[  163.672891] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ffb00000, unmap->addr[0]=fec80000
[  163.672894] Before dma_map_page src=ffff7c853a780000
[  163.672908] Before dma_mapping_error
[  163.672910] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ffb80000, unmap->addr[0]=fec00000
[  163.672912] Before dma_map_page src=ffff7c853a400000
[  163.672926] Before dma_mapping_error
[  163.672927] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ff800000, unmap->addr[0]=feb80000
[  163.672930] Before dma_map_page src=ffff7c853a480000
[  163.672944] Before dma_mapping_error
[  163.672945] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ff880000, unmap->addr[0]=feb00000
[  163.672947] Before dma_map_page src=ffff7c853a500000
[  163.672961] Before dma_mapping_error
[  163.672963] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ff900000, unmap->addr[0]=fea80000
[  163.672966] Before dma_map_page src=ffff7c853a580000
[  163.672979] Before dma_mapping_error
[  163.672980] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ff980000, unmap->addr[0]=fea00000
[  163.672983] Before dma_map_page src=ffff7c853a600000
[  163.672999] Before dma_mapping_error
[  163.673001] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ffa00000, unmap->addr[0]=fe980000
[  163.673004] Before dma_map_page src=ffff7c853a680000
[  163.673020] Before dma_mapping_error
[  163.673021] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ffa80000, unmap->addr[0]=fe900000
[  163.673023] Before dma_map_page src=ffff7c853a700000
[  163.673036] Before dma_mapping_error
[  163.673037] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ffb00000, unmap->addr[0]=fe880000
[  163.673040] Before dma_map_page src=ffff7c853a780000
[  163.673054] Before dma_mapping_error
[  163.673055] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ffb80000, unmap->addr[0]=fe800000
[  163.811067] irq 14: nobody cared (try booting with the "irqpoll" option)
[  163.811266] CPU: 0 PID: 81 Comm: irq/14-mc_statu Tainted: G           O      5.10.65-tegra #12
[  163.811269] Hardware name:  /, BIOS v1.1.2-0165e807 03/15/2022
[  163.811271] Call trace:
[  163.811282]  dump_backtrace+0x0/0x1a0
[  163.811285]  show_stack+0x2c/0x40
[  163.811294]  dump_stack+0xd8/0x138
[  163.811296]  __report_bad_irq+0x54/0xe0
[  163.811300]  note_interrupt+0x2d4/0x3a0
[  163.811305]  handle_irq_event_percpu+0x8c/0xa0
[  163.811308]  handle_irq_event+0x4c/0xf0
[  163.811310]  handle_fasteoi_irq+0xbc/0x170
[  163.811313]  generic_handle_irq+0x3c/0x60
[  163.811315]  __handle_domain_irq+0x6c/0xc0
[  163.811317]  gic_handle_irq+0x64/0x130
[  163.811319]  el1_irq+0xd0/0x180
[  163.811358]  recv_func_prehandle+0x0/0x90 [rtl8822ce]
[  163.811385]  rtw_recv_entry+0x28/0x64 [rtl8822ce]
[  163.811407]  pre_recv_entry+0xfc/0x150 [rtl8822ce]
[  163.811428]  rtl8822ce_tx_isr+0x41c/0x73c [rtl8822ce]
[  163.811433]  tasklet_action_common.isra.0+0x15c/0x180
[  163.811435]  tasklet_hi_action+0x2c/0x40
[  163.811437]  __do_softirq+0x138/0x3e0
[  163.811440]  irq_exit+0xc0/0xe0
[  163.811443]  __handle_domain_irq+0x70/0xc0
[  163.811444]  gic_handle_irq+0x64/0x130
[  163.811446]  el1_irq+0xd0/0x180
[  163.811450]  _raw_spin_unlock_irqrestore+0x5c/0x70
[  163.811454]  of_find_property+0x5c/0x80
[  163.811457]  is_tegra_safety_build+0x28/0x40
[  163.811459]  log_mcerr_fault+0x20/0xbb0
[  163.811461]  tegra_mcerr_thread+0xd0/0x110
[  163.811464]  irq_thread_fn+0x30/0xa0
[  163.811466]  irq_thread+0x150/0x250
[  163.811469]  kthread+0x148/0x170
[  163.811471]  ret_from_fork+0x10/0x18
[  163.811472] handlers:
[  163.811536] [<000000007abe6ee9>] tegra_mcerr_hard_irq threaded [<000000002d3d6296>] tegra_mcerr_thread
[  163.811791] Disabling IRQ #14
[  163.812056] mc-err: (255) csw_axisw: EMEM address decode error
[  163.812232] mc-err:   status = 0x2001008d; hi_addr_reg = 0x00000027 addr = 0x2740400400
[  163.812477] mc-err:   secure: no, access-type: write
[  163.812637] mc-err: (255) csw_axisw: EMEM address decode error
......

It seems that the DMA transfer has been started correctly.
However, every time I trigger the transfer, the source addresses only cover the range from 0xff800000 (the log shows 0xff780000 because my chunk size is 0x80000) down to 0xfe800000.

Why does this happen? Is there anything wrong with dma_map_page?

unmap->addr[0] = dma_map_page(dma_dev, virt_to_page(src),
		offset_in_page(src), len, DMA_TO_DEVICE);
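The address pattern in the log above can be reproduced with simple arithmetic. Across the 32 logged iterations, the destination walks forward through the 4 MB out buffer in 0x80000 steps and wraps every 8 chunks, while each new dma_map_page() result is one chunk lower than the previous one. The sketch below models only that observed pattern. It is userspace C with hypothetical names, and the "one chunk lower per mapping" behaviour is inferred from the log, not confirmed from the IOVA allocator.

```c
#include <assert.h>
#include <stdint.h>

#define CHUNK_SIZE   0x80000ULL     /* chunk_size printed in the log */
#define OUTBUF_SIZE  0x400000ULL    /* "Out buffer size" from the info file */
#define DST_BASE     0xff800000ULL  /* IOVA returned by dma_map_resource() */
#define FIRST_SRC    0xff780000ULL  /* first dma_map_page() result in the log */

/* Destination IOVA of the i-th chunk: wraps at the end of the out buffer. */
static uint64_t dst_iova(unsigned int i)
{
	uint64_t chunks_per_buf = OUTBUF_SIZE / CHUNK_SIZE;

	return DST_BASE + (i % chunks_per_buf) * CHUNK_SIZE;
}

/*
 * Source IOVA of the i-th mapping, assuming (from the log only) that each
 * mapping lands exactly one chunk below the previous one.
 */
static uint64_t src_iova(unsigned int i)
{
	assert(i <= FIRST_SRC / CHUNK_SIZE);	/* stay above address zero */
	return FIRST_SRC - (uint64_t)i * CHUNK_SIZE;
}
```

Under this model the 32nd mapping (index 31) lands at 0xfe800000, the last source address in the log. That suggests the 0xff780000 to 0xfe800000 range simply reflects 32 mappings of 0x80000 bytes each, not a limit inside dma_map_page().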

Add:

In my driver code

tx = dmaengine_prep_dma_memcpy(pthr->dma_chan, dst_dma_addr,
			unmap->addr[0], len, DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
  1. dst_dma_addr comes from:
peer->dma_dst_addr =
		dma_map_resource(pthr->dma_chan->device->dev,
				 peer->out_phys_addr, peer->outbuf_size,
				 DMA_FROM_DEVICE, 0);

which is located at 0xFF800000:

[  104.596099] After dma_map_resource: peer->out_phys_addr = 168581660672, peer->outbuf_size = 4194304, peer->dma_dst_addr = 4286578688

  2. unmap->addr[0] comes from:
	unmap->addr[0] = dma_map_page(dma_dev, virt_to_page(src),
		offset_in_page(src), len, DMA_TO_DEVICE);

the src here comes from:

	pthr->src = kmalloc_node(perf->test_peer->outbuf_size, GFP_KERNEL,
				 dev_to_node(&perf->ntb->dev));

unmap->addr[0] is located at 0xff780000 for the first transfer:

[  104.596131] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ff800000, unmap->addr[0]=ff780000
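The "After dma_map_resource" line above prints its values in decimal, which makes them hard to compare with the hexadecimal addresses used elsewhere in this thread. The small userspace helper below (a sketch with a hypothetical name) converts such a decimal string so it can be compared directly.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a decimal number copied from a log line into a 64-bit value. */
static uint64_t parse_dec(const char *text)
{
	char *end = NULL;
	unsigned long long value;

	assert(text != NULL);
	value = strtoull(text, &end, 10);
	/* Require at least one digit and nothing after the number. */
	assert(end != text && *end == '\0');
	return (uint64_t)value;
}
```

For the values in that log line, 168581660672 is 0x2740400000, 4194304 is 0x400000, and 4286578688 is 0xff800000, so they agree with the ntb_perf info output.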

In conclusion, I think the mapping route in my driver is right:

src buffer ==dma map== addr[0] ----transfer---- dst_dma_addr ==dma map== out_phys_addr(0x2740400000)

By the way, maybe the DMA engine cannot access 64-bit addresses?

Does this error always happen after this?
[ 163.673055] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ffb80000, unmap->addr[0]=fe800000

Yes, always. But the specific address depends on my initialization.

Update:

I have changed

	peer->dma_dst_addr =
		dma_map_resource(pthr->dma_chan->device->dev,
				 peer->out_phys_addr, peer->outbuf_size,
				 DMA_FROM_DEVICE, 0);

to

	peer->dma_dst_addr =
		dma_map_resource(pthr->dma_chan->device->dev,
				 peer->out_phys_addr, peer->outbuf_size,
				 DMA_TO_DEVICE, 0);

and I got something new in dmesg:

[   97.891924] tegra-gpcdma 2600000.gpcdma: 0: Map MMIO 0x0000002740400000 to DMA addr 0x00000000ff800000
[   97.891934] Before perf_copy_chunk: flt_dst=ffff800022400000
[   97.891937] flt_src=ffff7d86b6000000
[   97.891939] chunk_size=80000
[   97.891946] Before dma_map_page src=ffff7d86b6000000
[   97.891966] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ff800000, unmap->addr[0]=ff780000
[   97.892000] Before dma_map_page src=ffff7d86b6080000
[   97.892011] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ff880000, unmap->addr[0]=ff700000
[   97.892018] Before dma_map_page src=ffff7d86b6100000
[   97.892033] arm-smmu 12000000.iommu: Unhandled context fault: fsr=0x80000408, iova=0xff800000, fsynr=0x360013, cbfrsynra=0x804, cb=0
[   97.892036] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ff900000, unmap->addr[0]=ff680000
[   97.892048] Before dma_map_page src=ffff7d86b6180000
[   97.892424] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ff980000, unmap->addr[0]=ff600000
[   97.892432] Before dma_map_page src=ffff7d86b6200000
[   97.892450] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ffa00000, unmap->addr[0]=ff580000
[   97.892455] Before dma_map_page src=ffff7d86b6280000
[   97.892465] arm-smmu 12000000.iommu: Unhandled context fault: fsr=0x80000408, iova=0xff801000, fsynr=0x1d0013, cbfrsynra=0x804, cb=0
[   97.892468] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ffa80000, unmap->addr[0]=ff500000
[   97.892473] Before dma_map_page src=ffff7d86b6300000
[   97.892836] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ffb00000, unmap->addr[0]=ff480000
[   97.892840] Before dma_map_page src=ffff7d86b6380000
[   97.892849] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ffb80000, unmap->addr[0]=ff400000
[   97.892858] Before dma_map_page src=ffff7d86b6000000
[   97.892872] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ff800000, unmap->addr[0]=ff380000
[   97.892878] arm-smmu 12000000.iommu: Unhandled context fault: fsr=0x80000408, iova=0xff807000, fsynr=0x360013, cbfrsynra=0x404, cb=0
[   97.892882] Before dma_map_page src=ffff7d86b6080000
[   97.893250] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ff880000, unmap->addr[0]=ff300000
[   97.893257] Before dma_map_page src=ffff7d86b6100000
[   97.893267] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ff900000, unmap->addr[0]=ff280000
[   97.893275] Before dma_map_page src=ffff7d86b6180000
[   97.893285] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ff980000, unmap->addr[0]=ff200000
[   97.893290] arm-smmu 12000000.iommu: Unhandled context fault: fsr=0x80000408, iova=0xff80d000, fsynr=0x1d0013, cbfrsynra=0xc04, cb=0
[   97.893294] Before dma_map_page src=ffff7d86b6200000
[   97.893636] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ffa00000, unmap->addr[0]=ff180000
[   97.893643] Before dma_map_page src=ffff7d86b6280000
[   97.893653] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ffa80000, unmap->addr[0]=ff100000
[   97.893656] Before dma_map_page src=ffff7d86b6300000
[   97.893665] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ffb00000, unmap->addr[0]=ff080000
[   97.893669] Before dma_map_page src=ffff7d86b6380000
[   97.893674] arm-smmu 12000000.iommu: Unhandled context fault: fsr=0x80000408, iova=0xff813000, fsynr=0x360013, cbfrsynra=0xc04, cb=0
[   97.893680] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ffb80000, unmap->addr[0]=ff000000
[   97.894005] Before dma_map_page src=ffff7d86b6000000
[   97.894018] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ff800000, unmap->addr[0]=fef80000
[   97.894021] Before dma_map_page src=ffff7d86b6080000
[   97.894033] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ff880000, unmap->addr[0]=fef00000
[   97.894044] Before dma_map_page src=ffff7d86b6100000
[   97.894051] arm-smmu 12000000.iommu: Unhandled context fault: fsr=0x80000408, iova=0xff819000, fsynr=0x1d0013, cbfrsynra=0x404, cb=0
[   97.894056] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ff900000, unmap->addr[0]=fee80000
[   97.894384] Before dma_map_page src=ffff7d86b6180000
[   97.894395] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ff980000, unmap->addr[0]=fee00000
[   97.894401] Before dma_map_page src=ffff7d86b6200000
[   97.894414] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ffa00000, unmap->addr[0]=fed80000
[   97.894420] Before dma_map_page src=ffff7d86b6280000
[   97.894429] arm-smmu 12000000.iommu: Unhandled context fault: fsr=0x80000408, iova=0xff81f000, fsynr=0x360013, cbfrsynra=0x804, cb=0
[   97.894432] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ffa80000, unmap->addr[0]=fed00000
[   97.894437] Before dma_map_page src=ffff7d86b6300000
[   97.895831] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ffb00000, unmap->addr[0]=fec80000
[   97.895836] Before dma_map_page src=ffff7d86b6380000
[   97.895846] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ffb80000, unmap->addr[0]=fec00000
[   97.895851] arm-smmu 12000000.iommu: Unhandled context fault: fsr=0x80000408, iova=0xff823680, fsynr=0x1d0013, cbfrsynra=0x404, cb=0
[   97.895854] Before dma_map_page src=ffff7d86b6000000
[   97.895865] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ff800000, unmap->addr[0]=feb80000
[   97.897568] Before dma_map_page src=ffff7d86b6080000
[   97.897577] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ff880000, unmap->addr[0]=feb00000
[   97.897580] Before dma_map_page src=ffff7d86b6100000
[   97.897589] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ff900000, unmap->addr[0]=fea80000
[   97.897592] Before dma_map_page src=ffff7d86b6180000
[   97.897602] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ff980000, unmap->addr[0]=fea00000
[   97.897604] Before dma_map_page src=ffff7d86b6200000
[   97.897610] arm-smmu 12000000.iommu: Unhandled context fault: fsr=0x80000408, iova=0xff83e100, fsynr=0x360013, cbfrsynra=0x404, cb=0
[   97.897615] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ffa00000, unmap->addr[0]=fe980000
[   97.908201] Before dma_map_page src=ffff7d86b6280000
[   97.908211] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ffa80000, unmap->addr[0]=fe900000
[   97.908214] Before dma_map_page src=ffff7d86b6300000
[   97.908223] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ffb00000, unmap->addr[0]=fe880000
[   97.908226] Before dma_map_page src=ffff7d86b6380000
[   97.908234] arm-smmu 12000000.iommu: Unhandled context fault: fsr=0x80000408, iova=0xff879000, fsynr=0x1d0013, cbfrsynra=0x4, cb=0
[   97.908237] Before dmaengine_prep_dma_memcpy: dst_dma_addr=ffb80000, unmap->addr[0]=fe800000
[   98.041969] irq 14: nobody cared (try booting with the "irqpoll" option)
[   98.042161] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G           O      5.10.65-tegra #12
[   98.042163] Hardware name:  /, BIOS v1.1.2-0165e807 03/15/2022
[   98.042166] Call trace:
[   98.042177]  dump_backtrace+0x0/0x1a0
[   98.042180]  show_stack+0x2c/0x40
[   98.042191]  dump_stack+0xd8/0x138
[   98.042193]  __report_bad_irq+0x54/0xe0
[   98.042198]  note_interrupt+0x2d4/0x3a0
[   98.042204]  handle_irq_event_percpu+0x8c/0xa0
[   98.042206]  handle_irq_event+0x4c/0xf0
[   98.042208]  handle_fasteoi_irq+0xbc/0x170
[   98.042211]  generic_handle_irq+0x3c/0x60
[   98.042213]  __handle_domain_irq+0x6c/0xc0
[   98.042215]  gic_handle_irq+0x64/0x130
[   98.042216]  el1_irq+0xd0/0x180
[   98.042219]  skip_ftrace_call+0x0/0x30
[   98.042222]  __rcu_read_unlock+0x14/0x190
[   98.042226]  sk_filter_trim_cap+0xd0/0x250
[   98.042229]  tcp_v4_rcv+0xa8c/0xc90
[   98.042232]  ip_protocol_deliver_rcu+0x44/0x200
[   98.042234]  ip_local_deliver_finish+0x68/0x80
[   98.042236]  ip_local_deliver+0x80/0x130
[   98.042237]  ip_rcv_finish+0x94/0xb0
[   98.042239]  ip_rcv+0x64/0x110
[   98.042242]  __netif_receive_skb_one_core+0x64/0x90
[   98.042243]  __netif_receive_skb+0x28/0x70
[   98.042245]  netif_receive_skb+0x40/0x1f0
[   98.042248]  br_netif_receive_skb+0x3c/0x60
[   98.042250]  br_pass_frame_up+0xd8/0x190
[   98.042252]  br_handle_frame_finish+0x2b0/0x420
[   98.042254]  br_handle_frame+0x238/0x380
[   98.042255]  __netif_receive_skb_core+0x580/0xe50
[   98.042257]  __netif_receive_skb_one_core+0x48/0x90
[   98.042259]  __netif_receive_skb+0x28/0x70
[   98.042260]  process_backlog+0xbc/0x1a0
[   98.042261]  net_rx_action+0x120/0x430
[   98.042263]  __do_softirq+0x138/0x3e0
[   98.042266]  irq_exit+0xc0/0xe0
[   98.042268]  __handle_domain_irq+0x70/0xc0
[   98.042270]  gic_handle_irq+0x64/0x130
[   98.042271]  el1_irq+0xd0/0x180
[   98.042274]  cpuidle_enter_state+0xb4/0x400
[   98.042275]  cpuidle_enter+0x3c/0x50
[   98.042279]  call_cpuidle+0x40/0x70
[   98.042280]  do_idle+0x1fc/0x260
[   98.042281]  cpu_startup_entry+0x2c/0x70
[   98.042284]  rest_init+0xd8/0xe4
[   98.042288]  arch_call_rest_init+0x14/0x1c
[   98.042290]  start_kernel+0x4c0/0x4f4
[   98.042291] handlers:
[   98.042358] [<0000000088f5d128>] tegra_mcerr_hard_irq threaded [<00000000d68a947a>] tegra_mcerr_thread
[   98.042614] Disabling IRQ #14
[   98.042801] mc-err: vpr base=0:d6000000, size=2a0, ctrl=1, override:(201803c6, b9ee11c1, 1, 0)
[   98.043079] mc-err: (255) csw_axisw: MC request violates VPR requirements

Message from syslogd@d-desktop at Aug 24 17:09:47 ...
 kernel:[   98.042614] Disabling IRQ #14
[   98.043272] mc-err:   status = 0x0ff7408d; hi_addr_reg = 0x00000000 addr = 0xffffffff00
[   98.043483] mc-err:   secure: yes, access-type: write
[   98.146569] mc-err: mcerr: unknown intr source intstatus = 0x00000000, intstatus_1 = 0x00000000
[   98.250622] mc-err: mcerr: unknown intr source intstatus = 0x00000000, intstatus_1 = 0x00000000
[   98.354694] mc-err: mcerr: unknown intr source intstatus = 0x00000000, intstatus_1 = 0x00000000
[   98.458722] mc-err: Too many MC errors; throttling prints
[  242.653317] INFO: task kworker/u24:0:7 blocked for more than 120 seconds.
[  242.653643]       Tainted: G           O      5.10.65-tegra #12
[  242.653870] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  242.654152] task:kworker/u24:0   state:D stack:    0 pid:    7 ppid:     2 flags:0x00000028
[  242.654215] Workqueue: perf_wq perf_thread_work [ntb_perf]
[  242.654228] Call trace:
[  242.654272]  __switch_to+0x104/0x160
[  242.654299]  __schedule+0x3d0/0x900
[  242.654302]  schedule+0x74/0x100
[  242.654307]  perf_thread_work+0x480/0x760 [ntb_perf]
[  242.654325]  process_one_work+0x1c0/0x4a0
[  242.654327]  worker_thread+0x50/0x420
[  242.654330]  kthread+0x148/0x170
[  242.654335]  ret_from_fork+0x10/0x18
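The "Unhandled context fault" lines in the log above can be decoded by hand. The sketch below is userspace C with hypothetical names. The bit positions are the ones I recall from the Linux arm-smmu driver header (arm-smmu.h), so treat them as an assumption and verify them against your kernel tree before relying on the result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions (recalled from Linux arm-smmu.h, please verify). */
#define FSR_MULTI   (1u << 31)  /* more than one fault was recorded */
#define FSR_PF      (1u << 3)   /* permission fault */
#define FSR_TF      (1u << 1)   /* translation fault (no mapping found) */
#define FSYNR0_WNR  (1u << 4)   /* faulting access was a write */

/* Hypothetical summary of one context fault line. */
struct smmu_fault {
	bool multiple;
	bool permission;
	bool translation;
	bool write;
};

/* Fill *out from the fsr= and fsynr= values printed in dmesg. */
static void decode_fault(uint32_t fsr, uint32_t fsynr0, struct smmu_fault *out)
{
	assert(out);
	out->multiple = (fsr & FSR_MULTI) != 0;
	out->permission = (fsr & FSR_PF) != 0;
	out->translation = (fsr & FSR_TF) != 0;
	out->write = (fsynr0 & FSYNR0_WNR) != 0;
}
```

If those bit positions are right, fsr=0x80000408 with fsynr=0x360013 or 0x1d0013 describes a write that hit a permission fault, not a missing translation. That would be consistent with the window at 0xff800000 having been mapped for device reads only after the change to DMA_TO_DEVICE, but that reading is an inference from the log, not something confirmed in this thread.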

Hi Wayne,

Is there any update?

Recently I found that the BAR address of my PCIe switch device, 0x2740400000, is mapped to the virtual address ffff800022400000 by devm_ioremap_wc. It seems that ioremap is based on vmalloc, which is not suitable for DMA because the physical memory backing a vmalloc allocation is not contiguous.

Could that be the reason?

Hi,

Could you share your full code instead of only snippets? Also, please point out which line is causing the error.

It is hard to check your situation by reading just a code snippet.

Does the error happen when you write from the Jetson to the switch, or when the data goes from the switch to the Jetson?

Hi Wayne,

Thanks for the reply.

Here is the full code of ntb_perf.c and ntb_hw_switchtec.c; both can also be found in JetPack 5.0.1.
ntb_perf.c (45.9 KB)
ntb_hw_switchtec.c (47.0 KB)

At line 796 of ntb_perf.c, if I don’t use DMA, copying data into the BAR address through memcpy works fine. If I enable DMA, I’m not sure which line causes the error: ntb_perf.c itself does not report any error, but the kernel gets stuck.

Hi,

The picture below shows the overall structure of this setup.

DMA should target the BAR when transferring from RP → EP, and the same holds for RP1 <- NTB switch -> RP2. Here, the address that caused the SMMU fault, 0x0000002740400000, falls within the 64 MB of BAR2 (0x2740000000-0x2744000000).

In this figure you can see that Tegra needs an SMMU IOVA as the DMA address. This conversion from BAR to IOVA should be done by the switch.

Tegra should get 0x00000000ff800000 from the switch, not 0x2740400400.

Your previous error shows 0x2740400400, which means something goes wrong when the switch writes back.

[  150.540694] mc-err: (255) csw_axisw: EMEM address decode error
[  150.540991] mc-err:   status = 0x2001008d; hi_addr_reg = 0x00000027 addr = 0x2740400400
[  150.541396] mc-err:   secure: no, access-type: write

Hi Wayne,

Thanks for the reply.

I understand what you said about address translation. At line 981 of the ntb_perf.c I attached in my last reply, there is:

peer->dma_dst_addr =
		dma_map_resource(pthr->dma_chan->device->dev,
				 peer->out_phys_addr, peer->outbuf_size,
				 DMA_FROM_DEVICE, 0);

And at line 885, the DMA handle passed to dmaengine_prep_dma_memcpy is based on peer->dma_dst_addr, which is exactly 0xff800000 according to the print at line 1005.

Hi Wayne,

Is there any update? I’m still confused about this problem.

I have tried the same code (ntb_perf.c and ntb_hw_switchtec.c) on two x86 PCs, and the DMA transfer works normally. The only apparent difference I can see is that the BAR address set by the host fits in 32 bits:

cat /sys/kernel/debug/ntb_perf/0000\:09\:00.1/info
    Performance measuring tool info:
 
Local port 0, Global index 0
Test status: idle
Port 0 (0), Global index 0:
        Link status: up
        Out buffer addr 0xffffb67283080000
        Out buff phys addr 0x00000000f4080000[p]
        Out buffer size 0x0000000000080000
        Out buffer xlat 0x0000000000000000[p]
        In buffer addr 0xffff8e53c6d00000
        In buffer size 0x0000000000080000
        In buffer xlat 0x0000000146d00000[p]

Orin, in contrast, has set it to 0x2740400000, which needs more than 32 bits.

Is this the cause of the problem?
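To make the 32-bit versus 64-bit comparison concrete, the sketch below counts the significant bits of an address and checks it against an n-bit mask, which is the same idea as the kernel's DMA_BIT_MASK(n). It is userspace C with hypothetical function names, and it says nothing about which mask the Tegra GPCDMA actually uses, since that is not established in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Number of significant bits in an address (0 for address 0). */
static unsigned int addr_bits(uint64_t addr)
{
	unsigned int bits = 0;

	while (addr) {
		bits++;
		addr >>= 1;
	}
	return bits;
}

/* True if addr is representable under an n-bit mask (1 <= n <= 64). */
static bool fits_mask(uint64_t addr, unsigned int mask_bits)
{
	uint64_t mask;

	assert(mask_bits >= 1 && mask_bits <= 64);
	/* Shifting a 64-bit value by 64 is undefined, so handle it apart. */
	mask = (mask_bits == 64) ? UINT64_MAX : ((1ULL << mask_bits) - 1);
	return (addr & ~mask) == 0;
}
```

By this measure the x86 BAR address 0xf4080000 needs 32 bits and the Orin BAR address 0x2740400000 needs 38. Note, however, that the address handed to the DMA engine on Orin is the IOVA 0xff800000 from the "Map MMIO" log line, which itself fits in 32 bits.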

In conclusion, the question is: how can I use DMA to transfer data to a specific physical address, such as one mapped with ioremap?