Failed to use the Linux PCIe test driver pci-epf-test on the Orin Nano platform

Hi,
I am using r35.4.1 on the Orin Nano platform. I need to use PCIe to transmit more than 10G of video data to an x86 PC, so I tested PCIe with the kernel PCIe test driver pci-epf-test (see kernel-5.10\Documentation\PCI\endpoint\pci-test-howto.rst). The pcitest commands succeed when I run pcitest on the x86 PC, and data transfers without DMA also work. With DMA enabled, however, the transfer fails with the errors below on the device:

               WRITE => Size: 102400 bytes       DMA: YES        Time: 0.040605253 seconds      Rate: 2462 KB/s
[62485.367378] arm-smmu 12000000.iommu: Unhandled context fault: fsr=0x80000402, iova=0x2140020000, fsynr=0x190011, cbfrsynra=0x404, cb=0
[62485.379905] arm-smmu 12000000.iommu: Unhandled context fault: fsr=0x80000402, iova=0xffee0000, fsynr=0x4a0003, cbfrsynra=0x404, cb=0
[62485.392351] mc-err: Too many MC errors; throttling prints
[62485.397981] 
               WRITE => Size: 102400 bytes       DMA: YES        Time: 0.030608343 seconds      Rate: 3267 KB/s
[62559.361767] arm-smmu 12000000.iommu: Unhandled context fault: fsr=0x80000402, iova=0x2140020000, fsynr=0x190011, cbfrsynra=0x404, cb=0
[62559.374296] arm-smmu 12000000.iommu: Unhandled context fault: fsr=0x80000402, iova=0xffee0000, fsynr=0x4a0003, cbfrsynra=0x404, cb=0
[62559.386747] mc-err: unknown mcerr fault, int_status=0x00001040, ch_int_status=0x00000000, hubc_int_status=0x00000000 sbs_int_status=0x00000000, hub_int_status=0x00000000
[62559.402389] 
               WRITE => Size: 102400 bytes       DMA: YES        Time: 0.040632888 seconds      Rate: 2461 KB/s
[64981.928914] tegra194-pcie 14160000.pcie_ep: LTSSM state: 0xd8 timeout: -110
[65103.166103] tegra194-pcie 14160000.pcie_ep: LTSSM state: 0xc8 timeout: -110
[65444.326018] 
               WRITE => Size: 102400 bytes       DMA: NO         Time: 0.000830611 seconds      Rate: 120393 KB/s
[65490.296965] 
               WRITE => Size: 102400 bytes       DMA: NO         Time: 0.000830739 seconds      Rate: 120374 KB/s
~~~~~~~~~~~~~~~~~~~~~
It seems the eDMA has a bug with the kernel DMA subsystem on this platform. Can anyone help me? Thanks.

I found sample code in the file pci-epf-dma-test.c. It also uses PCIe and DMA, but it does not use the kernel DMA subsystem API; it programs the registers directly! I have a question about this part of the source:
res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "atu_dma");
epfnv->dma_base = devm_ioremap(fdev, res->start + DMA_OFFSET,
                               resource_size(res) - DMA_OFFSET);
But I couldn't find where "atu_dma" is defined. Can anyone tell me where the dma_base value comes from?

Sorry for the late response. Have you managed to get the issue resolved, or do you still need support? Thanks

Thanks kayccc! I haven't resolved it yet. I edited the device tree as suggested in another ticket:
//delete by duke 2023.11.21
//iommus = <&smmu_niso0 TEGRA_SID_NISO0_PCIE4>;
//iommu-map = <0x0 &smmu_niso0 TEGRA_SID_NISO0_PCIE4 0x1000>;
//dma-coherent;
//iommu-map-mask = <0x0>;
But when I run it, it reports another error:
[ 2226.010321] pci_epf_test pci_epf_test.0: pci_epf_test_init_dma_chan dma channel id:20
[ 2433.727547] arm-smmu 12000000.iommu: Unexpected global fault, this could be serious
[ 2433.735506] arm-smmu 12000000.iommu: GFSR 0x80000002, GFSYNR0 0x00000002, GFSYNR1 0x00001013, GFSYNR2 0x00000000
[ 2433.748652] mc-err: vpr base=0:0, size=0, ctrl=1, override:(201803c6, b9ee11c1, 1, 0)
[ 2433.756761] mc-err: (255) csw_pcie4w: MC request violates VPR requirements
[ 2433.763889] mc-err: status = 0x0ff740e1; hi_addr_reg = 0x00000000 addr = 0xffffffff00
[ 2433.772130] mc-err: secure: yes, access-type: write

I also changed the config file (kernel-5.10/arch/arm64/configs/tegra_defconfig) to add CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT=y, but it reports the same error!

Where is the support? I am looking forward to it!

I ran the example pcie_ep_epf_dma_test on this platform, and the kernel crashed:
) [last unloaded: mtd]
[ 1903.689720] CPU: 0 PID: 2292 Comm: cat Tainted: G W OE 5.10.120-tegra #12
[ 1903.689721] Hardware name: Unknown NVIDIA Orin Nano Developer Kit/NVIDIA Orin Nano Developer Kit, BIOS 4.1-33958178 08/01/2023
[ 1903.689723] pstate: 60400089 (nZCv daIf +PAN -UAO -TCO BTYPE=--)
[ 1903.689725] pc : tegra234_cbb_isr+0x130/0x170
[ 1903.689727] lr : tegra234_cbb_isr+0x10c/0x170
[ 1903.689729] sp : ffff800010003e10
[ 1903.689730] x29: ffff800010003e10 x28: ffff5f6d88161d80
[ 1903.689733] x27: 0000000000000001 x26: 0000000000000080
[ 1903.689736] x25: ffffdd1d4ec5a0f8 x24: ffffdd1d4f5aadd8
[ 1903.689738] x23: ffffdd1d4ef47008 x22: 0000000000000019
[ 1903.689741] x21: ffffdd1d4f3cf1e0 x20: 0000000000000002
[ 1903.689744] x19: ffffdd1d4f3cf1d0 x18: 0000000000000010
[ 1903.689746] x17: 0000000000000000 x16: ffffdd1d4d545220
[ 1903.689749] x15: ffff5f6d881622f0 x14: ffffffffffffffff
[ 1903.689752] x13: ffff800090003917 x12: ffff80001000391f
[ 1903.689754] x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f
[ 1903.689757] x9 : ffff800010003c30 x8 : 2a2a2a2a2a2a2a2a
[ 1903.689760] x7 : 2a2a2a2a2a2a2a09 x6 : c0000000ffffefff
[ 1903.689762] x5 : ffff5f6eee7b7958 x4 : ffffdd1d4f257a48
[ 1903.689765] x3 : 0000000000000000 x2 : ffffdd1d4d6e0d70
[ 1903.689769] x1 : ffff5f6d88161d80 x0 : 0000000100010000
[ 1903.689771] Call trace:
[ 1903.689775] tegra234_cbb_isr+0x130/0x170
[ 1903.689779] __handle_irq_event_percpu+0x68/0x2a0
[ 1903.689782] handle_irq_event_percpu+0x40/0xa0
[ 1903.689784] handle_irq_event+0x50/0xf0
[ 1903.689786] handle_fasteoi_irq+0xc0/0x170
[ 1903.689788] generic_handle_irq+0x40/0x60
[ 1903.689791] __handle_domain_irq+0x70/0xd0
[ 1903.689793] gic_handle_irq+0x68/0x134
[ 1903.689794] el1_irq+0xd0/0x180
[ 1903.689799] stress_test+0x1f0/0xb10 [pci_epf_dma_test]
[ 1903.689802] seq_read_iter+0x1d4/0x490
[ 1903.689804] seq_read+0xf4/0x150
[ 1903.689806] full_proxy_read+0x6c/0xa0
[ 1903.689808] vfs_read+0xb4/0x1c0
[ 1903.689810] ksys_read+0x7c/0x110
[ 1903.689812] __arm64_sys_read+0x28/0x40
[ 1903.689815] el0_svc_common.constprop.0+0x80/0x1d0
[ 1903.689817] do_el0_svc+0x38/0xb0
[ 1903.689819] el0_svc+0x1c/0x30
[ 1903.689821] el0_sync_handler+0xa8/0xb0
[ 1903.689823] el0_sync+0x16c/0x180
[ 1903.689824] ---[ end trace 2c56e921df5b9794 ]---
[ 1903.694664] CPU:0, Error: cbb-fabric@0x13a00000, irq=25
[ 1903.700026] **************************************
[ 1903.704945] CPU:0, Error:cbb-fabric, Errmon:2
[ 1903.709421] Error Code : PWRDOWN_ERR
[ 1903.713443] Overflow : Multiple PWRDOWN_ERR

[ 1903.719632] Error Code : PWRDOWN_ERR
[ 1903.723653] MASTER_ID : CCPLEX
[ 1903.727141] Address : 0x36060308
[ 1903.730813] Cache : 0x1 -- Bufferable
[ 1903.735104] Protection : 0x2 -- Unprivileged, Non-Secure, Data Access
[ 1903.742072] Access_Type : Write
[ 1903.745656] Access_ID : 0x0
[ 1903.745658] Fabric : cbb-fabric
[ 1903.752462] Slave_Id : 0xb
[ 1903.755594] Burst_length : 0x0
[ 1903.759084] Burst_type : 0x1
[ 1903.762399] Beat_size : 0x2
[ 1903.765618] VQC : 0x0
[ 1903.768394] GRPSEC : 0x7e
[ 1903.771444] FALCONSEC : 0x0
[ 1903.774668] **************************************
[ 1903.779691] ------------[ cut here ]------------
[ 1903.779696] WARNING: CPU: 0 PID: 2292 at drivers/soc/tegra/cbb/tegra234-cbb.c:577 tegra234_cbb_isr+0x130/0x170
[ 1903.789975] Modules linked in: pci_epf_dma_test(E) fuse(E) nvidia_modeset(OE) lzo_rle(E) lzo_compress(E) zram(E) ramoops(E) reed_solomon(E) bnep(E) loop(E) aes_ce_blk(E) crypto_simd(E) cryptd(E) snd_soc_tegra186_asrc(E) aes_ce_cipher(E) rtk_btusb(E) snd_soc_tegra186_dspk(E) ghash_ce(E) snd_soc_tegra210_ope(E) snd_soc_tegra186_arad(E) snd_soc_tegra210_iqc(E) snd_soc_tegra210_mvc(E) sha2_ce(E) btusb(E) snd_soc_tegra210_adx(E) sha256_arm64(E) snd_soc_tegra210_afc(E) snd_soc_tegra210_dmic(E) snd_soc_tegra210_amx(E) rtl8822ce(E) snd_soc_tegra210_adsp(E) snd_soc_tegra210_admaif(E) btrtl(E) sha1_ce(E) snd_soc_tegra210_sfc(E) snd_soc_tegra210_mixer(E) snd_soc_tegra210_i2s(E) btbcm(E) snd_soc_tegra_pcm(E) snd_soc_tegra_machine_driver(E) snd_soc_tegra_utils(E) btintel(E) snd_soc_simple_card_utils(E) snd_soc_spdif_tx(E) snd_hda_codec_hdmi(E) nvadsp(E) fusb301(E) r8168(E) nv_imx219(E) cfg80211(E) snd_soc_tegra210_ahub(E) tegra_bpmp_thermal(E) userspace_alert(E) snd_hda_tegra(E) snd_hda_codec(E)
[ 1903.790024] tegra210_adma(E) snd_hda_core(E) nvidia(OE) spi_tegra114(E) binfmt_misc(E) ina3221(E) pwm_fan(E) nvgpu(E) nvmap(E) ip_tables(E) x_tables(E) [last unloaded: mtd]
[ 1903.790037] CPU: 0 PID: 2292 Comm: cat Tainted: G W OE 5.10.120-tegra #12
[ 1903.790038] Hardware name: Unknown NVIDIA Orin Nano Developer Kit/NVIDIA Orin Nano Developer Kit, BIOS 4.1-33958178 08/01/2023
[ 1903.790041] pstate: 60400089 (nZCv daIf +PAN -UAO -TCO BTYPE=--)
[ 1903.790043] pc : tegra234_cbb_isr+0x130/0x170
[ 1903.790045] lr : tegra234_cbb_isr+0x10c/0x170
[ 1903.790046] sp : ffff800010003e10
[ 1903.790047] x29: ffff800010003e10 x28: ffff5f6d88161d80
[ 1903.790050] x27: 0000000000000001 x26: 0000000000000080
[ 1903.790053] x25: ffffdd1d4ec5a0f8 x24: ffffdd1d4f5aadd8
[ 1903.790056] x23: ffffdd1d4ef47008 x22: 0000000000000019
[ 1903.790058] x21: ffffdd1d4f3cf1e0 x20: 0000000000000002
[ 1903.790062] x19: ffffdd1d4f3cf1d0 x18: 0000000000000010
[ 1903.790065] x17: 0000000000000000 x16: ffffdd1d4d545220
[ 1903.790068] x15: ffff5f6d881622f0 x14: ffffffffffffffff
[ 1903.790070] x13: ffff800090003917 x12: ffff80001000391f
[ 1903.790073] x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f
[ 1903.790076] x9 : ffff800010003c30 x8 : 2a2a2a2a2a2a2a2a
[ 1903.790078] x7 : 2a2a2a2a2a2a2a09 x6 : c0000000ffffefff
[ 1903.790081] x5 : ffff5f6eee7b7958 x4 : ffffdd1d4f257a48
[ 1903.790084] x3 : 0000000000000000 x2 : ffffdd1d4d6e0d70
[ 1903.790087] x1 : ffff5f6d88161d80 x0 : 0000000100010000
[ 1903.790090] Call trace:
[ 1903.790093] tegra234_cbb_isr+0x130/0x170
[ 1903.790096] __handle_irq_event_percpu+0x68/0x2a0
[ 1903.790099] handle_irq_event_percpu+0x40/0xa0
[ 1903.790102] handle_irq_event+0x50/0xf0
[ 1903.790104] handle_fasteoi_irq+0xc0/0x170
[ 1903.790106] generic_handle_irq+0x40/0x60
[ 1903.790109] __handle_domain_irq+0x70/0xd0
[ 1903.790110] gic_handle_irq+0x68/0x134
[ 1903.790112] el1_irq+0xd0/0x180
[ 1903.790116] pcie_dma_epf_exit+0xa4/0x66c [pci_epf_dma_test]
[ 1903.790118] seq_read_iter+0x1d4/0x490
[ 1903.790120] seq_read+0xf4/0x150
[ 1903.790123] full_proxy_read+0x6c/0xa0
[ 1903.790125] vfs_read+0xb4/0x1c0
[ 1903.790127] ksys_read+0x7c/0x110
[ 1903.790128] __arm64_sys_read+0x28/0x40
[ 1903.790131] el0_svc_common.constprop.0+0x80/0x1d0
[ 1903.790133] do_el0_svc+0x38/0xb0
[ 1903.790136] el0_svc+0x1c/0x30
[ 1903.790137] el0_sync_handler+0xa8/0xb0
[ 1903.790139] el0_sync+0x16c/0x180
[ 1903.790140] ---[ end trace 2c56e921df5b9795 ]---
[ 1908.980198] pcie_dma_epf tegra_pcie_dma_epf.0: edma_submit_direct_txrx: DD WR CH: 0 TO
[ 1908.988422] pcie_dma_epf tegra_pcie_dma_epf.0: stress_test: DD stress failed

Hi,
We would like to understand your setup. Do you connect the Orin Nano to an x86 PC through PCIe? Through which PCIe slot on the Orin Nano developer kit? Is the Orin Nano in RP mode or EP mode?

The Nano runs in EP mode and the x86 PC runs in RP mode. pcitest runs fine without DMA (just memcpy), but with DMA it reports an IOMMU error!