DMA transfer error : pcieport 0000:00:01.0: AER: Multiple Uncorrected (Fatal) error received: id=0020

lspci -t output:
t186:/ # lspci -t
-[0000:00]---01.0-[01]----00.0

Boot messages:
[ 3.252175] tegra-pcie 10003000.pcie-controller: link 2 down, retrying
[ 3.254209] tegra-pcie 10003000.pcie-controller: link 2 down, ignoring
[ 3.254549] tegra-pcie 10003000.pcie-controller: PCI host bridge to bus 0000:00
[ 3.254557] pci_bus 0000:00: root bus resource [mem 0x50100000-0x57ffffff]
[ 3.254563] pci_bus 0000:00: root bus resource [mem 0x58000000-0x7fffffff pref]
[ 3.254569] pci_bus 0000:00: root bus resource [bus 00-ff]
[ 3.254573] pci_bus 0000:00: root bus resource [io 0x1000-0xffff]
[ 3.254915] pci 0000:00:01.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[ 3.255452] PCIE: ASPM not enabled
[ 3.255490] pci 0000:00:01.0: BAR 8: assigned [mem 0x50400000-0x50ffffff]
[ 3.255498] pci 0000:01:00.0: BAR 1: assigned [mem 0x50800000-0x50ffffff]
[ 3.255508] pci 0000:01:00.0: BAR 0: assigned [mem 0x50400000-0x504fffff]
[ 3.255518] pci 0000:00:01.0: PCI bridge to [bus 01]
[ 3.255529] pci 0000:00:01.0: bridge window [mem 0x50400000-0x50ffffff]
[ 3.255679] pcieport 0000:00:01.0: Signaling PME through PCIe PME interrupt
[ 3.255683] pci 0000:01:00.0: Signaling PME through PCIe PME interrupt
[ 3.255897] tegra-pcie 10003000.pcie-controller: speed change : Gen-1 -> Gen-2

In my device driver, I allocate a DMA buffer with pci_alloc_consistent() and pass dma_bus_addr to the device. When the device uses this address to write data to the buffer, the following error occurs:

pcieport 0000:00:01.0: AER: Multiple Uncorrected (Fatal) error received: id=0020

ctx->virt_addr = pci_alloc_consistent(pdev, DMA_SIZE, &ctx->dma_bus_addr);
if (NULL == ctx->virt_addr)
{
    ret = -ENOMEM;
    iounmap(ctx->mem_bar);
    iounmap(ctx->reg_bar);
    p_reg_base = NULL;
    p_mem_base = NULL;
    printk(KERN_ERR "pci_alloc_consistent fail...\n");
    goto free_map;
}


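/* Cast explicitly so the arguments match the format specifiers. */
printk(KERN_ERR "alloc dma vir addr = 0x%08lx, dma bus addr = 0x%08llx\n", (unsigned long)ctx->virt_addr, (unsigned long long)ctx->dma_bus_addr);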

[ 504.023699] hello,welcome to module init…
[ 504.028040] fpga_video_probe…
[ 504.031419] Welink FPGA Video 0000:01:00.0: enabling device (0000 -> 0002)
[ 504.038370] device support 64 bits DMA
[ 504.042238] IO BAR barnum = 0 start = 0x50400000 end = 0x504fffff mem_len = 0x100000
[ 504.050098] MEM BAR barnum = 1 start = 0x50800000 end = 0x50ffffff mem_len = 0x800000
[ 504.059810] alloc dma vir addr = 0xffffff8001281000, dma bus addr = 0x80d00000
[ 504.067048] force ulval = 0xfc400000
[ 504.070668] dma buffer phys addr = 0xffffffc081281000
[ 504.077934] FPGA video init OK…

How can I fix it?

Bus address 0x80d00000 looks fine to me.
Can you please provide the complete AER error log?

[ 288.488818] pcieport 0000:00:01.0: AER: Uncorrected (Fatal) error received: id=0020
[ 288.497026] pcieport 0000:00:01.0: PCIe Bus Error: severity=Uncorrected (Fatal), type=Transaction Layer, id=0008(Receiver ID)
[ 288.510524] pcieport 0000:00:01.0: device [10de:10e5] error status/mask=00040000/00000000
[ 288.518941] pcieport 0000:00:01.0: [18] Malformed TLP (First)
[ 288.525799] pcieport 0000:00:01.0: TLP Header: 40000002 00010000 00000000 00000000
[ 289.544902] pcieport 0000:00:01.0: AER: Device recovery failed

I see a ‘Malformed TLP’ issue here.
Can you please check and make sure the TLP framing is correct?
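
As a starting point for that check, the "TLP Header" line in the AER log can be decoded field by field. The sketch below is a small user-space helper, not part of the driver; the names mem_tlp, decode_mem_tlp, byte_enables_valid and print_mem_tlp are made up for illustration. It assumes the logged header is a memory request and uses the standard layout: Fmt/Type/Length in DW0, Requester ID/Tag/byte enables in DW1, and the address in DW2 (or DW2:DW3 for a 4DW header).

```c
#include <stdint.h>
#include <stdio.h>

/* Fields of a memory-request TLP header as captured in the AER header log.
 * All names here are illustrative, not kernel APIs. */
struct mem_tlp {
    unsigned fmt;          /* DW0[31:29]: bit 0 = 4DW header, bit 1 = has data */
    unsigned type;         /* DW0[28:24]: 0 = memory request */
    unsigned length_dw;    /* DW0[9:0]: payload length in DWORDs */
    unsigned requester_id; /* DW1[31:16]: bus[15:8] dev[7:3] fn[2:0] */
    unsigned tag;          /* DW1[15:8] */
    unsigned last_be;      /* DW1[7:4]: Last DW byte enables */
    unsigned first_be;     /* DW1[3:0]: First DW byte enables */
    uint64_t address;      /* DW2 (3DW header) or DW2:DW3 (4DW header) */
};

struct mem_tlp decode_mem_tlp(const uint32_t dw[4])
{
    struct mem_tlp t;

    t.fmt          = (dw[0] >> 29) & 0x7;
    t.type         = (dw[0] >> 24) & 0x1f;
    t.length_dw    = dw[0] & 0x3ff;
    t.requester_id = (dw[1] >> 16) & 0xffff;
    t.tag          = (dw[1] >> 8) & 0xff;
    t.last_be      = (dw[1] >> 4) & 0xf;
    t.first_be     = dw[1] & 0xf;

    if (t.fmt & 0x1)   /* 4DW header: upper address in DW2, lower in DW3 */
        t.address = ((uint64_t)dw[2] << 32) | (uint64_t)(dw[3] & ~0x3u);
    else               /* 3DW header: 32-bit address in DW2 */
        t.address = (uint64_t)(dw[2] & ~0x3u);
    return t;
}

/* Byte-enable rules for memory requests: a 1-DW request must have
 * Last BE = 0000b; a longer request needs non-zero First and Last BE. */
int byte_enables_valid(const struct mem_tlp *t)
{
    if (t->length_dw == 1)
        return t->last_be == 0;
    return t->first_be != 0 && t->last_be != 0;
}

void print_mem_tlp(const struct mem_tlp *t)
{
    printf("fmt=%u type=%u length=%u DW\n", t->fmt, t->type, t->length_dw);
    printf("requester=%02x:%02x.%u tag=%u\n",
           (t->requester_id >> 8) & 0xff,
           (t->requester_id >> 3) & 0x1f,
           t->requester_id & 0x7, t->tag);
    printf("first_be=0x%x last_be=0x%x addr=0x%llx\n",
           t->first_be, t->last_be, (unsigned long long)t->address);
    printf("byte enables %s\n", byte_enables_valid(t) ? "ok" : "INVALID");
}
```

Feeding it the four DWORDs from the log above (40000002 00010000 00000000 00000000) decodes to a 32-bit memory write of 2 DWORDs with requester ID 00:00.1, First/Last byte enables both 0, and target address 0x0. If that reading is right, the zero byte enables on a 2-DW write would already explain the Malformed TLP, and the address does not match the 0x80d00000 bus address printed by the driver, so it may be worth checking how the FPGA builds its write requests and whether the DMA address register was actually programmed.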