CBB & BadTLP error when running DMA transfer with PCIe device (C5 port) - works fine on Xavier

I am porting an existing Xavier AGX design to Orin.
The design connects via PCIe (C5 x8 gen3) to an FPGA on a custom baseboard. The FPGA uses DMA transfers (Intel DMA engine) to move data to and from the Jetson (using GPUDirect RDMA).
The design works fine and is stable on Xavier, but on Orin it crashes within the first 0 to 30 s with BadTLP errors from PCIe and CBB errors.
Any inputs are welcome.

[ 57.157712] gxyfpga 0005:01:00.0: Starting mSGDMA(0)
[ 57.157819] gxyfpga 0005:01:00.0: Starting mSGDMA(1)
[ 73.372994] pcieport 0005:00:00.0: AER: Multiple Corrected error received: 0005:00:00.0
[ 73.503464] pcieport 0005:00:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
[ 73.503754] pcieport 0005:00:00.0: device [10de:229a] error status/mask=00000041/0000e000
[ 73.503993] pcieport 0005:00:00.0: [ 0] RxErr (First)
[ 73.504189] pcieport 0005:00:00.0: [ 6] BadTLP
[ 73.504380] pcieport 0005:00:00.0: AER: Multiple Corrected error received: 0005:00:00.0
[ 73.656260] CPU:0, Error:CBB-EN@0x13a00000,irq=21
[ 73.656391] **************************************
[ 73.656525] * For more Internal Decode Help
[ 73.656640] * http://nv/cbberr
[ 73.656735] * NVIDIA userID is required to access
[ 73.656867] **************************************
[ 73.657001] CPU:0, Error:CBB-EN, Errmon:2
[ 73.657121] Error Code : TIMEOUT_ERR
[ 73.657232] Overflow : Multiple TIMEOUT_ERR
[ 73.657365] First logged Err Code : TIMEOUT_ERR
[ 73.657509] MASTER_ID : CCPLEX
[ 73.657606] Address : 0x3a000814
[ 73.657707] Cache : 0x0 -- Device Non-Bufferable
[ 73.657849] Protection : 0x2 -- Unprivileged, Non-Secure, Data Access
[ 73.658036] Access_Type : Read
[ 73.658037] Fabric : CBB
[ 73.658198] Slave_Id : 0x16
[ 73.658407] Burst_length : 0x0
[ 73.658901] Burst_type : 0x1
[ 73.659397] Beat_size : 0x2
[ 73.659857] VQC : 0x0
[ 73.660280] GRPSEC : 0x7e
[ 73.660746] FALCONSEC : 0x0
[ 73.661208] CBB_SN_PCIE_C5_SLV_TIMEOUT_STATUS : 0x1
[ 73.661971] **************************************
[ 73.666367] ------------[ cut here ]------------
[ 73.666378] WARNING: CPU: 0 PID: 191 at /home/mandre/nvidia/nvidia_sdk/JetPack_5.0.1_DP_Linux_JETSON_AGX_ORIN_TARGETS/Linux_for_Tegra/sources/kernel/nvidia/drivers/platform/tegra/cbb/tegra23x_cbb.c:541 tegra234_cbb_error_isr+0x150/0x1c8
[ 73.687000] Modules linked in: galaxy_fpga(O) nvidia_p2p fuse lzo_rle lzo_compress zram loop aes_ce_blk crypto_simd cryptd nvgpu aes_ce_cipher snd_soc_tegra186_asrc ghash_ce snd_soc_tegra210_ope snd_soc_tegra210_iqc snd_soc_tegra186_dspk sha2_ce snd_soc_tegra186_arad snd_hda_codec_hdmi snd_soc_tegra210_mvc ofpart sha256_arm64 snd_soc_tegra210_admaif snd_soc_tegra210_afc snd_soc_tegra210_dmic snd_soc_tegra210_adsp snd_soc_tegra210_adx snd_soc_tegra_machine_driver cmdlinepart sha1_ce snd_soc_tegra210_amx snd_hda_tegra snd_soc_tegra210_mixer snd_soc_tegra_pcm snd_soc_tegra210_i2s snd_soc_tegra210_sfc snd_soc_tegra_utils nvadsp snd_hda_codec qspi_mtd snd_soc_spdif_tx pwm_fan snd_soc_simple_card_utils snd_soc_tegra210_ahub tegra_bpmp_thermal snd_hda_core mtd nct1008 tegra210_adma spi_tegra114 binfmt_misc ina3221 ip_tables x_tables
[ 73.687067] CPU: 0 PID: 191 Comm: irq/50-aerdrv Tainted: G O 5.10.65-tegra #12
[ 73.687068] Hardware name: /, BIOS r34.1-975eef6 05/16/2022
[ 73.687070] pstate: 60400089 (nZCv daIf +PAN -UAO -TCO BTYPE=--)
[ 73.687072] pc : tegra234_cbb_error_isr+0x150/0x1c8
[ 73.687073] lr : tegra234_cbb_error_isr+0xd0/0x1c8
[ 73.687074] sp : ffff800010003ba0
[ 73.687075] x29: ffff800010003ba0 x28: 0000000000000002
[ 73.687078] x27: 0000000000000001 x26: 0000000000000080
[ 73.687080] x25: 0000000000000001 x24: ffffcbccfdbb6000
[ 73.687082] x23: ffffcbccfd96b880 x22: ffffcbccfe2aa000
[ 73.687084] x21: 0000000000000015 x20: ffffcbccfe12e000
[ 73.687086] x19: ffffcbccfe12e240 x18: ffffffffffffffff
[ 73.687088] x17: 0000000000000000 x16: ffffcbccfd398278
[ 73.687089] x15: ffffcbccfde98908 x14: ffff8000900036d7
[ 73.687091] x13: ffff8000100036e5 x12: ffffffffffffffff
[ 73.687093] x11: 0000000005f5e0ff x10: ffff800010003630
[ 73.687095] x9 : 00000000ffffffd0 x8 : 2a2a2a2a2a2a2a2a
[ 73.687097] x7 : ffffcbccfdf0f898 x6 : c0000000ffffefff
[ 73.687098] x5 : ffff36ca7fdcc958 x4 : ffffcbccfdeb7898
[ 73.687100] x3 : 0000000000000001 x2 : ffffcbccfc3347e8
[ 73.687102] x1 : ffff36c306030e80 x0 : 0000000000010101
[ 73.687105] Call trace:
[ 73.687106] tegra234_cbb_error_isr+0x150/0x1c8
[ 73.687112] __handle_irq_event_percpu+0x60/0x280
[ 73.687114] handle_irq_event_percpu+0x40/0x98
[ 73.687115] handle_irq_event+0x4c/0xe8
[ 73.687117] handle_fasteoi_irq+0xb4/0x158
[ 73.687119] generic_handle_irq+0x3c/0x58
[ 73.687121] __handle_domain_irq+0x68/0xc0
[ 73.687123] gic_handle_irq+0x64/0x130
[ 73.687125] el1_irq+0xcc/0x180
[ 73.687126] __do_softirq+0xa0/0x3d4
[ 73.687129] irq_exit+0xd8/0xe0
[ 73.687131] __handle_domain_irq+0x6c/0xc0
[ 73.687132] gic_handle_irq+0x64/0x130
[ 73.687133] el1_irq+0xcc/0x180
[ 73.687137] _raw_spin_unlock_irqrestore+0x60/0x68
[ 73.687142] pci_bus_read_config_dword+0xa8/0xf0
[ 73.687144] pci_read_config_dword+0x44/0x70
[ 73.687147] find_device_iter+0x16c/0x178
[ 73.687148] pci_walk_bus+0x60/0xb8
[ 73.687150] find_source_device+0x54/0x78
[ 73.687152] aer_isr+0x198/0x4c0
[ 73.687154] irq_thread_fn+0x30/0xa0
[ 73.687155] irq_thread+0x164/0x250
[ 73.687158] kthread+0x158/0x160
[ 73.687159] ret_from_fork+0x10/0x18
[ 73.687160] ---[ end trace 6c00692d19ae4ab4 ]---
[ 73.691773] CPU:0, Error:CBB-EN@0x13a00000,irq=21
[ 73.696449] **************************************
[ 73.701437] * For more Internal Decode Help
[ 73.705636] * http://nv/cbberr
[ 73.709048] * NVIDIA userID is required to access
[ 73.713599] **************************************
[ 73.718411] CPU:0, Error:CBB-EN, Errmon:2
[ 73.722439] Error Code : TIMEOUT_ERR
[ 73.726375] Overflow : Multiple TIMEOUT_ERR
[ 73.731015] First logged Err Code : TIMEOUT_ERR
[ 73.735911] MASTER_ID : CCPLEX
[ 73.739323] Address : 0x3a080090
[ 73.743000] Cache : 0x1 -- Bufferable
[ 73.747198] Protection : 0x2 -- Unprivileged, Non-Secure, Data Access
[ 73.753848] Access_Type : Read
[ 73.753850] Fabric : CBB
[ 73.760061] Slave_Id : 0x16
[ 73.763212] Burst_length : 0x0
[ 73.766624] Burst_type : 0x1
[ 73.769861] Beat_size : 0x2
[ 73.773011] VQC : 0x0
[ 73.775898] GRPSEC : 0x7e
[ 73.779049] FALCONSEC : 0x0
[ 73.782199] CBB_SN_PCIE_C5_SLV_TIMEOUT_STATUS : 0x1
[ 73.787274] **************************************
[ 73.792096] ------------[ cut here ]------------
[ 73.792101] WARNING: CPU: 0 PID: 191 at /home/mandre/nvidia/nvidia_sdk/JetPack_5.0.1_DP_Linux_JETSON_AGX_ORIN_TARGETS/Linux_for_Tegra/sources/kernel/nvidia/drivers/platform/tegra/cbb/tegra23x_cbb.c:541 tegra234_cbb_error_isr+0x150/0x1c8
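
For anyone reading the AER lines above: the `error status/mask=00000041/0000e000` word can be decoded by hand. The following is only a minimal sketch, assuming the standard bit layout of the PCIe AER Correctable Error Status register and the short labels the Linux AER driver prints; the helper names `decode_bits` and `parse_status_mask` are made up for this illustration and are not part of any driver.

```python
# Bit positions of the PCIe AER Correctable Error Status register,
# labelled with the short names the Linux AER driver uses in dmesg.
AER_CORRECTABLE_BITS = {
    0: "RxErr",
    6: "BadTLP",
    7: "BadDLLP",
    8: "Rollover",
    12: "Timeout",
    13: "NonFatalErr",
    14: "CorrIntErr",
    15: "HeaderOF",
}


def decode_bits(value):
    """Return the names of all bits set in a 32-bit status or mask word."""
    names = []
    for bit in range(32):
        if value & (1 << bit):
            names.append(AER_CORRECTABLE_BITS.get(bit, f"bit{bit}(reserved)"))
    return names


def parse_status_mask(line):
    """Extract (status, mask) from an 'error status/mask=XXXXXXXX/YYYYYYYY' line."""
    field = line.split("status/mask=")[1].split()[0]
    status_hex, mask_hex = field.split("/")
    return int(status_hex, 16), int(mask_hex, 16)


# Line copied from the log above (timestamp removed).
line = "pcieport 0005:00:00.0: device [10de:229a] error status/mask=00000041/0000e000"
status, mask = parse_status_mask(line)
print("reported:", decode_bits(status))
print("masked:  ", decode_bits(mask))
```

With the values from this log, the status word has bits 0 and 6 set, which matches the `RxErr` and `BadTLP` lines the kernel printed, and the mask covers bits 13 to 15 only, so neither of the reported errors is masked.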

Is there any change (from stock SW) in the SW configuration between Xavier and Orin, in particular w.r.t. disabling the SMMU etc.?
Also, have you tried the same BSP version + your SW stack on both Xavier and Orin?

Orin runs on 5.0.1_LP in 40W mode, with some changes to the device tree made via an overlay. The most relevant changes are disabling all PCIe controllers except C5 (as RP), disabling MGBE, and running HDMI instead of DP. No changes to the SMMU.
Xavier runs on 4.6.1. Upgrading it to 5.0.1_LP takes quite some time because its DT is managed more manually. No changes to the SMMU.

I have now tried 5.0.1_LP on Xavier as well, and that works fine. So the problem only exists on Orin.