Issues with A5000 P2P Transfers When Root Complex Splits TLPs into 64B/8B Packets

We have developed an SoC with a PCIe 5.0 root complex (RC) that supports peer-to-peer (P2P) transfers. However, this RC splits TLPs sent by devices into 64-byte packets: for example, a 256-byte packet is split into four 64-byte packets, while packets smaller than 64 bytes are split into 8-byte packets.
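
To make the splitting rule concrete, here is a rough Python model of it. This is only an illustrative sketch: the function name `split_tlp` is made up for this post, the handling of a short final chunk (for example, a 12-byte payload) is an assumption, and payloads larger than 64 bytes that are not a multiple of 64 are rejected because the rule above does not cover them.

```python
def split_tlp(payload_len: int) -> list[int]:
    """Return the payload sizes (in bytes) of the packets after the RC split.

    Illustrative model only, not the actual RC logic.
    """
    if payload_len <= 0:
        raise ValueError("payload length must be positive")

    if payload_len < 64:
        # Packets smaller than 64 bytes are split into 8-byte packets.
        chunk = 8
    elif payload_len % 64 == 0:
        # 64-byte multiples are split into 64-byte packets.
        chunk = 64
    else:
        # Not covered by the description above, so do not guess.
        raise ValueError("case not covered by the described splitting rule")

    sizes = [chunk] * (payload_len // chunk)
    if payload_len % chunk:
        # Assumption: any leftover bytes travel in one shorter final packet.
        sizes.append(payload_len % chunk)
    return sizes


print(split_tlp(256))  # [64, 64, 64, 64]
print(split_tlp(32))   # [8, 8, 8, 8]
```

With this model, a 256-byte write from one GPU reaches the other as four separate 64-byte TLPs, and a 32-byte write arrives as four 8-byte TLPs.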

When testing the P2P functionality of this SoC’s RC on a platform equipped with two NVIDIA A5000 cards using the simpleP2P tool, we encountered two critical issues:

  1. Extremely low bandwidth, approximately 0.5 Gb/s.

  2. Data verification errors.

In comparison, on an Intel platform, the RC does not split packets during peer-to-peer transfers. We would like to ask:
Is packet splitting allowed when performing P2P transfers with A5000 cards? If so, are there any specific constraints on the packet size?