How to achieve maximum throughput in PCIe?


We are transferring video frames from a Jetson AGX Xavier configured as a PCIe Endpoint to a Jetson AGX Xavier configured as a PCIe Root Complex. The throughput we achieved is given below:

| PCIe generation | Observed fps | Max bandwidth | Observed bandwidth |
|---|---|---|---|
| PCIe 1.0 | 2660 fps | 2 GB/s | 1.52 GB/s |
| PCIe 2.0 | 5000 fps | 4 GB/s | 2.86 GB/s |
| PCIe 3.0 | 9200 fps | 8 GB/s | 5.2 GB/s |
| PCIe 4.0 | 15600 fps | 16 GB/s | 8.92 GB/s |

We measured the rate at which the Root Complex receives video frames using a GStreamer application with fakesink as the sink. The link uses 8 lanes, and we calculated the throughput from the measured fps. As the table shows, the gap between the maximum and observed bandwidth is large for PCIe 3.0 and PCIe 4.0. How can we improve the throughput?
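To make the gap easier to compare across generations, here is a minimal Python sketch that recomputes the theoretical x8 bandwidth from the per-lane transfer rate and line encoding, then derives the link efficiency and the implied bytes per frame from the table values. It assumes 1 GB = 10^9 bytes and models only the line encoding (8b/10b for Gen 1/2, 128b/130b for Gen 3/4); TLP and DLLP protocol overhead is not modeled. The function names are hypothetical and chosen for illustration only.

```python
# Per-lane raw rate (GT/s) and line-encoding efficiency per PCIe generation.
PCIE_GENS = {
    1: (2.5, 8 / 10),      # 8b/10b encoding
    2: (5.0, 8 / 10),      # 8b/10b encoding
    3: (8.0, 128 / 130),   # 128b/130b encoding
    4: (16.0, 128 / 130),  # 128b/130b encoding
}

# (observed fps, observed GB/s) copied from the table in the post.
OBSERVED = {
    1: (2660, 1.52),
    2: (5000, 2.86),
    3: (9200, 5.20),
    4: (15600, 8.92),
}


def link_bandwidth_gbps(gen, lanes=8):
    """Theoretical one-direction bandwidth in GB/s after line encoding.

    Assumption: protocol (TLP/DLLP) overhead is ignored.
    """
    rate_gt, encoding = PCIE_GENS[gen]
    return rate_gt * encoding * lanes / 8  # divide by 8 bits per byte


def efficiency(gen, lanes=8):
    """Observed bandwidth as a fraction of the theoretical bandwidth."""
    return OBSERVED[gen][1] / link_bandwidth_gbps(gen, lanes)


def implied_frame_bytes(gen):
    """Bytes per frame implied by the observed bandwidth and fps."""
    fps, gbps = OBSERVED[gen]
    return gbps * 1e9 / fps


if __name__ == "__main__":
    for gen in sorted(OBSERVED):
        print(
            f"Gen {gen}: theoretical {link_bandwidth_gbps(gen):.2f} GB/s, "
            f"efficiency {efficiency(gen):.1%}, "
            f"implied frame size {implied_frame_bytes(gen) / 1e3:.0f} kB"
        )
```

With the table values, the efficiency falls from roughly 76% at Gen 1 to roughly 57% at Gen 4, while the implied frame size stays close to 0.57 MB in every row. That suggests the reported bandwidths are internally consistent and the shortfall grows with link speed.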

You can try using CONFIG_PCIE_TEGRA_DW_DMA_TEST; see Enabling CONFIG_PCIE_TEGRA_DW_DMA_TEST jetson xavior - #10 by vidyas


We wrote our PCIe driver based on the CONFIG_PCIE_TEGRA_DW_DMA_TEST code, and we still achieve only the throughput given in the table above.
For PCIe 1.0 and PCIe 2.0, the gap between the maximum and observed throughput is modest. For PCIe 3.0 and PCIe 4.0, however, we achieve only about 55% to 65% of the maximum bandwidth. Is there any way to improve the throughput and get closer to the maximum?

Have you tried setting the system to max performance?
See NVIDIA Jetson Linux Driver Package Software Features : Clock Frequency and Power Management | NVIDIA Docs

I’ve tried setting the system to max performance and tested again, but I didn’t observe any improvement in the throughput.