Throughput interference between host and BF2

I have two directly connected servers. On the first server, I’m running one DPDK application on the host and another DPDK application on the BlueField-2 (BF2) DPU. The second server runs a multi-flow client application in which each flow downloads data from the first server (from either the host or the DPU). The problem is that although I get the maximum throughput when only the host DPDK application serves the flows, the aggregate throughput degrades when some flows download data from the BF2 DPU. The degradation looks proportional to the BF2's throughput. Is there a hardware limit on throughput, or some performance interference, when both the host and the DPU are used? My expectation is that the aggregate throughput should always equal the physical link bandwidth.

Do you mean downloading from the host and the DPU at the same time? If so, that would be expected, since the Arm CPU on the DPU is not as powerful as the host CPU.
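To check whether the drop really scales with the DPU's share of the traffic, one option is to repeat the test with different host/DPU flow splits and fit a straight line to the results. The following is only a rough sketch: the function name `fit_degradation` and its inputs are hypothetical, and it assumes you have already recorded, for each run, the BF2 throughput and the aggregate throughput in the same units (for example Gbps).

```python
def fit_degradation(bf2_tput, aggregate_tput):
    """Least-squares fit of: aggregate = intercept + slope * bf2.

    bf2_tput       -- per-run throughput served by the DPU (hypothetical input)
    aggregate_tput -- per-run host + DPU throughput, same units and run order
    Returns (slope, intercept).
    """
    n = len(bf2_tput)
    if n < 2 or n != len(aggregate_tput):
        raise ValueError("need at least two paired samples")

    mean_x = sum(bf2_tput) / n
    mean_y = sum(aggregate_tput) / n

    # Sum of squared deviations of x, and of x/y co-deviations.
    sxx = sum((x - mean_x) ** 2 for x in bf2_tput)
    if sxx == 0:
        raise ValueError("BF2 throughput must vary across runs")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(bf2_tput, aggregate_tput))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

A slope near zero would mean the aggregate stays flat regardless of how much the DPU serves. A clearly negative slope would mean each unit of throughput served by the DPU costs roughly that fraction of aggregate throughput, and the intercept approximates the host-only figure.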