Does the Mellanox RNIC issue hardware prefetches to the memory region?

When issuing one-sided RDMA reads to remote nodes, I found that the read throughput measured at the memory channels is greater than the throughput seen by the client application. Is this read amplification caused by hardware prefetches issued by the RNIC?

Thanks!

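This problem is caused not by prefetching but by unaligned access. When a read does not start and end on the memory access granularity, every block it touches is fetched in full, so more bytes are read from memory than are returned to the client.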
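A rough way to see how unaligned access can inflate memory traffic is to count the blocks a read touches. The sketch below assumes memory is fetched in 64-byte blocks (a typical cache-line size, but an assumption about your platform), and the function names `blocks_touched` and `read_amplification` are only illustrative:

```python
def blocks_touched(addr: int, length: int, block: int = 64) -> int:
    """Number of block-sized units covered by the byte range [addr, addr + length)."""
    if length <= 0:
        raise ValueError("length must be positive")
    first = addr // block
    last = (addr + length - 1) // block
    return last - first + 1


def read_amplification(addr: int, length: int, block: int = 64) -> float:
    """Bytes fetched from memory divided by bytes requested by the client."""
    return blocks_touched(addr, length, block) * block / length


if __name__ == "__main__":
    # Aligned 64-byte read: one block, no amplification.
    print(read_amplification(0, 64))   # 1.0
    # Same size, but starting 32 bytes into a block: two blocks are fetched.
    print(read_amplification(32, 64))  # 2.0
    # Small read that straddles a block boundary.
    print(read_amplification(60, 8))   # 16.0
```

If the measured ratio between memory-channel throughput and application throughput is close to what this estimate gives for your offsets and sizes, alignment is the likely cause, and aligning the buffer addresses and read lengths to the block size should close the gap.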
