Here is a quote from the "How to Optimize Data Transfers in CUDA C/C++" blog post:
The peak bandwidth between the device memory and the GPU is much higher (144 GB/s on the NVIDIA Tesla C2050, for example) than the peak bandwidth between host memory and device memory (8 GB/s on PCIe x16 Gen2).
What is a practical/achievable memory transfer rate for the Xavier processor?
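One way to get a measured figure rather than a peak spec is to time a large copy from pinned host memory with CUDA events. The following is only a minimal sketch, not a definitive benchmark: the 256 MiB buffer size, the repetition count, and the `gb_per_s` and `CHECK` helper names are assumptions chosen for illustration, and it measures the host-to-device direction only.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: convert bytes moved and elapsed milliseconds
// into GB/s (using 1 GB = 1e9 bytes).
static double gb_per_s(size_t bytes, float ms) {
    return (static_cast<double>(bytes) / 1e9) / (static_cast<double>(ms) / 1e3);
}

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s failed: %s\n", #call,                 \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

int main() {
    const size_t bytes = 256u << 20;  // 256 MiB per copy (assumed size)
    const int reps = 10;              // number of timed copies (assumed)

    void *h_buf = nullptr;
    void *d_buf = nullptr;
    CHECK(cudaMallocHost(&h_buf, bytes));  // pinned (page-locked) host buffer
    CHECK(cudaMalloc(&d_buf, bytes));      // device buffer

    cudaEvent_t start, stop;
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&stop));

    // Untimed warm-up copy so one-time setup cost is not measured.
    CHECK(cudaMemcpy(d_buf, h_buf, bytes, cudaMemcpyHostToDevice));

    CHECK(cudaEventRecord(start));
    for (int i = 0; i < reps; ++i) {
        CHECK(cudaMemcpy(d_buf, h_buf, bytes, cudaMemcpyHostToDevice));
    }
    CHECK(cudaEventRecord(stop));
    CHECK(cudaEventSynchronize(stop));

    float ms = 0.0f;
    CHECK(cudaEventElapsedTime(&ms, start, stop));
    printf("Host to device: %.2f GB/s\n", gb_per_s(bytes * reps, ms));

    CHECK(cudaEventDestroy(start));
    CHECK(cudaEventDestroy(stop));
    CHECK(cudaFree(d_buf));
    CHECK(cudaFreeHost(h_buf));
    return 0;
}
```

Built with `nvcc` and run on the target board, this prints a single averaged rate. Swapping the copy arguments and using `cudaMemcpyDeviceToHost` would time the opposite direction.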
TIA