I am running a network that requires a lot of CPU <-> GPU transfers. AFAIK, the CPU and GPU share unified (common) memory, so using a zero-copy technique should be possible.
I’ve read a topic about TX1, https://devtalk.nvidia.com/default/topic/996820/jetson-tx1/zero-copy-for-tensorflow/, in which the answer states that the TensorFlow sources need to be modified.
This article http://arrayfire.com/zero-copy-on-tegra-k1/ gives some direction on what modifications are needed.
But I am already using the specific implementation by NVIDIA that is included in JetPack (I use v4.2).
Does this version use zero-copy?
Are there plans to implement such an improvement in JetPack?
Is it possible to get the sources so that I can modify them to enable zero-copy?
Has anyone succeeded in implementing zero-copy on Jetson AGX Xavier?
Thanks in advance for answers.