Zero copy for TensorFlow

Hi,

Thanks for your question.

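Zero copy has to be requested when the memory is allocated (for example with cudaHostAlloc and the cudaHostAllocMapped flag rather than a plain cudaMalloc), so TensorFlow itself would need to be modified.
If you want your TensorFlow build to support zero copy, you can follow this page: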
http://arrayfire.com/zero-copy-on-tegra-k1/

Also, I think zero copy is possible if your framework accepts GPU input.
For example:

  1. Prepare a shared pointer and create a model that uses this pointer as its input
  2. Load the image data into the shared pointer
  3. Run inference directly from the GPU input layer
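
As a rough illustration of the three steps, here is a minimal sketch using only the CUDA runtime API, not TensorFlow. It assumes a device that supports mapped pinned memory (as Tegra does). The kernel input_layer, the helper run_zero_copy_demo and the fake pixel data are hypothetical placeholders for a real model and image loader.

```cuda
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                            \
    do {                                                            \
        cudaError_t err_ = (call);                                  \
        if (err_ != cudaSuccess) {                                  \
            std::fprintf(stderr, "%s failed: %s\n", #call,          \
                         cudaGetErrorString(err_));                 \
            std::exit(EXIT_FAILURE);                                \
        }                                                           \
    } while (0)

// Hypothetical stand-in for a network's GPU input layer:
// it only normalizes each pixel value to the range [0, 1].
__global__ void input_layer(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] / 255.0f;
}

// Runs the three steps and returns the number of wrong results.
int run_zero_copy_demo(int n) {
    const size_t bytes = static_cast<size_t>(n) * sizeof(float);
    float *h_in = nullptr, *h_out = nullptr;  // CPU views
    float *d_in = nullptr, *d_out = nullptr;  // GPU views of the same memory

    // Step 1: allocate pinned host memory that is mapped into the GPU
    // address space, then get the device pointers that alias it.
    CUDA_CHECK(cudaHostAlloc((void**)&h_in, bytes, cudaHostAllocMapped));
    CUDA_CHECK(cudaHostAlloc((void**)&h_out, bytes, cudaHostAllocMapped));
    CUDA_CHECK(cudaHostGetDevicePointer((void**)&d_in, h_in, 0));
    CUDA_CHECK(cudaHostGetDevicePointer((void**)&d_out, h_out, 0));

    // Step 2: write the "image" straight into the shared buffer (fake data).
    for (int i = 0; i < n; ++i) h_in[i] = static_cast<float>(i % 256);

    // Step 3: the kernel reads the shared buffer directly, with no cudaMemcpy.
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    input_layer<<<blocks, threads>>>(d_in, d_out, n);
    CUDA_CHECK(cudaGetLastError());
    CUDA_CHECK(cudaDeviceSynchronize());  // results are visible to the CPU after this

    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        if (std::fabs(h_out[i] - h_in[i] / 255.0f) > 1e-6f) ++mismatches;
    }

    CUDA_CHECK(cudaFreeHost(h_in));
    CUDA_CHECK(cudaFreeHost(h_out));
    return mismatches;
}

int main() {
    // Enable host memory mapping before any other CUDA work on this device.
    CUDA_CHECK(cudaSetDeviceFlags(cudaDeviceMapHost));
    std::printf("mismatches: %d\n", run_zero_copy_demo(1 << 16));
    return 0;
}
```

Build with something like nvcc zero_copy.cu -o zero_copy (the file name is just an example). In a real framework the allocation in step 1 would have to happen inside the framework's own allocator, which is why the modification mentioned above is needed.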