Where can I find good documentation that discusses the differences between TensorRT and TensorFlow-TRT?
Which will be faster, TensorRT or TensorFlow-TRT (with a fully compatible graph), and why?
Is there code available that shows the whole pipeline, from a TensorFlow model (Google’s model zoo) to inference on the dev board (with TensorRT)?
In general, the pure TensorRT API can give you better performance.
TF-TRT integrates TensorRT into the TensorFlow interface, so two implementations (TensorRT and native TensorFlow) sometimes have to coexist to enable fallback for operations that TensorRT does not support.
This mechanism affects memory usage and performance, but it lets users adopt TensorRT easily.
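To make the fallback idea more concrete, here is a toy sketch in plain Python. It is not the TF-TRT API and does not call TensorFlow or TensorRT. The names `SUPPORTED_OPS`, `partition`, and `MyCustomOp` are made up for illustration, the supported-op list is an assumption, and the graph is simplified to a linear sequence of ops. The `min_segment_size` argument is loosely modeled on TF-TRT's `minimum_segment_size` option: runs of supported ops that are too short are left in TensorFlow.

```python
from typing import List, Tuple

# Hypothetical set of ops the accelerated backend can handle. The real list
# depends on the TensorRT and TensorFlow versions in use.
SUPPORTED_OPS = {"Conv2D", "Relu", "MaxPool", "BiasAdd", "MatMul"}


def partition(ops: List[str], min_segment_size: int = 3) -> List[Tuple[str, List[str]]]:
    """Split a linear op sequence into ("TRT", [...]) and ("TF", [...]) segments."""
    # First pass: group consecutive ops by whether they are supported.
    runs: List[Tuple[str, List[str]]] = []
    for op in ops:
        backend = "TRT" if op in SUPPORTED_OPS else "TF"
        if runs and runs[-1][0] == backend:
            runs[-1][1].append(op)
        else:
            runs.append((backend, [op]))

    # Second pass: supported runs that are too short fall back to TensorFlow,
    # and adjacent segments with the same backend are merged.
    segments: List[Tuple[str, List[str]]] = []
    for backend, run in runs:
        if backend == "TRT" and len(run) < min_segment_size:
            backend = "TF"
        if segments and segments[-1][0] == backend:
            segments[-1][1].extend(run)
        else:
            segments.append((backend, list(run)))
    return segments


if __name__ == "__main__":
    # "MyCustomOp" stands in for any op without a TensorRT implementation.
    graph = ["Conv2D", "BiasAdd", "Relu", "MyCustomOp", "MatMul", "Relu"]
    for backend, run in partition(graph):
        print(backend, run)
```

In this toy graph only the first three ops end up in an accelerated segment. The unsupported op and the short run after it stay in TensorFlow, which is the kind of split that costs extra memory and performance compared with a graph that converts entirely.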