Hi, I have trained my own model (transfer learning with ResNet-50 as the base model) and ran TensorRT inference two ways: through TF-TRT and through the native TensorRT C++ API. I expected higher performance from the TRT C++ API implementation compared to TF-TRT, but I got the opposite result. Could you please assist? See the chart below:
GPU model: Tesla T4
TF-TRT5 Environment: Ubuntu 16.04.5 LTS | NVIDIA driver 410.72 | TensorRT 5.0 | TensorFlow 1.10 | CUDA 10.0 | ImageNet | script inference.py | docker image nvcr.io/nvidia/tensorflow:18.10-py3
Native TRT5 Environment: Ubuntu 16.04.5 LTS | NVIDIA driver 410.79 | TensorRT 5.0.2 | CUDA 10.0 | script trtexec.cpp | docker image nvcr.io/nvidia/tensorrt:18.11-py3
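For context, the native-TRT numbers came from `trtexec`. A sketch of the kind of invocation used is below; the file names (`resnet50.uff`, input/output tensor names) are placeholders for my model, not the exact values, and the precision/batch flags are the knobs that most affect the comparison with TF-TRT:

```
# Benchmark a UFF-exported model with trtexec (TensorRT 5 flag style).
# File name and tensor names are placeholders for my actual model.
trtexec --uff=resnet50.uff \
        --uffInput=input_1,3,224,224 \
        --output=fc1000/Softmax \
        --batch=8 \
        --fp16 \
        --workspace=1024 \
        --iterations=100 \
        --avgRuns=10
```

One thing worth noting for the comparison: if TF-TRT was run with FP16 enabled but `trtexec` was run without `--fp16` (or with a much smaller `--workspace`), the native path can easily look slower even though the underlying engine is the same.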