If I have a TensorFlow model, I have two options to convert it into a TensorRT-optimized model: (i) via TF-TRT, which is relatively easy and simple, and (ii) using the TensorRT C++ API. Starting from the same model, on the same GPU, will both methods, (i) and (ii), give the same performance, i.e., the same FPS? Or will there be a difference in performance? Can you provide a benchmark result comparing them?
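For context, the TF-TRT path (option (i)) is typically only a few lines wrapped around an existing SavedModel. Below is a minimal sketch, assuming TensorFlow 2.x with TensorRT installed; the directory names are placeholders:

```python
# Minimal TF-TRT conversion sketch (TensorFlow 2.x).
# "saved_model_dir" and "trt_saved_model_dir" are placeholder paths.
from tensorflow.python.compiler.tensorrt import trt_convert as trt

params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
    precision_mode=trt.TrtPrecisionMode.FP16)  # FP32/FP16/INT8

converter = trt.TrtGraphConverterV2(
    input_saved_model_dir="saved_model_dir",
    conversion_params=params)
converter.convert()                    # replaces supported subgraphs with TRT ops
converter.save("trt_saved_model_dir")  # writes a SavedModel containing TRT engines
```

Option (ii), building an engine directly with the TensorRT C++ API, requires exporting and parsing the network yourself, which is one reason the two paths can behave differently in practice.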
After starting to try TensorRT optimization, I personally ran into difficulties here and there, so I decided to make a video tutorial on how to optimize deep learning models built with Keras and TensorFlow. I also demonstrate optimizing YOLOv3. I hope it helps those who are just beginning with TensorRT, so they don't run into the same difficulties I experienced.
I didn't find any official benchmark, but in "Deep Learning Inference on PowerEdge R7425" by Dell there is a comparison of the TensorRT API and TF-TRT.
In my research I got similar results, so I can confirm that section of the whitepaper.