Why is TensorRT faster than TensorFlow?


What is the exact technical reason why TensorRT is faster than TensorFlow or others?

Everywhere I look, I only see claims like “40% faster”, with no technical explanation.

Does anyone have a link to a technical reason?


Please share the model, script, profiler output, and performance output, if you have not already, so that we can help you better.

Alternatively, you can try running your model with the trtexec command.

When measuring model performance, make sure you measure the latency and throughput of the network inference alone, excluding the data pre- and post-processing overhead.
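
As a rough illustration of that advice, here is a minimal timing sketch in Python. It assumes the input batch is already preprocessed and that the inference callable blocks until the result is ready. The names `benchmark` and `fake_infer` are hypothetical and not part of TensorRT or TensorFlow; `fake_infer` only stands in for a real engine call.

```python
import statistics
import time


def benchmark(infer, batch, batch_size, warmup=10, iters=100):
    """Time only the inference call; `batch` must already be preprocessed."""
    # Warm-up runs are discarded so one-time setup costs do not skew results.
    for _ in range(warmup):
        infer(batch)

    latencies_ms = []
    for _ in range(iters):
        start = time.perf_counter()
        # On a GPU, `infer` must wait for the device to finish (synchronize),
        # otherwise only the kernel launch time is measured.
        infer(batch)
        latencies_ms.append((time.perf_counter() - start) * 1e3)

    total_s = sum(latencies_ms) / 1e3
    return {
        "mean_ms": statistics.mean(latencies_ms),
        "median_ms": statistics.median(latencies_ms),
        "throughput_per_s": batch_size * iters / total_s,
    }


def fake_infer(batch):
    # Hypothetical stand-in for a real engine call: sleeps about 1 ms.
    time.sleep(0.001)
    return [x * 2.0 for x in batch]


if __name__ == "__main__":
    stats = benchmark(fake_infer, [1.0] * 8, batch_size=8, warmup=2, iters=20)
    print(stats)
```

Swapping `fake_infer` for the real inference call of each framework, with the same preprocessed batch, should give numbers that can be compared directly, since decoding, resizing, and post-processing stay outside the timed region.
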
Please refer to the links below for more details:




Thank you, NVES!

But this is a more general question.

I am asking why TensorRT in general performs better than TensorFlow on the GPU. How exactly does TensorRT take advantage of the hardware to outperform other machine learning libraries?


We document some of what we do here. Please refer to the following.

Thank you.