Using TF-TRT, I’m testing deep learning performance improvements on the Jetson TX2 and Jetson AGX Xavier.
I created a TF-TRT converter and applied it to the BlitzNet (https://github.com/dvornikita/blitznet) network.
The TF-TRT converter performs the following steps:
- Load the frozen graph
- Import the converted trt_graph
- Save the graph as a ‘SavedModel’
I first tested it on a PC and then ran it on the Jetson boards, but the performance gap between the TX2 and Xavier is extreme.
Common sense says the Xavier should be fast and the TX2 slow, but in the TF-TRT converter results the TX2 is faster.
TF-TRT conversion result:
- Jetson Xavier: about 25 min
- Jetson TX2: about 8 min
And… SavedModel (TF-TRT graph) restore result:
- Jetson Xavier: about 10~11 min
- Jetson TX2: about 14 sec
Is this a normal situation?
I’m really confused.
Jetson system configurations:
- Jetson AGX Xavier (MAX-N)
JetPack 4.1.1DP, TensorFlow 1.12.0(TensorFlow for JetPack), CUDA 10, TensorRT 5
- Jetson TX2 (MAX-N)
JetPack 3.3, TensorFlow 1.9.0(TensorFlow for JetPack), CUDA 9, TensorRT 4
Have you maximized the device performance before testing?
Thanks for answer.
Yes, I tried that, but the result is the same.
Xavier is still too slow…
Could you enable device placement logging in TensorFlow first?
This will give you some hardware information and help us to give a further suggestion.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
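Since the slowdown shows up when restoring the SavedModel, the logging suggestion above can be combined with the restore step. A minimal sketch (TF 1.x API; `export_dir` is a placeholder):

```python
# Restore the TF-TRT SavedModel with device placement logging enabled,
# to check whether the TRT segments are actually assigned to the GPU.
# export_dir is a hypothetical path to the SavedModel from the conversion step.
import tensorflow as tf

def load_with_placement_log(export_dir):
    config = tf.ConfigProto(log_device_placement=True)
    graph = tf.Graph()
    with tf.Session(graph=graph, config=config) as sess:
        # Each op's assigned device (GPU:0 vs CPU:0) is printed to stderr
        # as the graph is loaded and run.
        tf.saved_model.loader.load(
            sess, [tf.saved_model.tag_constants.SERVING], export_dir)
    return graph
```

The placement log should make it visible if, on one board, large parts of the graph fall back to the CPU.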
I’m facing the same issue as well.
The TRT-optimized model is roughly 2~3 times faster on the Jetson TX2 than on the Xavier.
For TX2 - TRT version 4.x.x
For Xavier - TRT version 5.x.x
Since the TRT versions differ, I had to rebuild the TRT-optimized graph on each device, and I ran into the issues mentioned above.
We have been working with the TX2 and Xavier too, and obtained similar results when running TensorFlow object detection with TRT-optimized graphs.
Any pointers on this would be greatly appreciated.
It’s recommended to use pure TRT rather than TF-TRT.
Here is our sample for TensorRT with TF object detection API.
How can I convert a TensorFlow 2 SavedModel to the TRT UFF format? I couldn’t find any solution yet.