Optimize TensorFlow with TensorRT to improve inference timing


I have been using a Jetson Nano to run an object-detection model. At first I ran the inference system on my laptop, and the prediction time was fairly low (~1 sec). I then moved the model as-is to the Jetson Nano and ran it without any optimization to test the timing, and a prediction took about ~10 seconds.

Can I get support to optimize the timing on my system? The details are:

Framework: TensorFlow
Architecture: R-FCN
Inference time: ~10 sec
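For reference, the ~10 sec figure above can be measured consistently with a small harness like the one below (a sketch; `predict` here is a stand-in for the actual R-FCN inference call, which is not shown in the post). Warm-up runs are excluded because the first call typically includes one-time graph or engine initialization.

```python
import time

def benchmark(predict, image, warmup=3, runs=10):
    """Return mean seconds per inference, excluding warm-up calls."""
    for _ in range(warmup):
        predict(image)  # first calls may trigger one-time initialization
    start = time.perf_counter()
    for _ in range(runs):
        predict(image)
    return (time.perf_counter() - start) / runs

# Example with a trivial stand-in for the real model:
mean_s = benchmark(lambda img: sum(img), list(range(1000)))
print(f"mean latency: {mean_s * 1000:.3f} ms")
```

Comparing this number before and after optimization makes the speed-up easy to quantify.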

I suppose we could use TensorRT to improve efficiency and timing; however, I would like to know whether there is something already available that can be leveraged, or whether we can get support to improve it.

Also, my understanding is that the Jetson Nano was developed to run ML algorithms, so I would be grateful for literature and examples on how to take full advantage of the hardware.

Luis G.


We don’t have a sample for R-FCN directly.
But there are still some object-detection tutorials you can use for reference.

In general, there are two possible approaches for a TensorFlow-based model: TF-TRT and pure TensorRT.

Pure TensorRT gives you the best performance since we have optimized it for the Jetson platform.
However, not all TensorFlow operations are supported, which may require you to write a plugin layer on your own.

TF-TRT automatically falls back to TensorFlow for unsupported layers, which frees the user from implementing a plugin layer.
But we have found that TF-TRT doesn’t give good performance on Jetson and also consumes too much memory.
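If you try the TF-TRT route first, the conversion looks roughly like the following sketch for TF 1.x (the version the tf_trt_models samples target). The frozen-graph path and the output tensor names are assumptions based on a standard TensorFlow Object Detection API export; adjust them for your R-FCN model. This must run on the Nano itself, with a TensorFlow build that includes TF-TRT.

```python
# Sketch only: assumes TF 1.x with TF-TRT support (tensorflow.contrib.tensorrt)
# and a frozen object-detection graph exported from the TF Object Detection API.
import tensorflow as tf
import tensorflow.contrib.tensorrt as trt

with tf.gfile.GFile('frozen_inference_graph.pb', 'rb') as f:  # assumed path
    frozen_graph = tf.GraphDef()
    frozen_graph.ParseFromString(f.read())

trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,
    # Standard Object Detection API output names (verify against your export):
    outputs=['detection_boxes', 'detection_scores',
             'detection_classes', 'num_detections'],
    max_batch_size=1,
    max_workspace_size_bytes=1 << 25,  # keep small: Nano shares 4 GB with the CPU
    precision_mode='FP16')             # Nano's GPU supports fast FP16
```

The resulting `trt_graph` is a regular `GraphDef` you import and run with a normal TensorFlow session; supported subgraphs execute as TensorRT engines and the rest stays in TensorFlow.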

TF-TRT: https://github.com/NVIDIA-AI-IOT/tf_trt_models
TRT: https://github.com/AastaNV/TRT_object_detection