Correct way of deploying a tensorflow model on TX2?

I am training an object detection model on my desktop and want to deploy this model on TX2.
I have installed TensorFlow 1.2.1 on the TX2 and tried to load the saved model (in .pb format) in Python.
It turned out the runtime is about 100 times slower than on my desktop (GTX 1060 6GB) when detecting on an image (I tried different architectures and hyperparameters; it always seems to be around 100 times slower). Given the specs of the TX2 compared with the GTX 1060 (1.5 TFLOPS vs. 4.4 TFLOPS?), I don't think it should be that slow. I am not sure whether this is normal or whether I was not running the model in an optimal way.
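As a back-of-envelope sanity check, raw FP32 compute alone would predict only about a 3x gap, not 100x. A minimal sketch using the approximate TFLOPS figures quoted above (these are the thread's rough numbers, not official specs):

```python
# Rough expected slowdown from raw FP32 compute alone.
# Figures are the approximate numbers quoted in this thread.
tx2_tflops = 1.5       # Jetson TX2 (approx.)
gtx1060_tflops = 4.4   # GTX 1060 6GB (approx.)

ratio = gtx1060_tflops / tx2_tflops
print(f"Expected slowdown from raw compute alone: ~{ratio:.1f}x")
```

Anything far beyond that ratio suggests the bottleneck is elsewhere (power mode, memory, or a build that is not actually using the GPU).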

I noticed TensorRT seems to be the way to go, but the current version does not support importing TensorFlow models according to the website (support is planned for version 3).

Another way is to use TensorFlow Serving to serve the model.

Has anyone had experience deploying TensorFlow models on the TX2? What was the runtime performance compared to your desktop, and how did you deploy the model?

Any related information would be really helpful.

Did you utilize the accelerated MAX-P mode?


You may also try TensorRT 2, which was announced with JetPack 3.1.

Thanks! I am a desktop guy and new to Jetson. I switched to MAX-N mode, and it reduced the runtime from 0.5 s to 0.27 s.

I will check TensorRT 2.1 as well.

Thank you again for your reply.

The TX2 may not be a good choice for deep learning training, but it is good for inference.

I have tried a couple of frameworks for inference on it, including MXNet, Caffe2, and TensorFlow (v1.2), and got very good inference performance. TensorFlow takes longer to load than the others, but there are optimizations, such as XLA and model optimization, that should be applied for embedded platforms and mobile devices.

Thanks for your input, I will try to look for the details you mentioned here.

Thanks for the reply. I tried MAX-N mode and it is better.
I noticed TensorRT 2; are there any tutorials on how to load a TensorFlow model with TensorRT 2? I tried to find something but failed.


Try these commands to maximize TX2 performance:

sudo ~/
sudo nvpmodel -m 0

The TensorFlow model parser is a new feature in TensorRT 3.0.
Currently, TensorRT 2.1 can only take a caffemodel as input.


We had the same results on our TX2s. We saw a few odd results across the different performance modes (this may be an issue with using wall-clock time as a benchmark), but overall a TensorFlow model that takes 7-9 seconds on a GTX 980 takes around a minute on the TX2 at best.
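One way to reduce the oddities of timing-based benchmarks is to discard the first few runs, since the first `session.run` calls typically include graph setup and CUDA initialization. A minimal sketch of such a harness (the `fake_inference` stand-in is hypothetical, not the actual model call):

```python
import time

def benchmark(fn, warmup=3, runs=10):
    """Average runtime of fn, discarding warm-up runs.

    On the TX2 the first few session.run calls include graph setup,
    so timing them would inflate the result considerably.
    """
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs

# Hypothetical stand-in for sess.run(...) on a loaded model:
def fake_inference():
    sum(i * i for i in range(10000))

avg = benchmark(fake_inference)
print(f"average inference time: {avg * 1e3:.2f} ms")
```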

We are hoping that TensorRT 3 will be able to cut this time down.

Another avenue for optimisation is the storage we are using. The read speed of eMMC is considerably slower than that of an SSD, so some speed may be gained here.
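To put a number on the storage difference, a rough sequential-read measurement can be sketched in Python. Note that the OS page cache will inflate repeated reads of the same file unless it is dropped first, so treat this as an upper bound on a warm cache:

```python
import os
import tempfile
import time

def read_throughput(path, block=1 << 20):
    """Sequential read throughput in MB/s for a file.

    Without dropping the OS page cache beforehand, a second read of
    the same file mostly measures RAM speed, not eMMC/SSD speed.
    """
    size = os.path.getsize(path)
    start = time.perf_counter()
    with open(path, "rb") as f:
        while f.read(block):
            pass
    return size / (time.perf_counter() - start) / 1e6

# Create a small throwaway test file and measure it:
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(os.urandom(8 << 20))  # 8 MB of random data
    name = f.name
print(f"{read_throughput(name):.0f} MB/s")
os.remove(name)
```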


Do you need to use swap during inference? Swapping will degrade inference performance.
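One quick way to see whether swap is actually being touched during inference is to read `/proc/meminfo`, which works on the TX2's L4T like any other Linux system. A minimal sketch:

```python
def swap_in_use_kb(meminfo_path="/proc/meminfo"):
    """Return kB of swap currently in use (SwapTotal - SwapFree)."""
    fields = {}
    with open(meminfo_path) as f:
        for line in f:
            key, _, rest = line.partition(":")
            fields[key] = int(rest.split()[0])  # values are in kB
    return fields["SwapTotal"] - fields["SwapFree"]

print(f"swap in use: {swap_in_use_kb()} kB")
```

If this number grows while the model runs, the process is being pushed into swap and inference times will suffer.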

No, I did not.
It turned out that the TensorFlow build I had installed was not compiled to properly support the GPU (although it showed GPU information at runtime). I installed a newer version and it is much faster (only about 3 times slower than the GTX 1060 6GB). Thank you for your assistance; I will wait for TensorRT 3 for better performance.


Thanks for the feedback.
Here is a user sharing how to build TensorFlow on Tegra:

Thanks for the information. You guys are very supportive!

Can you tell me how you solved the problem?
Which TensorFlow version did you use?


If you want to build TensorFlow from source, here is an excellent tutorial:

There are also some pre-built .whl files available; please check this topic:

I was not able to compile with the tutorial, but I got a pre-compiled wheel from here:

I was using the 1.3 RC, and I just saw someone updated it to 1.3.

Regarding TensorFlow Serving, you might want to have a look at a project which contains a Dockerfile and .bazelrc to build a Docker image containing the latest TensorFlow Serving, including TensorFlow Core.

The image is based on one that installs the necessary prerequisites. The bigger picture of this project is explained in its documentation.