TensorFlow to C++

I’ve been working with CUDA for the last couple of years on desktops/servers.
I’ve been looking for quite some time for a TensorFlow-to-C++ demo/tutorial: taking a trained TensorFlow model, running inference on it in a C++ app, optimizing it, etc.
I’ve looked at the Hello AI demo, but as far as I can tell it doesn’t cover this.
Any pointers?

Also, this is something else I haven’t fully understood. Once I have a trained net in TF, must I convert it to UFF/ONNX and then somehow to NVIDIA’s TensorRT plan? Why so many error-prone steps? Isn’t there something simpler to take a trained net in TF and run inference with C++ TensorRT?

Hope this makes sense :)



These are two different frameworks: TensorFlow and TensorRT.


It’s possible to run inference on a TensorFlow model through the C++ interface without converting the model.
You will need the TensorFlow C++ library. Please check this topic for some information:
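A minimal sketch of what this looks like, assuming a TensorFlow C++ build is available and the model was exported as a SavedModel (the model path and tensor names below are placeholders that depend on how the model was exported):

```cpp
// Sketch: running inference on a TensorFlow SavedModel from C++.
// Requires linking against the TensorFlow C++ library.
#include <vector>
#include "tensorflow/cc/saved_model/loader.h"
#include "tensorflow/cc/saved_model/tag_constants.h"
#include "tensorflow/core/framework/tensor.h"

int main() {
  tensorflow::SavedModelBundle bundle;
  tensorflow::SessionOptions session_options;
  tensorflow::RunOptions run_options;

  // Load the exported SavedModel directory (placeholder path).
  TF_CHECK_OK(tensorflow::LoadSavedModel(
      session_options, run_options, "/path/to/saved_model",
      {tensorflow::kSavedModelTagServe}, &bundle));

  // Build an input tensor; the shape is model-specific.
  tensorflow::Tensor input(tensorflow::DT_FLOAT,
                           tensorflow::TensorShape({1, 224, 224, 3}));

  // Run the session; the tensor names are placeholders — inspect your
  // SavedModel's signature (e.g. with saved_model_cli) for the real ones.
  std::vector<tensorflow::Tensor> outputs;
  TF_CHECK_OK(bundle.session->Run(
      {{"input_tensor:0", input}}, {"output_tensor:0"}, {}, &outputs));
  return 0;
}
```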


However, we recommend converting your model to TensorRT, which is an optimizer for GPU-based inference.
The first step is to check whether all the operations used are supported by TensorRT:

If yes, you will need to convert the TensorFlow model into an intermediate format (UFF/ONNX) that TensorRT takes as input.
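Once you have an intermediate file, TensorRT's C++ API can parse it and build an optimized engine. A sketch for the ONNX path, assuming TensorRT with the ONNX parser (`libnvinfer`, `libnvonnxparser`) is installed; the file name is a placeholder and error handling is abbreviated:

```cpp
// Sketch: building a TensorRT engine from an ONNX file with the C++ API.
#include <iostream>
#include "NvInfer.h"
#include "NvOnnxParser.h"

// TensorRT requires a logger implementation.
class Logger : public nvinfer1::ILogger {
  void log(Severity severity, const char* msg) noexcept override {
    if (severity <= Severity::kWARNING) std::cout << msg << std::endl;
  }
};

int main() {
  Logger logger;
  auto* builder = nvinfer1::createInferBuilder(logger);
  const auto flags = 1U << static_cast<uint32_t>(
      nvinfer1::NetworkDefinitionCreationFlag::kEXPLICIT_BATCH);
  auto* network = builder->createNetworkV2(flags);

  // Parse the ONNX model; unsupported operations are reported here.
  auto* parser = nvonnxparser::createParser(*network, logger);
  if (!parser->parseFromFile("model.onnx",
          static_cast<int>(nvinfer1::ILogger::Severity::kWARNING))) {
    std::cerr << "Failed to parse ONNX model" << std::endl;
    return 1;
  }

  auto* config = builder->createBuilderConfig();
  config->setMaxWorkspaceSize(1 << 28);  // 256 MiB of scratch space

  // Builds the engine optimized for the *current* GPU.
  auto* engine = builder->buildEngineWithConfig(*network, *config);
  return engine != nullptr ? 0 : 1;
}
```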


There is also an alternative: converting the model to TensorRT directly within TensorFlow (TF-TRT).
You can check this sample for information:


I’ve looked a bit more, and if I understand correctly, option 3 that you mention is the direct TF-TRT path?
I.e., convert the graph to an optimized TensorRT plan file ready to run in a C++ application? The link you posted for option 3 doesn’t seem relevant?

Also, if I do manage to create a .plan file for TensorRT from within TensorFlow on, say, a desktop with a GTX card, isn’t there an issue taking it to a different platform (for example Xavier)? Isn’t the conversion platform-dependent?

thanks a lot


There should be a corresponding function in the C++ interface, although the tutorial is Python-based.

The plan file is not portable.
TensorRT will choose an optimal algorithm based on the GPU architecture when creating the engine.
This prevents you from using a plan file created on a different platform.

However, a UFF file is portable.
It is an intermediate model description that is independent of the GPU architecture.
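To make the portability point concrete: a plan file is just the serialized engine, so it can be cached and reloaded to skip the (slow) build step, but only on the same GPU architecture and TensorRT version that produced it. A sketch, assuming `engine` and `logger` come from an earlier build step and the file name is a placeholder:

```cpp
// Serialize the optimized engine to a plan file (architecture-specific).
nvinfer1::IHostMemory* plan = engine->serialize();
std::ofstream out("model.plan", std::ios::binary);
out.write(static_cast<const char*>(plan->data()), plan->size());

// Later, on the SAME platform, reload the plan without rebuilding.
// On a different GPU architecture this deserialization will fail.
std::ifstream in("model.plan", std::ios::binary);
std::string blob((std::istreambuf_iterator<char>(in)),
                 std::istreambuf_iterator<char>());
nvinfer1::IRuntime* runtime = nvinfer1::createInferRuntime(logger);
nvinfer1::ICudaEngine* reloaded =
    runtime->deserializeCudaEngine(blob.data(), blob.size());
```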


Thanks again for the answers.
I guess something is still missing for me.
Assuming the development work is NOT done on Xavier, how would I run an optimized plan in C++ on the Xavier itself?

If I understood correctly from what you said, I have only two options?

  • Develop and build the TF net, and convert the TF output to TRT, all on the target machine (the Xavier).
  • Develop on whatever platform I’d like (not the Xavier) and convert via ONNX/UFF for the Xavier.

Is that correct? If so, this is extremely cumbersome :(



In general, the workflow is like this:

  1. Train your model on the host.
  2. Convert your model into .uff or .onnx.
  3. Copy the file from step 2 to the device.
  4. Create a TensorRT engine from the file.
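Step 4 typically means building the engine from the intermediate file once on the device (as in the ONNX-parsing sketch above), caching it as a plan, and then deserializing it at startup. A sketch of the deserialize-and-run part on the device; buffer sizes and the file name are placeholders for a model with one input and one output binding:

```cpp
// Sketch: load a cached TensorRT plan on the device and run inference.
#include <fstream>
#include <iterator>
#include <string>
#include <cuda_runtime_api.h>
#include "NvInfer.h"

class Logger : public nvinfer1::ILogger {
  void log(Severity severity, const char* msg) noexcept override {}
};

int main() {
  Logger logger;

  // Read the plan file built on this same device.
  std::ifstream file("model.plan", std::ios::binary);
  std::string blob((std::istreambuf_iterator<char>(file)),
                   std::istreambuf_iterator<char>());

  auto* runtime = nvinfer1::createInferRuntime(logger);
  auto* engine = runtime->deserializeCudaEngine(blob.data(), blob.size());
  auto* context = engine->createExecutionContext();

  // Allocate device buffers, one per binding (sizes are model-specific).
  void* buffers[2];
  cudaMalloc(&buffers[0], 1 * 224 * 224 * 3 * sizeof(float));  // input
  cudaMalloc(&buffers[1], 1 * 1000 * sizeof(float));           // output

  // ... cudaMemcpy input data into buffers[0] ...
  context->executeV2(buffers);  // synchronous inference
  // ... cudaMemcpy results back from buffers[1] ...
  return 0;
}
```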