TensorFlow to C++

Hi,
I’ve been working with CUDA for the last couple of years on desktops/servers.
I’ve been looking for quite some time for a TensorFlow-to-C++ demo/tutorial: taking a TensorFlow model, running inference in a C++ app, optimizing it, etc.
I’ve looked at the Hello AI demo, but as far as I could tell it doesn’t show this.
Any pointers?

Also, there is something else I haven’t fully understood. Once I have a trained net in TF, do I have to convert it to UFF/ONNX and then somehow to NVIDIA’s TensorRT plan? Why so many error-prone steps? Isn’t there a simpler way to take a trained net in TF and run inference with C++ TensorRT?

Hope this makes sense :)

thanks
Eyal

Hi,

These are two different frameworks: TensorFlow and TensorRT.

1.

It’s possible to run TensorFlow inference through the C++ interface without converting the model.
You will need the TensorFlow C++ library. Please check this topic for some information:

2.

However, we recommend converting your model to TensorRT, which is an optimizer for GPU-based inference.
The first step is to check whether all the operations you use are supported by TensorRT:
https://docs.nvidia.com/deeplearning/sdk/tensorrt-archived/tensorrt-601/tensorrt-support-matrix/index.html#supported-ops

If yes, you will need to convert the TensorFlow model into an intermediate format (UFF/ONNX) as TensorRT input.

3.

There is also an alternative: converting the model to TensorRT from within TensorFlow directly.
You can check this sample for information:

Thanks.

Hi,
I’ve looked a bit more, and if I understand this correctly, option 3 that you mention is the direct TF-TRT path?
I.e., converting the graph to an optimized TensorRT plan file ready to run in a C++ application? The link you put in option 3 seems not relevant?

Also, if I do manage to create a .plan file for TensorRT from within TensorFlow on, say, a desktop with a GTX card, isn’t there an issue taking it to a different platform (for example Xavier)? Isn’t the conversion platform-dependent?

thanks a lot
Eyal

Hi,

There should be a corresponding function in the C++ interface, although the tutorial is Python-based.

The plan file is not portable.
TensorRT chooses an optimal algorithm based on the GPU architecture when creating the engine.
This prevents you from using a plan file created on a different platform.

However, a UFF file is portable.
It is an intermediate description of the model that is independent of the GPU architecture.

Thanks.

Hi,
Thanks again for the answers.
I guess something is still missing for me.
Assuming the development work is NOT done on Xavier, how would I run an optimized plan in C++ on the Xavier itself?

If I understood you correctly, I have only two options:

  • Develop, build the TF net, and save the TF output to TRT, ALL on the target machine (Xavier).
  • Develop on whatever platform I’d like (not Xavier) and convert via ONNX/UFF for Xavier.

Is that correct? If so, this is extremely cumbersome :(

thanks
Eyal

Hi,

In general, the workflow is like this:

  1. Train your model on the host.
  2. Convert your model into .uff or .onnx.
  3. Copy the file from step 2 to the device.
  4. Create a TensorRT engine from the file.

Thanks.