Loading an optimized saved_model with C++

Hi,

So I was able to convert a model with TensorRT and store it locally:

from tensorflow.python.compiler.tensorrt import trt_convert as trt

converter = trt.TrtGraphConverterV2(
    input_saved_model_dir=saved_model_compiled,
    conversion_params=conversion_params,
)
converter.convert()
converter.build(input_fn=my_input_fn)
converter.save(output_saved_model_dir=output_saved_model_dir)

and I was able to load it back and run inference with Python. However, I'd like to do the loading and inference in C++. Is there an example of how to implement the following Python lines?

import numpy as np
import tensorflow as tf
from tensorflow.python.framework import convert_to_constants
from tensorflow.python.saved_model import signature_constants, tag_constants

saved_model_loaded = tf.saved_model.load(output_saved_model_dir, tags=[tag_constants.SERVING])
graph_func = saved_model_loaded.signatures[signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY]
frozen_func = convert_to_constants.convert_variables_to_constants_v2(graph_func)
x = tf.convert_to_tensor(np.random.normal(size=(infer_size,7,7)).astype(np.float32))
output = frozen_func(x)[0].numpy()
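A minimal sketch of the equivalent loading path with the TensorFlow C++ API, assuming you link against libtensorflow_cc. The input/output tensor names below ("serving_default_input_1:0", "StatefulPartitionedCall:0") are placeholders; check the real names of your serving signature with `saved_model_cli show --all`.

```cpp
#include <vector>

#include "tensorflow/cc/saved_model/loader.h"
#include "tensorflow/cc/saved_model/tag_constants.h"
#include "tensorflow/core/framework/tensor.h"

int main() {
  tensorflow::SavedModelBundle bundle;
  tensorflow::SessionOptions session_options;
  tensorflow::RunOptions run_options;

  // Equivalent of tf.saved_model.load(..., tags=[tag_constants.SERVING]).
  tensorflow::Status status = tensorflow::LoadSavedModel(
      session_options, run_options, "/path/to/output_saved_model_dir",
      {tensorflow::kSavedModelTagServe}, &bundle);
  if (!status.ok()) return 1;

  // Equivalent of tf.convert_to_tensor(np.random.normal(size=(infer_size, 7, 7))).
  const int infer_size = 1;  // assumption: adjust to your batch size
  tensorflow::Tensor input(tensorflow::DT_FLOAT,
                           tensorflow::TensorShape({infer_size, 7, 7}));
  input.flat<float>().setRandom();

  // Equivalent of calling the serving signature; names are placeholders.
  std::vector<tensorflow::Tensor> outputs;
  status = bundle.session->Run({{"serving_default_input_1:0", input}},
                               {"StatefulPartitionedCall:0"}, {}, &outputs);
  if (!status.ok()) return 2;
  return 0;
}
```

Note that `convert_variables_to_constants_v2` has no direct C++ counterpart; running the loaded session directly as above is the usual approach.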

Hi,

Loading a TF-TRT model is the same as loading a standard TensorFlow saved_model: the C++ usage is identical to plain TensorFlow's C++ usage.

But since you have already applied TensorRT acceleration, it's recommended to convert the model into pure TensorRT, since it is better optimized for Jetson.

Below is an example for your reference:

Thanks.

@AastaLLL hi,

So the TrtGraphConverterV2 converts a saved_model to some 'optimized' format that needs to be converted again to the UFF format, and only then can it be loaded by C++ code?

Please refer to the Developer Guide :: NVIDIA Deep Learning TensorRT Documentation:

Q: When will TensorRT support layer XYZ required by my network in the UFF parser?
A: UFF is deprecated. We recommend users switch their workflows to ONNX. The TensorRT ONNX parser is an open source project.

Hi,

These are two different methods: TF-TRT and pure TensorRT.

In TF-TRT, there is an option to apply the TensorRT optimization, and the C++ and Python interfaces should be very similar.

In pure TensorRT, you will need to convert the model into UFF (TensorFlow v1.15.x) or ONNX (TensorFlow v2.x), and then feed it into TensorRT to generate the engine.
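To illustrate the pure-TensorRT path: after exporting the model to ONNX (e.g. with the tf2onnx tool), a sketch of parsing it and serializing an engine with the TensorRT 8.x C++ API could look like the following. The file names are placeholders.

```cpp
#include <cstdint>
#include <fstream>
#include <memory>

#include "NvInfer.h"
#include "NvOnnxParser.h"

// Minimal logger required by the TensorRT builder.
class Logger : public nvinfer1::ILogger {
  void log(Severity severity, const char* msg) noexcept override {
    // Forward warnings/errors to your own logging here.
  }
};

int main() {
  Logger logger;
  auto builder = std::unique_ptr<nvinfer1::IBuilder>(
      nvinfer1::createInferBuilder(logger));
  const auto flags = 1U << static_cast<uint32_t>(
      nvinfer1::NetworkDefinitionCreationFlag::kEXPLICIT_BATCH);
  auto network = std::unique_ptr<nvinfer1::INetworkDefinition>(
      builder->createNetworkV2(flags));
  auto parser = std::unique_ptr<nvonnxparser::IParser>(
      nvonnxparser::createParser(*network, logger));

  // Parse the ONNX model exported from the saved_model.
  if (!parser->parseFromFile(
          "model.onnx",
          static_cast<int>(nvinfer1::ILogger::Severity::kWARNING)))
    return 1;

  // Build and serialize the engine for this specific GPU.
  auto config = std::unique_ptr<nvinfer1::IBuilderConfig>(
      builder->createBuilderConfig());
  auto serialized = std::unique_ptr<nvinfer1::IHostMemory>(
      builder->buildSerializedNetwork(*network, *config));
  if (!serialized) return 2;

  std::ofstream out("model.plan", std::ios::binary);
  out.write(static_cast<const char*>(serialized->data()),
            serialized->size());
  return 0;
}
```

In practice the bundled `trtexec` tool does the same in one command, e.g. `trtexec --onnx=model.onnx --saveEngine=model.plan`.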

Since TensorRT does the optimization based on the hardware information, the engine (both TF-TRT and pure TensorRT) is strongly hardware-dependent and cannot be used cross-platform.

Thanks.