Cannot achieve optimization with TensorRT for ANN models on Jetson Nano

I am trying to build a simple TensorFlow model and run it on a Jetson Nano using TensorRT. I am using the following script to export the frozen graph and create the TensorRT graph.

import numpy as np
import tensorflow as tf
from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2
from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Convert the Keras model to a ConcreteFunction
full_model = tf.function(lambda x: model(x))
full_model = full_model.get_concrete_function(
    x=tf.TensorSpec(model.inputs[0].shape, model.inputs[0].dtype))

# Get a frozen ConcreteFunction
frozen_func = convert_variables_to_constants_v2(full_model)
frozen_func.graph.as_graph_def()

# Inspect the operations inside the frozen graph definition
# and note the names of its input and output tensors.
layers = [op.name for op in frozen_func.graph.get_operations()]
print("-" * 50)
print("Frozen model layers: ")
for layer in layers:
    print(layer)

print("-" * 50)
print("Frozen model inputs: ")
print(frozen_func.inputs)
print("Frozen model outputs: ")
print(frozen_func.outputs)

# Save the frozen graph from the frozen ConcreteFunction to disk:
# serialize the frozen graph and its text representation.
tf.io.write_graph(graph_or_graph_def=frozen_func.graph,
                  logdir="./frozen_models",
                  name="simple_frozen_graph.pb",
                  as_text=False)

# Optional: also write the text representation
tf.io.write_graph(graph_or_graph_def=frozen_func.graph,
                  logdir="./frozen_models",
                  name="simple_frozen_graph.pbtxt",
                  as_text=True)

frozen_graph = "./frozen_models/simple_frozen_graph.pb"

input_names = ['x']
output_names = ['Identity']

import google.protobuf.text_format

with open('./frozen_models/simple_frozen_graph.pbtxt', 'r') as f:
    frozen_graph_gd = google.protobuf.text_format.Parse(f.read(), tf.compat.v1.GraphDef())
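
Equivalently, the binary .pb written above can be parsed directly, avoiding the text round-trip; a minimal alternative sketch:

# Alternative: parse the binary .pb directly instead of the .pbtxt.
with open(frozen_graph, 'rb') as f:
    frozen_graph_gd = tf.compat.v1.GraphDef()
    frozen_graph_gd.ParseFromString(f.read())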

trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph_gd,  # pass the parsed GraphDef, not the file path
    outputs=output_names,
    max_batch_size=10,
    max_workspace_size_bytes=1 << 25,
    precision_mode='FP16',
)
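
I then run the converted GraphDef through a TF1-compat session. A minimal sketch (the tensor names "x:0" and "Identity:0" follow input_names/output_names above and may differ for another model; the input shape is illustrative):

import numpy as np

with tf.compat.v1.Graph().as_default() as g:
    # Import the TRT-converted GraphDef and look up I/O tensors by name.
    tf.import_graph_def(trt_graph, name="")
    x_t = g.get_tensor_by_name("x:0")
    y_t = g.get_tensor_by_name("Identity:0")
    with tf.compat.v1.Session(graph=g) as sess:
        # Illustrative input shape; replace with the model's real input shape.
        out = sess.run(y_t, feed_dict={x_t: np.random.rand(10, 28, 28).astype(np.float32)})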

I cannot see any difference in execution time between the TRT-optimized model and the original frozen graph.
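
For reference, the timings come from a simple loop like the following (a minimal sketch; the input shape and iteration counts are illustrative, and the same loop with the session-based call above is used for the TRT graph):

import time
import numpy as np

# Illustrative input; use the model's real input shape and dtype.
x = tf.constant(np.random.rand(10, 28, 28).astype(np.float32))

# Warm-up runs so one-time graph setup is not measured.
for _ in range(10):
    frozen_func(x)

start = time.perf_counter()
for _ in range(100):
    frozen_func(x)
print("Mean latency: %.2f ms" % ((time.perf_counter() - start) / 100 * 1e3))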


There are 5 ops of 3 different types in the graph that are not converted to TensorRT: Identity, NoOp, Placeholder, (For more information see Accelerating Inference In TF-TRT User Guide :: NVIDIA Deep Learning Frameworks Documentation).
2022-04-05 12:01:58.382227: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:647] Number of TensorRT candidate segments: 0
2022-04-05 12:01:58.385966: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:822] Optimization results for grappler item: tf_graph


What concerns me is that the number of TensorRT candidate segments is 0. Does this mean that no optimization occurs?
Is it possible to achieve TRT optimization and acceleration given that my network contains these ops?

Hi,

You can find the supported operations for TF-TRT below.

If the improvement from TF-TRT is not good, you can try converting the model to a standalone TensorRT engine.
Below is a sample for a TensorFlow v2 model for your reference:
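
For instance, the TF2-style TF-TRT conversion looks roughly like this (a sketch only; the paths are placeholders, and the exact TrtConversionParams spelling varies across TensorFlow versions):

import tensorflow as tf
from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Export the Keras model as a SavedModel first ("./saved_model" is a placeholder).
model.save("./saved_model")

# Convert the SavedModel with the TF2 TF-TRT API.
converter = trt.TrtGraphConverterV2(
    input_saved_model_dir="./saved_model",
    conversion_params=trt.TrtConversionParams(precision_mode="FP16"))
converter.convert()
converter.save("./saved_model_trt")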

Thanks.
