Converting TF Model to TensorRT Engine

import keras
import keras.backend as K
import tensorflow as tf
import uff
import numpy as np
import tensorrt as trt
from tensorrt.parsers import uffparser
import pycuda.driver as cuda
import pycuda.autoinit

output_names = ['predictions/Softmax']
frozen_graph_filename = 'frozen_inference_graph.pb'
sess = K.get_session()

# freeze graph and remove training nodes
graph_def = tf.graph_util.convert_variables_to_constants(sess, sess.graph_def, output_names)
graph_def = tf.graph_util.remove_training_nodes(graph_def)

# write frozen graph to file
with open(frozen_graph_filename, 'wb') as f:
    f.write(graph_def.SerializeToString())

# convert frozen graph to uff and build the TensorRT engine
uff_model = uff.from_tensorflow_frozen_model(frozen_graph_filename, output_names)
G_LOGGER = trt.infer.ConsoleLogger(trt.infer.LogSeverity.ERROR)
parser = uffparser.create_uff_parser()
parser.register_input("Placeholder", (1, 28, 28), 0)
parser.register_output("fc2/Relu")
engine = trt.utils.uff_to_trt_engine(G_LOGGER, uff_model, parser, 1, 1 << 20)
parser.destroy()
runtime = trt.infer.create_infer_runtime(G_LOGGER)
context = engine.create_execution_context()
output = np.empty(10, dtype=np.float32)

# allocate device memory; img is the preprocessed input image (not shown in this snippet)
d_input = cuda.mem_alloc(1 * img.nbytes)
d_output = cuda.mem_alloc(1 * output.nbytes)

bindings = [int(d_input), int(d_output)]
stream = cuda.Stream()

# transfer input data to the device
cuda.memcpy_htod_async(d_input, img, stream)

# execute the model
context.enqueue(1, bindings, stream.handle, None)

# transfer predictions back to the host
cuda.memcpy_dtoh_async(output, d_output, stream)

# synchronize the stream before reading the results
stream.synchronize()

print("Test Case: " + str(label))  # label: ground-truth label for the test image (not shown)
print("Prediction: " + str(np.argmax(output)))
trt.utils.write_engine_to_file("./tf_mnist.engine", engine.serialize())
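
For completeness, the engine written to ./tf_mnist.engine above can be loaded back later, so the TF-to-UFF conversion does not have to be repeated. This is only a minimal sketch, assuming the same legacy TensorRT 3/4 Python API and that trt.utils.load_engine is the counterpart of write_engine_to_file:

# Minimal sketch (assumption: legacy TRT 3/4 Python API with trt.utils.load_engine)
import tensorrt as trt

G_LOGGER = trt.infer.ConsoleLogger(trt.infer.LogSeverity.ERROR)
engine = trt.utils.load_engine(G_LOGGER, "./tf_mnist.engine")
context = engine.create_execution_context()
# ... allocate buffers and call context.enqueue(...) exactly as above ...
context.destroy()
engine.destroy()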

List of packages:
Linux: 16
CUDA: 9.0
TensorRT: 4
Python: 3.5
GPU: GTX 1080
TensorFlow: 1.10.1

Error

Traceback (most recent call last):
  File "bbb.py", line 11, in
    graph_def = tf.graph_util.convert_variables_to_constants(sess, sess.graph_def, output_names)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/graph_util_impl.py", line 232, in convert_variables_to_constants
    inference_graph = extract_sub_graph(input_graph_def, output_node_names)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/graph_util_impl.py", line 174, in extract_sub_graph
    _assert_nodes_are_present(name_to_node, dest_nodes)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/graph_util_impl.py", line 133, in _assert_nodes_are_present
    assert d in name_to_node, "%s is not in graph" % d
AssertionError: predictions/Softmax is not in graph

Or, please suggest code that converts a TensorFlow frozen graph (frozen_inference_graph.pb) to a TRT engine for an object detection task.
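
For reference, the assertion above simply means that no node named predictions/Softmax exists in the graph being frozen. A minimal way to see which node names the graph actually contains is sketched below; the 'softmax'/'predictions' filter strings and the frozen_inference_graph.pb path are only illustrative assumptions:

# Minimal sketch: list candidate output node names in the live Keras/TF session graph
import keras.backend as K
import tensorflow as tf

sess = K.get_session()
for node in sess.graph_def.node:
    # the filter strings are just a guess at likely output names
    if 'softmax' in node.name.lower() or 'predictions' in node.name.lower():
        print(node.name)

# The same idea works for an already-frozen graph, e.g. an object detection export:
graph_def = tf.GraphDef()
with open('frozen_inference_graph.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())
print([n.name for n in graph_def.node][-10:])  # the last few nodes are often the outputs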

Hello,

It looks like you are using TF 1.10.1. Unfortunately, TRT 4 doesn't support TF > 1.10. Could you use TF 1.9 to generate the frozen graph and try again?
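
As a quick sanity check that the interpreter really picks up TF 1.9 before freezing, a minimal sketch:

# Sanity check: confirm the TensorFlow version before running the freeze/UFF steps,
# since the reply above says TRT 4's tooling does not support TF newer than 1.10.
import tensorflow as tf

print(tf.__version__)
assert tf.__version__.startswith('1.9'), 'expected TensorFlow 1.9.x for this workflow'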

Hi @NVES, I tried with TF 1.9 but it doesn't work.
Can you please suggest a script that converts a frozen graph to a TRT engine?
List of packages:
Linux: 16
CUDA: 9.0
TensorRT: 4
Python: 3.5
GPU: GTX 1080
TensorFlow: 1.9

To convert a TF frozen graph to a TRT engine, please reference:

https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/python_api/workflows/tf_to_tensorrt.html#Converting-the-TensorFlow-Model-to-UFF

@NVES uff_model = uff.from_tensorflow(tf_model, ["fc2/Relu"])

In this line, what does ["fc2/Relu"] mean?

@NVES What should I define for this parameter in my case?
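
For context, ["fc2/Relu"] in the documentation example is the list of output node names of the sample MNIST network (a fully connected layer named fc2 followed by a ReLU). For your own model you would pass your model's actual output node name(s). A minimal sketch for a Keras model, where my_model.h5 is only a placeholder path:

# Minimal sketch: read the output node name(s) off a Keras model
# and use them in place of "fc2/Relu".  'my_model.h5' is a placeholder.
import keras

model = keras.models.load_model('my_model.h5')
output_names = [out.op.name for out in model.outputs]
print(output_names)  # pass this list to uff.from_tensorflow / parser.register_output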

@NVES, there is no code in that documentation for converting third-party models to TRT format. I cannot find any on the internet either. The best I came across is this video (second half), "How to accelerate your neural net inference with TensorRT", Dmitry Korobchenko - YouTube, which has a corresponding GitHub repo https://github.com/dkorobchenko-nv/tensorrt-demo, but the code is still not straightforward and possibly out of date.

@NVES Help!