Convert a retrained ssd-inception-v2 TensorFlow model to a TensorRT model.
Conversion and inference are done on a TX2; training is done on a laptop.
I took the "ssd-inception-v2" model from the TensorFlow model zoo and retrained it, and now I want to convert it to TRT.
The problem is that the converted model was no faster than the original, since the input dimensions are undefined (?, ?, ?, 3).
I tried to set the input to a fixed size using the code below, but I get the error shown below.
Help with what to change, and how, is appreciated.
The error:
ValueError: node 'image_tensor' in input_map does not exist in graph (input_map entry: image_tensor:0->image_tensor:0)
The code:
import tensorflow as tf
from tensorflow.python.compiler.tensorrt import trt_convert as trt

with tf.gfile.GFile(frozen_graph_filename, "rb") as file_handle:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(file_handle.read())

new_input = tf.placeholder(dtype=tf.uint8, shape=[1, 320, 320, 3], name='image_tensor')

with tf.Graph().as_default() as frozen_graph:
    # tf.import_graph_def(graph_def, name='')  # <-- this works as expected
    tf.import_graph_def(graph_def, input_map={'image_tensor:0': new_input})

# convert to TRT:
model_out = ['detection_classes', 'num_detections', 'detection_boxes', 'detection_scores']
trt_graph = trt.create_inference_graph(
    input_graph_def=graph_def,
    outputs=model_out,
    max_batch_size=1,
    max_workspace_size_bytes=max_work_space,   # defined elsewhere in my script
    precision_mode=tensorRT_precision,         # defined elsewhere in my script
    is_dynamic_op=False)
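As a sanity check, the placeholder nodes in the loaded GraphDef can be listed to confirm the input's name and shape (a minimal sketch, not part of the original script):

for node in graph_def.node:
    # Frozen object-detection graphs normally expose a single 'image_tensor' placeholder.
    if node.op == 'Placeholder':
        print(node.name, node.attr['shape'])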
Could you please share your script and model file so we can help better?
Also, provide details on the platform you are using:
- Linux distro and version
- GPU type
- NVIDIA driver version
- CUDA version
- cuDNN version
- Python version [if using Python]
- TensorFlow and PyTorch version
- TensorRT version
The model is, as mentioned, ssd-inception-v2 (for object detection), taken from the TensorFlow model zoo.
Retraining follows the TensorFlow examples, and the conversion to TensorRT is as above.
with tf.gfile.GFile(frozen_graph_filename, "rb") as file_handle:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(file_handle.read())

new_input = tf.placeholder(dtype=tf.uint8, shape=[None, 320, 320, 3], name='image_tensor')
tf.import_graph_def(graph_def, input_map={'image_tensor:0': new_input})
I get:
ValueError: NodeDef mentions attr 'shape' not in Op<name=Cast; signature=x:SrcT -> y:DstT; attr=SrcT:type; attr=DstT:type; attr=Truncate:bool,default=false>; NodeDef: {{node import/Cast}}. (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).
This is running on the laptop, with the same TF version (1.14.0).
Could you please try the code below? It seems to work: the placeholder is created in the default graph and the frozen graph is imported into that same graph, so the input_map entry can be resolved.
import tensorflow as tf

frozen_graph_filename = "frozen_inference_graph.pb"

with tf.gfile.GFile(frozen_graph_filename, "rb") as file_handle:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(file_handle.read())

new_input = tf.placeholder(dtype=tf.uint8, shape=[1, 320, 320, 3], name='image_tensor')

# with tf.Graph().as_default() as frozen_graph:   # <--- commented this code
#     tf.import_graph_def(graph_def, name='')     # <-- this works as expected
tf.import_graph_def(graph_def, input_map={'image_tensor:0': new_input})
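To verify the remap, the new input node can be printed from the default graph (a short sketch, not shown in the original reply):

for node in tf.get_default_graph().as_graph_def().node:
    # The placeholder we created lives at the top level, without the 'import/' prefix.
    if node.name == 'image_tensor':
        print(node)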
Updated Input Layer:

name: "image_tensor"
op: "Placeholder"
attr {
  key: "dtype"
  value {
    type: DT_UINT8
  }
}
attr {
  key: "shape"
  value {
    shape {
      dim {
        size: 1
      }
      dim {
        size: 320
      }
      dim {
        size: 320
      }
      dim {
        size: 3
      }
    }
  }
}
Was able to successfully generate a TRT engine with the default max_workspace_size_bytes and precision_mode.
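The exact conversion call isn't shown here; a minimal sketch with the TF 1.14 TF-TRT API, using the output names from the earlier code (prefixed with the 'import/' scope that tf.import_graph_def adds by default), might look like:

from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Serialize the graph that now carries the fixed [1, 320, 320, 3] input.
fixed_graph_def = tf.get_default_graph().as_graph_def()

trt_graph = trt.create_inference_graph(
    input_graph_def=fixed_graph_def,
    outputs=['import/detection_classes', 'import/num_detections',
             'import/detection_boxes', 'import/detection_scores'],
    max_batch_size=1)  # default max_workspace_size_bytes and precision_mode

The conversion log: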
TensorRT model is successfully stored!
numb. of trt_engine_nodes in TensorRT graph: 4
numb. of all_nodes in TensorRT graph: 6023
Thanks for the support and the effort, but that didn't work either (comment #4). However, I was able to export the model to a frozen graph with the input dimensions fixed in another way, and inference behaves as expected. The suggested code still raises:
tensorflow.python.framework.errors_impl.InvalidArgumentError: NodeDef mentions attr 'shape' not in Op<name=Cast; signature=x:SrcT -> y:DstT; attr=SrcT:type; attr=DstT:type; attr=Truncate:bool,default=false>; NodeDef: {{node Cast}}. (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.)
Does this indicate a mismatch between the TensorFlow versions on the laptop and the TX2, or could it be something else?
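One quick check (an illustrative sketch, not from the original thread) is to compare the GraphDef's recorded producer version against the runtime, since the error message points at a generator/interpreter mismatch:

import tensorflow as tf

with tf.gfile.GFile("frozen_inference_graph.pb", "rb") as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

# A frozen GraphDef records the graph version of the TF build that wrote it;
# a producer newer than the consumer's supported range suggests a version gap.
print("graph producer version:", graph_def.versions.producer)
print("runtime graph version:", tf.GRAPH_DEF_VERSION)
print("runtime TF version:", tf.VERSION)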