Hi,
Thanks for your response.
The backend I’ve been using is TensorFlow, so you could say I use ‘Keras on TensorFlow’.
I can share the .pb file, but I use some custom objects (mainly the loss, accuracy metric and optimizer) that are not available by default. I can also share the code if needed; just let me know.
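For example, this is roughly how the model has to be loaded so that Keras can resolve those objects (the names below are placeholders, not my actual implementations):

from keras.models import load_model
import keras.backend as K

# Placeholder standing in for my real custom loss.
def custom_loss(y_true, y_pred):
    return K.mean(K.square(y_pred - y_true), axis=-1)

model = load_model('model.h5', custom_objects={'custom_loss': custom_loss})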
The error I get differs depending on whether I try to convert directly from Keras to TRT, or go from Keras to a TensorFlow frozen graph, then to UFF, then to TRT.
If I try to convert the frozen graph to UFF format using convert_to_uff.py, the error comes from the ConvLSTM2D layer:
uff.model.exceptions.UffException: Const node conversion requested, but node is not Const
name: "conv_lst_m2d_1/while/BiasAdd_2/Enter"
op: "Enter"
input: "conv_lst_m2d_1/strided_slice_10"
attr {
key: "T"
value {
type: DT_FLOAT
}
}
attr {
key: "frame_name"
value {
s: "conv_lst_m2d_1/while/while_context"
}
}
attr {
key: "is_constant"
value {
b: true
}
}
attr {
key: "parallel_iterations"
value {
i: 32
}
}
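For completeness, the same conversion expressed through the uff Python API (which, as far as I can tell, is what convert_to_uff.py wraps) looks roughly like this; the path and node name are placeholders:

import uff

uff_model = uff.from_tensorflow_frozen_model(
    'frozen_model.pb',
    output_nodes=['output_node_name'],  # placeholder output node name
    output_filename='model.uff'
)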
I somehow managed to convert it to TRT using this code:
import tensorflow as tf
import tensorflow.contrib.tensorrt as trt
from tensorflow.python.framework import graph_io

def freeze_graph(graph, session, output, save_pb_dir='.', save_pb_name='frozen_model.pb', save_pb_as_text=False):
    with graph.as_default():
        # Strip training-only nodes and bake the variables into constants.
        graphdef_inf = tf.graph_util.remove_training_nodes(graph.as_graph_def())
        graphdef_frozen = tf.graph_util.convert_variables_to_constants(session, graphdef_inf, output)
        graph_io.write_graph(graphdef_frozen, save_pb_dir, save_pb_name, as_text=save_pb_as_text)
        return graphdef_frozen

# model, session, init_op and save_pb_dir come from my Keras model setup (not shown).
session.run(init_op)

input_names = [t.op.name for t in model.inputs]
output_names = [t.op.name for t in model.outputs]
# Print the input and output node names; I take note of them for inference later.
print(input_names, output_names)

frozen_graph = freeze_graph(session.graph, session, output_names, save_pb_dir=save_pb_dir)

# Build a TF-TRT optimized graph from the frozen graph.
trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,
    outputs=output_names,
    max_batch_size=1,
    max_workspace_size_bytes=1 << 25,
    precision_mode='FP16',
    minimum_segment_size=50
)
graph_io.write_graph(trt_graph, "./models/", "trt_graph.pb", as_text=False)
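To check whether TF-TRT actually replaced any subgraphs with TensorRT engines (which is what should make inference faster), I can count the TRTEngineOp nodes in the converted GraphDef; a quick sketch:

# If this prints 0, nothing was offloaded to TensorRT and the converted graph
# still runs entirely as plain TensorFlow ops.
num_trt_ops = len([n for n in trt_graph.node if n.op == 'TRTEngineOp'])
print('TRTEngineOp nodes:', num_trt_ops)
print('Total nodes:', len(trt_graph.node))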
But when I load the new “trt_graph.pb” and try to predict with this:
import tensorflow as tf
from keras import backend as K

def get_frozen_graph(graph_file):
    """Read a frozen graph (.pb) file from disk."""
    with tf.gfile.FastGFile(graph_file, "rb") as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
    return graph_def

trt_graph = get_frozen_graph('./models/trt_graph.pb')

K.set_learning_phase(False)
tf_config = tf.ConfigProto()
tf_config.gpu_options.allow_growth = True
tf_sess = tf.Session(config=tf_config)
tf.import_graph_def(trt_graph, name='')

# input_tensor_name / output_tensor_name are the node names printed while
# freezing, with ':0' appended; image_sample is a preprocessed input frame.
output_tensor = tf_sess.graph.get_tensor_by_name(output_tensor_name)
feed_dict = {
    input_tensor_name: image_sample
}
predicted_mask = tf_sess.run(output_tensor, feed_dict)
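For timing, I’m doing nothing fancier than a wall-clock loop roughly like this, skipping the first call since it pays the one-time setup cost:

import time

# Warm-up run (the first call includes graph/engine initialization).
tf_sess.run(output_tensor, feed_dict)

n_runs = 10
start = time.time()
for _ in range(n_runs):
    tf_sess.run(output_tensor, feed_dict)
print('Average time per frame: %.3f s' % ((time.time() - start) / n_runs))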
The RAM usage rises to 3.7 GB plus 3 GB of swap, for a total of almost 7 GB(!), the model takes around 3 minutes to load, and the inference time stays at around 2 seconds per frame.
I don’t know whether my model is simply too complex for the Jetson Nano to export it as UFF (because of the ConvLSTM2D layers), but what I don’t understand at all is why, after converting it to TRT, the inference time is unchanged while the RAM usage increases by 2 more GB.
Any kind of help would be appreciated, even if the answer is that the model is just too complex.
Thanks!