Hello, I converted a TensorFlow model to a TF-TRT model like this:
from tensorflow.python.compiler.tensorrt import trt_convert as trt  # TF 1.14+ import path for TF-TRT

your_outputs = ["out_soft_2/truediv:0"]

# convert (optimize) the frozen graph to a TensorRT-optimized graph
converter = trt.TrtGraphConverter(
    input_graph_def=frozen_graph,
    nodes_blacklist=your_outputs,  # output nodes
    max_batch_size=10,
    is_dynamic_op=True,            # build engines at runtime for the actual input shapes
    max_workspace_size_bytes=trt.DEFAULT_TRT_MAX_WORKSPACE_SIZE_BYTES,
    precision_mode=trt.TrtPrecisionMode.FP32,
    minimum_segment_size=1,
    maximum_cached_engines=100)
trt_graph = converter.convert()

with open("/home/user/tensor/test/phone001.trt.pb", "wb") as f:
    f.write(trt_graph.SerializeToString())
I then ran inference with this TensorRT model and compared it against my original TensorFlow .pb model, using the same input size for both.
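For reference, this is roughly how I load the serialized graph and run inference (a minimal sketch; "input:0" stands in for my actual input placeholder name):

import numpy as np
import tensorflow as tf

# load the serialized TensorRT-optimized GraphDef
trt_graph_def = tf.compat.v1.GraphDef()
with open("/home/user/tensor/test/phone001.trt.pb", "rb") as f:
    trt_graph_def.ParseFromString(f.read())

graph = tf.Graph()
with graph.as_default():
    tf.compat.v1.import_graph_def(trt_graph_def, name="")

sess = tf.compat.v1.Session(graph=graph)
batch = np.random.rand(1, 64, 64, 3).astype(np.float32)
# "input:0" is a placeholder name here; substitute the real input tensor name
out = sess.run("out_soft_2/truediv:0", feed_dict={"input:0": batch})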
My input has a variable shape (1, 64, ?, 3). If I set the input size to (1, 64, 64, 3), the TensorRT inference speed is good, close to 2x the speed of the TensorFlow .pb model.
But when I change the input size to (1, 64, 1024, 3), the TensorRT inference speed is only about 0.9x that of the TensorFlow .pb model, i.e. it is actually slower than plain TensorFlow.
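The timing comparison is a simple loop like the one below (a sketch; the warm-up runs are there so that engine-build time is not counted, since with is_dynamic_op=True a new engine is built the first time a given input shape is seen):

import time
import numpy as np

def benchmark(sess, input_name, output_name, shape, n_runs=100):
    batch = np.random.rand(*shape).astype(np.float32)
    # warm up: exclude engine building / first-run overhead for this shape
    for _ in range(5):
        sess.run(output_name, feed_dict={input_name: batch})
    start = time.time()
    for _ in range(n_runs):
        sess.run(output_name, feed_dict={input_name: batch})
    return (time.time() - start) / n_runs

print(benchmark(sess, "input:0", "out_soft_2/truediv:0", (1, 64, 64, 3)))
print(benchmark(sess, "input:0", "out_soft_2/truediv:0", (1, 64, 1024, 3)))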
I would like to know why this happens. My model is a CRNN for an OCR application.
I understand that TF-TRT gives almost no benefit for RNN layers; is that the reason?
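To check how much of the graph TF-TRT actually converted (and which ops, e.g. the recurrent part, stayed in native TensorFlow), I can inspect the converted GraphDef like this:

# trt_graph is the GraphDef returned by converter.convert()
engine_nodes = [n for n in trt_graph.node if n.op == "TRTEngineOp"]
other_ops = sorted({n.op for n in trt_graph.node if n.op != "TRTEngineOp"})
print("number of TRTEngineOp nodes:", len(engine_nodes))
print("op types left in native TensorFlow:", other_ops)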
Thanks for your help.