TensorRT does not reduce running time?

Hi All,

I tried to use @JerryJiaGit program to execute, but the result I got is as follows
https://github.com/JerryJiaGit/facenet_trt

Pre-trained models:20180402-114759.pb
#TRT
MTCNN_Detected_time: 0.12405449599998519
emb_array_time = 0.11907808000000841
#Original
MTCNN_Detected_time: 0.1231230719999985
emb_array_time = 0.09276342399999749

#Modify content
trt_graph = trt.create_inference_graph(input_graph_def=graph_def,
outputs=['embeddings:0'],
max_batch_size=128,
max_workspace_size_bytes=2 << 30,  # 2 GB assigned to TRT
precision_mode="FP16",  # "FP32", "FP16", or "INT8"
minimum_segment_size=1
)
#trt_graph=trt.calib_graph_to_infer_graph(trt_graph)
tf.import_graph_def(trt_graph, input_map=input_map, name='')
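A side note on the workspace-size argument: the bit-shift arithmetic is easy to get wrong. `2 << 30` bytes is 2 GiB, whereas `2 << 20` is only 2 MiB, and a workspace that small may prevent TensorRT from building engines at all. A minimal sanity check:

```python
# Verify the byte counts behind the shift expressions used for
# max_workspace_size_bytes.
MiB = 1024 ** 2
GiB = 1024 ** 3

print(2 << 20)  # 2 MiB = 2097152 bytes
print(2 << 30)  # 2 GiB = 2147483648 bytes

assert 2 << 20 == 2 * MiB
assert 2 << 30 == 2 * GiB
```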

TensorRT does not increase efficiency and spends a lot of time creating new tensors.
May I ask what might be the reason?

Thanks a lot for the help
Nick

Is it the same issue JerryJia mentioned in https://devtalk.nvidia.com/default/topic/1045679/ ?

Hi vickyy,

Thank you for your reply.
I have tried both methods and I have also done a comparison.

original network: 0.062 sec
tensorrt network FP16(frozen meta graph and checkpoint): 0.063 sec
tensorrt network FP16(SavedModel): 0.073 sec

My results are actually worse.

I confirmed that the pb model is correct.
I am running face feature extraction on an Nvidia Xavier; JerryJia used a GV100 with Tensor Cores.
My TensorRT version is 5.0.3.

May I ask what might be the reason?

Thanks a lot for the help
Nick

Hi,

TF-TRT falls back to the TensorFlow implementation if a layer is not supported.
Could you first profile how many layers in your model are accelerated by TensorRT?

If the ratio is too small, the overhead of switching frameworks may even decrease performance,
e.g. TF -> TRT -> TF -> TRT -> TF -> TRT -> TF -> TRT -> TF
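One way to estimate that ratio is to count how many nodes in the converted GraphDef are `TRTEngineOp` nodes versus plain TensorFlow ops. A minimal sketch of the counting logic, assuming you extract the op types with `[n.op for n in trt_graph.node]` from the graph returned by `trt.create_inference_graph()` (the helper name `trt_coverage` is mine, not a TF-TRT API):

```python
def trt_coverage(op_types):
    """Given the op types of a converted GraphDef, return
    (number of TRTEngineOp nodes, fraction of all nodes they represent)."""
    total = len(op_types)
    trt_nodes = sum(1 for op in op_types if op == "TRTEngineOp")
    return trt_nodes, (trt_nodes / total if total else 0.0)

# Hypothetical op-type list for illustration; in practice pass
# [n.op for n in trt_graph.node].
ops = ["Placeholder", "TRTEngineOp", "Conv2D", "Relu", "TRTEngineOp"]
count, ratio = trt_coverage(ops)
print(count, ratio)  # 2 engine nodes, 0.4 of all nodes
```

Keep in mind that each `TRTEngineOp` replaces a whole segment of original ops, so a few engines absorbing many nodes is the good case; many tiny engines means the graph bounces between TF and TRT frequently.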

Thanks.

Hi vickyy,

Thank you for your reply.
Is there a tool or method that lets me see which layers were converted?
Or does the message printed in the terminal during conversion show this?

Thanks a lot for the help
Nick

Hi,

Could you check if this configuration works?

sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))

Thanks.