TensorRT has no effect on the ssd_mobilenet_v1_fpn_coco model

When I use TensorRT to accelerate the ssd_mobilenet_v1_fpn_coco model, it has no effect; inference is actually slower than without TensorRT.

RetinaNet mobile, without TensorRT:
Iteration: 0.430 sec
Iteration: 0.421 sec
Iteration: 0.420 sec
Iteration: 0.427 sec
Iteration: 0.439 sec
Iteration: 0.427 sec
Iteration: 0.411 sec
Iteration: 0.424 sec
Iteration: 0.432 sec
Iteration: 0.429 sec
Iteration: 0.413 sec
Iteration: 0.424 sec
Iteration: 0.424 sec
Iteration: 0.428 sec
Iteration: 0.427 sec
Iteration: 0.431 sec
Iteration: 0.417 sec
Iteration: 0.418 sec
With TensorRT (seconds per iteration):
0.505087852478
0.504916906357
0.501970052719
0.505352973938
0.494786024094
0.498456954956
0.504287004471
0.50328707695
0.507141113281
0.499255895615
0.487679004669
0.489063978195
0.492527008057
0.503779172897
0.514405965805
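Averaging the two runs above quantifies the regression. A small script (values copied verbatim from the timings listed above):

```python
# Per-iteration times copied from the two benchmark runs above (seconds).
no_trt = [0.430, 0.421, 0.420, 0.427, 0.439, 0.427, 0.411, 0.424, 0.432,
          0.429, 0.413, 0.424, 0.424, 0.428, 0.427, 0.431, 0.417, 0.418]
with_trt = [0.505087852478, 0.504916906357, 0.501970052719, 0.505352973938,
            0.494786024094, 0.498456954956, 0.504287004471, 0.50328707695,
            0.507141113281, 0.499255895615, 0.487679004669, 0.489063978195,
            0.492527008057, 0.503779172897, 0.514405965805]

mean_no_trt = sum(no_trt) / len(no_trt)        # ~0.425 s/iter
mean_with_trt = sum(with_trt) / len(with_trt)  # ~0.501 s/iter
print("no TensorRT:   %.3f s/iter" % mean_no_trt)
print("with TensorRT: %.3f s/iter" % mean_with_trt)
```

So the "accelerated" graph is roughly 18% slower per iteration than plain TensorFlow.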

log:
retinanet v1
('config_path', './data/ssd_mobilenet_v1_fpn_shared_box_predictor_640x640_coco14_sync_2018_07_03/pipeline.config')
('checkpoint_path', './data/ssd_mobilenet_v1_fpn_shared_box_predictor_640x640_coco14_sync_2018_07_03/model.ckpt')
2018-09-03 09:24:44.137510: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:865] ARM64 does not support NUMA - returning NUMA node zero
2018-09-03 09:24:44.137784: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties:
name: NVIDIA Tegra X2 major: 6 minor: 2 memoryClockRate(GHz): 1.3005
pciBusID: 0000:00:00.0
totalMemory: 7.67GiB freeMemory: 4.45GiB
2018-09-03 09:24:44.137850: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2018-09-03 09:24:47.792908: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-09-03 09:24:47.793169: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929] 0
2018-09-03 09:24:47.793277: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0: N
2018-09-03 09:24:47.793573: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 2913 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
Converted 333 variables to const ops.
2018-09-03 09:26:07.919932: I tensorflow/core/grappler/devices.cc:51] Number of eligible GPUs (core count >= 8): 0
2018-09-03 09:26:16.036617: I tensorflow/contrib/tensorrt/convert/convert_graph.cc:383] MULTIPLE tensorrt candidate conversion: 4
2018-09-03 09:26:16.057791: W tensorflow/contrib/tensorrt/convert/convert_graph.cc:418] subgraph conversion error for subgraph_index:0 due to: "Unimplemented: Require 4 dimensional input. Got 0 const6" SKIPPING…( 108 nodes)
2018-09-03 09:26:16.064689: W tensorflow/contrib/tensorrt/convert/convert_graph.cc:418] subgraph conversion error for subgraph_index:1 due to: "Unimplemented: Require 4 dimensional input. Got 0 const6" SKIPPING…( 108 nodes)
2018-09-03 09:26:16.837392: W tensorflow/contrib/tensorrt/convert/convert_graph.cc:418] subgraph conversion error for subgraph_index:2 due to: "Invalid argument: Output node 'const6' is weights not tensor" SKIPPING…( 612 nodes)
2018-09-03 09:26:16.842941: W tensorflow/contrib/tensorrt/convert/convert_graph.cc:418] subgraph conversion error for subgraph_index:3 due to: "Unimplemented: Require 4 dimensional input. Got 1 Postprocessor/BatchMultiClassNonMaxSuppression/MultiClassNonMaxSuppression/zeros_like_47" SKIPPING…( 181 nodes)
['boxes', 'classes', 'scores']
2018-09-03 09:27:47.320719: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2018-09-03 09:27:47.320894: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-09-03 09:27:47.320933: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929] 0
2018-09-03 09:27:47.320967: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0: N
2018-09-03 09:27:47.321106: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 2913

Thanks!

Hi,

Not all TensorFlow layers are supported by TensorRT.
If a layer is not supported, it falls back to the TensorFlow implementation, and the data transfer between the TensorFlow and TensorRT segments can add overhead.

Could you first check whether each layer is executed by TensorFlow or by TensorRT?
Thanks.
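One way to check is to count the op types in the converted GraphDef: subgraphs that TensorRT converted show up as TRTEngineOp nodes, while skipped layers keep their original TensorFlow op types. A minimal sketch; the counting helper is generic Python, and the graph-loading part (commented out, with a hypothetical file name) assumes a TF 1.x frozen graph:

```python
from collections import Counter

def count_op_types(op_names):
    """Count how many nodes of each op type a graph contains.

    `op_names` is a list of op-type strings, e.g. the result of
    [node.op for node in graph_def.node] on a frozen TF 1.x GraphDef.
    """
    return Counter(op_names)

# With a real converted graph (TF 1.x; "trt_frozen_graph.pb" is a
# hypothetical file name):
#
#   import tensorflow as tf
#   graph_def = tf.GraphDef()
#   with open("trt_frozen_graph.pb", "rb") as f:
#       graph_def.ParseFromString(f.read())
#   counts = count_op_types([node.op for node in graph_def.node])
#
# If TensorRT converted anything, counts["TRTEngineOp"] > 0. In the log
# above all four candidate subgraphs were SKIPPED, so the converted graph
# would contain no TRTEngineOp nodes at all, which explains why there is
# no speedup.

# Illustrative stand-in op list (not from a real graph):
counts = count_op_types(["Conv2D", "Conv2D", "TRTEngineOp",
                         "NonMaxSuppressionV3"])
print(counts["TRTEngineOp"])  # number of TensorRT-converted segments
```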