Hi,
I just tested your model with trtexec, and TensorRT can run inference on it without issue.
Would you mind double-checking it on your side?
$ /usr/src/tensorrt/bin/trtexec --onnx=resnet50_ssd_onnx1.5.onnx
&&&& RUNNING TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --onnx=resnet50_ssd_onnx1.5.onnx
[09/02/2020-15:15:43] [I] === Model Options ===
[09/02/2020-15:15:43] [I] Format: ONNX
[09/02/2020-15:15:43] [I] Model: resnet50_ssd_onnx1.5.onnx
[09/02/2020-15:15:43] [I] Output:
[09/02/2020-15:15:43] [I] === Build Options ===
[09/02/2020-15:15:43] [I] Max batch: 1
[09/02/2020-15:15:43] [I] Workspace: 16 MB
[09/02/2020-15:15:43] [I] minTiming: 1
[09/02/2020-15:15:43] [I] avgTiming: 8
[09/02/2020-15:15:43] [I] Precision: FP32
[09/02/2020-15:15:43] [I] Calibration:
[09/02/2020-15:15:43] [I] Safe mode: Disabled
[09/02/2020-15:15:43] [I] Save engine:
[09/02/2020-15:15:43] [I] Load engine:
[09/02/2020-15:15:43] [I] Builder Cache: Enabled
[09/02/2020-15:15:43] [I] NVTX verbosity: 0
[09/02/2020-15:15:43] [I] Inputs format: fp32:CHW
[09/02/2020-15:15:43] [I] Outputs format: fp32:CHW
[09/02/2020-15:15:43] [I] Input build shapes: model
[09/02/2020-15:15:43] [I] Input calibration shapes: model
[09/02/2020-15:15:43] [I] === System Options ===
[09/02/2020-15:15:43] [I] Device: 0
[09/02/2020-15:15:43] [I] DLACore:
[09/02/2020-15:15:43] [I] Plugins:
[09/02/2020-15:15:43] [I] === Inference Options ===
[09/02/2020-15:15:43] [I] Batch: 1
[09/02/2020-15:15:43] [I] Input inference shapes: model
[09/02/2020-15:15:43] [I] Iterations: 10
[09/02/2020-15:15:43] [I] Duration: 3s (+ 200ms warm up)
[09/02/2020-15:15:43] [I] Sleep time: 0ms
[09/02/2020-15:15:43] [I] Streams: 1
[09/02/2020-15:15:43] [I] ExposeDMA: Disabled
[09/02/2020-15:15:43] [I] Spin-wait: Disabled
[09/02/2020-15:15:43] [I] Multithreading: Disabled
[09/02/2020-15:15:43] [I] CUDA Graph: Disabled
[09/02/2020-15:15:43] [I] Skip inference: Disabled
[09/02/2020-15:15:43] [I] Inputs:
[09/02/2020-15:15:43] [I] === Reporting Options ===
[09/02/2020-15:15:43] [I] Verbose: Disabled
[09/02/2020-15:15:43] [I] Averages: 10 inferences
[09/02/2020-15:15:43] [I] Percentile: 99
[09/02/2020-15:15:43] [I] Dump output: Disabled
[09/02/2020-15:15:43] [I] Profile: Disabled
[09/02/2020-15:15:43] [I] Export timing to JSON file:
[09/02/2020-15:15:43] [I] Export output to JSON file:
[09/02/2020-15:15:43] [I] Export profile to JSON file:
[09/02/2020-15:15:43] [I]
----------------------------------------------------------------
Input filename: resnet50_ssd_onnx1.5.onnx
ONNX IR version: 0.0.6
Opset version: 9
Producer name: pytorch
Producer version: 1.6
Domain:
Model version: 0
Doc string:
----------------------------------------------------------------
[09/02/2020-15:15:44] [W] [TRT] onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[09/02/2020-15:15:44] [I] [TRT]
[09/02/2020-15:15:44] [I] [TRT] --------------- Layers running on DLA:
[09/02/2020-15:15:44] [I] [TRT]
[09/02/2020-15:15:44] [I] [TRT] --------------- Layers running on GPU:
[09/02/2020-15:15:44] [I] [TRT] Conv_0 + Relu_2, MaxPool_3, Conv_4 + Relu_6, Conv_7 + Relu_9, Conv_10, Conv_12 + Add_14 + Relu_15, Conv_16 + Relu_18, Conv_19 + Relu_21, Conv_22 + Add_24 + Relu_25, Conv_26 + Relu_28, Conv_29 + Relu_31, Conv_32 + Add_34 + Relu_35, Conv_36 + Relu_38, Conv_39 + Relu_41, Conv_42, Conv_44 + Add_46 + Relu_47, Conv_48 + Relu_50, Conv_51 + Relu_53, Conv_54 + Add_56 + Relu_57, Conv_58 + Relu_60, Conv_61 + Relu_63, Conv_64 + Add_66 + Relu_67, Conv_68 + Relu_70, Conv_71 + Relu_73, Conv_74 + Add_76 + Relu_77, Conv_78 + Relu_80, Conv_81 + Relu_83, Conv_84, Conv_86 + Add_88 + Relu_89, Conv_90 + Relu_92, Conv_93 + Relu_95, Conv_96 + Add_98 + Relu_99, Conv_100 + Relu_102, Conv_103 + Relu_105, Conv_106 + Add_108 + Relu_109, Conv_110 + Relu_112, Conv_113 + Relu_115, Conv_116 + Add_118 + Relu_119, Conv_120 + Relu_122, Conv_123 + Relu_125, Conv_126 + Add_128 + Relu_129, Conv_130 + Relu_132, Conv_133 + Relu_135, Conv_136 + Add_138 + Relu_139, Conv_170 || Conv_177, Reshape_183, Reshape_176, Conv_140 + Relu_142, Conv_143 + Relu_145, Conv_184 || Conv_191, Reshape_197, Reshape_190, Conv_146 + Relu_148, Conv_149 + Relu_151, Conv_198 || Conv_205, Reshape_211, Reshape_204, Conv_152 + Relu_154, Conv_155 + Relu_157, Conv_212 || Conv_219, Reshape_225, Reshape_218, Conv_158 + Relu_160, Conv_161 + Relu_163, Conv_226 || Conv_233, Reshape_239, Reshape_232, Conv_164 + Relu_166, Conv_167 + Relu_169, Conv_240 || Conv_247, Reshape_253, 534 copy, 556 copy, 578 copy, 600 copy, 622 copy, 644 copy, Reshape_246, 523 copy, 545 copy, 567 copy, 589 copy, 611 copy, 633 copy,
[09/02/2020-15:15:54] [I] [TRT] Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
[09/02/2020-15:17:44] [I] [TRT] Detected 1 inputs and 14 output network tensors.
[09/02/2020-15:17:44] [I] Starting inference threads
[09/02/2020-15:17:48] [I] Warmup completed 4 queries over 200 ms
[09/02/2020-15:17:48] [I] Timing trace has 56 queries over 3.09555 s
[09/02/2020-15:17:48] [I] Trace averages of 10 runs:
[09/02/2020-15:17:48] [I] Average on 10 runs - GPU latency: 55.0649 ms - Host latency: 55.2661 ms (end to end 55.2778 ms, enqueue 2.28474 ms)
[09/02/2020-15:17:48] [I] Average on 10 runs - GPU latency: 55.0191 ms - Host latency: 55.2195 ms (end to end 55.2305 ms, enqueue 2.17393 ms)
[09/02/2020-15:17:48] [I] Average on 10 runs - GPU latency: 55.0887 ms - Host latency: 55.2896 ms (end to end 55.2999 ms, enqueue 2.37067 ms)
[09/02/2020-15:17:48] [I] Average on 10 runs - GPU latency: 55.0917 ms - Host latency: 55.2921 ms (end to end 55.3034 ms, enqueue 2.31514 ms)
[09/02/2020-15:17:48] [I] Average on 10 runs - GPU latency: 55.0704 ms - Host latency: 55.2717 ms (end to end 55.2831 ms, enqueue 2.17344 ms)
[09/02/2020-15:17:48] [I] Host Latency
[09/02/2020-15:17:48] [I] min: 55.1567 ms (end to end 55.168 ms)
[09/02/2020-15:17:48] [I] max: 55.5354 ms (end to end 55.5509 ms)
[09/02/2020-15:17:48] [I] mean: 55.2665 ms (end to end 55.2776 ms)
[09/02/2020-15:17:48] [I] median: 55.2511 ms (end to end 55.2613 ms)
[09/02/2020-15:17:48] [I] percentile: 55.5354 ms at 99% (end to end 55.5509 ms at 99%)
[09/02/2020-15:17:48] [I] throughput: 18.0905 qps
[09/02/2020-15:17:48] [I] walltime: 3.09555 s
[09/02/2020-15:17:48] [I] Enqueue Time
[09/02/2020-15:17:48] [I] min: 2.00232 ms
[09/02/2020-15:17:48] [I] max: 2.57129 ms
[09/02/2020-15:17:48] [I] median: 2.21167 ms
[09/02/2020-15:17:48] [I] GPU Compute
[09/02/2020-15:17:48] [I] min: 54.9579 ms
[09/02/2020-15:17:48] [I] max: 55.3339 ms
[09/02/2020-15:17:48] [I] mean: 55.0662 ms
[09/02/2020-15:17:48] [I] median: 55.0502 ms
[09/02/2020-15:17:48] [I] percentile: 55.3339 ms at 99%
[09/02/2020-15:17:48] [I] total compute time: 3.0837 s
&&&& PASSED TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --onnx=resnet50_ssd_onnx1.5.onnx
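One note: the log warns that some tactics did not have sufficient workspace memory, so you may be able to improve performance by rebuilding with a larger workspace (the default here is only 16 MB). A possible invocation, using the trtexec flags from the TensorRT 7.x tool (adjust for your version; `--fp16` and `--saveEngine` are optional extras):

```shell
# Rebuild with a larger builder workspace (value is in MiB) so more tactics
# can be considered; optionally enable FP16 and save the engine for reuse.
/usr/src/tensorrt/bin/trtexec --onnx=resnet50_ssd_onnx1.5.onnx \
    --workspace=1024 \
    --fp16 \
    --saveEngine=resnet50_ssd.engine
```

This is only a tuning suggestion; the model already builds and runs correctly as shown above.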
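For reference, the reported throughput is simply the query count divided by the timing-trace wall time, so you can cross-check the summary numbers yourself (values below are taken from the log above):

```python
# Recompute trtexec's reported throughput from the timing-trace summary.
queries = 56          # "Timing trace has 56 queries"
walltime_s = 3.09555  # "... over 3.09555 s"

throughput_qps = queries / walltime_s
print(f"{throughput_qps:.4f} qps")  # matches the reported 18.0905 qps
```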
Thanks.