Hi,
We just checked your newly uploaded model with trtexec.
The model runs successfully with TensorRT (JetPack 4.5.1).
Could you confirm whether it also works in your environment?
$ /usr/src/tensorrt/bin/trtexec --onnx=version-RFB-640.onnx
&&&& RUNNING TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --onnx=version-RFB-640.onnx
[08/24/2021-12:12:37] [I] === Model Options ===
[08/24/2021-12:12:37] [I] Format: ONNX
[08/24/2021-12:12:37] [I] Model: version-RFB-640.onnx
[08/24/2021-12:12:37] [I] Output:
[08/24/2021-12:12:37] [I] === Build Options ===
[08/24/2021-12:12:37] [I] Max batch: 1
[08/24/2021-12:12:37] [I] Workspace: 16 MB
[08/24/2021-12:12:37] [I] minTiming: 1
[08/24/2021-12:12:37] [I] avgTiming: 8
[08/24/2021-12:12:37] [I] Precision: FP32
[08/24/2021-12:12:37] [I] Calibration:
[08/24/2021-12:12:37] [I] Safe mode: Disabled
[08/24/2021-12:12:37] [I] Save engine:
[08/24/2021-12:12:37] [I] Load engine:
[08/24/2021-12:12:37] [I] Builder Cache: Enabled
[08/24/2021-12:12:37] [I] NVTX verbosity: 0
[08/24/2021-12:12:37] [I] Inputs format: fp32:CHW
[08/24/2021-12:12:37] [I] Outputs format: fp32:CHW
[08/24/2021-12:12:37] [I] Input build shapes: model
[08/24/2021-12:12:37] [I] Input calibration shapes: model
[08/24/2021-12:12:37] [I] === System Options ===
[08/24/2021-12:12:37] [I] Device: 0
[08/24/2021-12:12:37] [I] DLACore:
[08/24/2021-12:12:37] [I] Plugins:
[08/24/2021-12:12:37] [I] === Inference Options ===
[08/24/2021-12:12:37] [I] Batch: 1
[08/24/2021-12:12:37] [I] Input inference shapes: model
[08/24/2021-12:12:37] [I] Iterations: 10
[08/24/2021-12:12:37] [I] Duration: 3s (+ 200ms warm up)
[08/24/2021-12:12:37] [I] Sleep time: 0ms
[08/24/2021-12:12:37] [I] Streams: 1
[08/24/2021-12:12:37] [I] ExposeDMA: Disabled
[08/24/2021-12:12:37] [I] Spin-wait: Disabled
[08/24/2021-12:12:37] [I] Multithreading: Disabled
[08/24/2021-12:12:37] [I] CUDA Graph: Disabled
[08/24/2021-12:12:37] [I] Skip inference: Disabled
[08/24/2021-12:12:37] [I] Inputs:
[08/24/2021-12:12:37] [I] === Reporting Options ===
[08/24/2021-12:12:37] [I] Verbose: Disabled
[08/24/2021-12:12:37] [I] Averages: 10 inferences
[08/24/2021-12:12:37] [I] Percentile: 99
[08/24/2021-12:12:37] [I] Dump output: Disabled
[08/24/2021-12:12:37] [I] Profile: Disabled
[08/24/2021-12:12:37] [I] Export timing to JSON file:
[08/24/2021-12:12:37] [I] Export output to JSON file:
[08/24/2021-12:12:37] [I] Export profile to JSON file:
[08/24/2021-12:12:37] [I]
----------------------------------------------------------------
Input filename: version-RFB-640.onnx
ONNX IR version: 0.0.4
Opset version: 9
Producer name: pytorch
Producer version: 1.3
Domain:
Model version: 0
Doc string:
----------------------------------------------------------------
[08/24/2021-12:12:39] [W] [TRT] onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[08/24/2021-12:12:39] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[08/24/2021-12:12:39] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[08/24/2021-12:12:39] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[08/24/2021-12:12:39] [I] [TRT]
[08/24/2021-12:12:39] [I] [TRT] --------------- Layers running on DLA:
[08/24/2021-12:12:39] [I] [TRT]
[08/24/2021-12:12:39] [I] [TRT] --------------- Layers running on GPU:
[08/24/2021-12:12:39] [I] [TRT] (Unnamed Layer* 0) [Convolution] + (Unnamed Layer* 2) [Activation], (Unnamed Layer* 3) [Convolution] + (Unnamed Layer* 5) [Activation], (Unnamed Layer* 6) [Convolution] + (Unnamed Layer* 8) [Activation], (Unnamed Layer* 9) [Convolution] + (Unnamed Layer* 11) [Activation], (Unnamed Layer* 12) [Convolution] + (Unnamed Layer* 14) [Activation], (Unnamed Layer* 15) [Convolution] + (Unnamed Layer* 17) [Activation], (Unnamed Layer* 18) [Convolution] + (Unnamed Layer* 20) [Activation], (Unnamed Layer* 21) [Convolution] + (Unnamed Layer* 23) [Activation], (Unnamed Layer* 24) [Convolution] + (Unnamed Layer* 26) [Activation], (Unnamed Layer* 27) [Convolution] + (Unnamed Layer* 29) [Activation], (Unnamed Layer* 30) [Convolution] + (Unnamed Layer* 32) [Activation], (Unnamed Layer* 33) [Convolution] + (Unnamed Layer* 35) [Activation], (Unnamed Layer* 36) [Convolution] + (Unnamed Layer* 38) [Activation], (Unnamed Layer* 53) [Convolution] || (Unnamed Layer* 46) [Convolution] || (Unnamed Layer* 39) [Convolution], (Unnamed Layer* 41) [Convolution] + (Unnamed Layer* 43) [Activation], (Unnamed Layer* 48) [Convolution] + (Unnamed Layer* 50) [Activation], (Unnamed Layer* 55) [Convolution] + (Unnamed Layer* 57) [Activation], (Unnamed Layer* 58) [Convolution] + (Unnamed Layer* 60) [Activation], (Unnamed Layer* 51) [Convolution], (Unnamed Layer* 44) [Convolution], (Unnamed Layer* 61) [Convolution], (Unnamed Layer* 64) [Convolution], (Unnamed Layer* 66) [Convolution] + (Unnamed Layer* 68) [ElementWise] + (Unnamed Layer* 69) [Activation], (Unnamed Layer* 98) [Convolution] + (Unnamed Layer* 100) [Activation], (Unnamed Layer* 84) [Convolution] + (Unnamed Layer* 85) [Activation], (Unnamed Layer* 70) [Convolution] + (Unnamed Layer* 71) [Activation], (Unnamed Layer* 72) [Convolution], (Unnamed Layer* 73) [Shuffle] + (Unnamed Layer* 83) [Shuffle], (Unnamed Layer* 86) [Convolution], (Unnamed Layer* 87) [Shuffle] + (Unnamed Layer* 97) [Shuffle], (Unnamed Layer* 101) 
[Convolution] + (Unnamed Layer* 103) [Activation], (Unnamed Layer* 104) [Convolution] + (Unnamed Layer* 106) [Activation], (Unnamed Layer* 107) [Convolution] + (Unnamed Layer* 109) [Activation], (Unnamed Layer* 110) [Convolution] + (Unnamed Layer* 112) [Activation], (Unnamed Layer* 113) [Convolution] + (Unnamed Layer* 115) [Activation], (Unnamed Layer* 144) [Convolution] + (Unnamed Layer* 146) [Activation], (Unnamed Layer* 130) [Convolution] + (Unnamed Layer* 131) [Activation], (Unnamed Layer* 116) [Convolution] + (Unnamed Layer* 117) [Activation], (Unnamed Layer* 118) [Convolution], (Unnamed Layer* 119) [Shuffle] + (Unnamed Layer* 129) [Shuffle], (Unnamed Layer* 132) [Convolution], (Unnamed Layer* 133) [Shuffle] + (Unnamed Layer* 143) [Shuffle], (Unnamed Layer* 147) [Convolution] + (Unnamed Layer* 149) [Activation], (Unnamed Layer* 150) [Convolution] + (Unnamed Layer* 152) [Activation], (Unnamed Layer* 153) [Convolution] + (Unnamed Layer* 155) [Activation], (Unnamed Layer* 184) [Convolution] + (Unnamed Layer* 185) [Activation], (Unnamed Layer* 170) [Convolution] + (Unnamed Layer* 171) [Activation], (Unnamed Layer* 156) [Convolution] + (Unnamed Layer* 157) [Activation], (Unnamed Layer* 158) [Convolution], (Unnamed Layer* 159) [Shuffle] + (Unnamed Layer* 169) [Shuffle], (Unnamed Layer* 172) [Convolution], (Unnamed Layer* 173) [Shuffle] + (Unnamed Layer* 183) [Shuffle], (Unnamed Layer* 186) [Convolution] + (Unnamed Layer* 187) [Activation], (Unnamed Layer* 188) [Convolution] + (Unnamed Layer* 189) [Activation], (Unnamed Layer* 202) [Convolution] || (Unnamed Layer* 190) [Convolution], (Unnamed Layer* 203) [Shuffle] + (Unnamed Layer* 213) [Shuffle], 342 copy, 388 copy, 428 copy, 458 copy, (Unnamed Layer* 285) [Slice], (Unnamed Layer* 253) [Slice], (Unnamed Layer* 191) [Shuffle] + (Unnamed Layer* 201) [Shuffle], 328 copy, 374 copy, 414 copy, 446 copy, (Unnamed Layer* 225) [Shuffle], (Unnamed Layer* 226) [Softmax], (Unnamed Layer* 227) [Shuffle], (Unnamed Layer* 257) 
[Constant], (Unnamed Layer* 290) [Constant], PWN(PWN(PWN((Unnamed Layer* 286) [Constant] + (Unnamed Layer* 287) [Shuffle], (Unnamed Layer* 288) [ElementWise]), (Unnamed Layer* 289) [Unary]), (Unnamed Layer* 291) [ElementWise]), (Unnamed Layer* 259) [Constant], PWN(PWN(PWN((Unnamed Layer* 254) [Constant] + (Unnamed Layer* 255) [Shuffle], (Unnamed Layer* 256) [ElementWise]), (Unnamed Layer* 258) [ElementWise]), (Unnamed Layer* 260) [ElementWise]), 468 copy, 474 copy, (Unnamed Layer* 396) [Slice], (Unnamed Layer* 371) [Slice], (Unnamed Layer* 342) [Slice], (Unnamed Layer* 317) [Slice], PWN(PWN((Unnamed Layer* 397) [Constant] + (Unnamed Layer* 398) [Shuffle], (Unnamed Layer* 399) [ElementWise]), (Unnamed Layer* 400) [ElementWise]), PWN(PWN((Unnamed Layer* 343) [Constant] + (Unnamed Layer* 344) [Shuffle], (Unnamed Layer* 345) [ElementWise]), (Unnamed Layer* 346) [ElementWise]), 480 copy, 485 copy,
[08/24/2021-12:12:47] [I] [TRT] Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
[08/24/2021-12:13:43] [I] [TRT] Detected 1 inputs and 4 output network tensors.
[08/24/2021-12:13:43] [I] Starting inference threads
[08/24/2021-12:13:46] [I] Warmup completed 53 queries over 200 ms
[08/24/2021-12:13:46] [I] Timing trace has 786 queries over 3.00927 s
[08/24/2021-12:13:46] [I] Trace averages of 10 runs:
[08/24/2021-12:13:46] [I] Average on 10 runs - GPU latency: 3.6301 ms - Host latency: 3.76711 ms (end to end 3.77763 ms, enqueue 1.52328 ms)
...
[08/24/2021-12:13:46] [I] Average on 10 runs - GPU latency: 3.67217 ms - Host latency: 3.81064 ms (end to end 3.82107 ms, enqueue 1.36304 ms)
[08/24/2021-12:13:46] [I] Host Latency
[08/24/2021-12:13:46] [I] min: 3.74438 ms (end to end 3.75781 ms)
[08/24/2021-12:13:46] [I] max: 3.88342 ms (end to end 3.89673 ms)
[08/24/2021-12:13:46] [I] mean: 3.8181 ms (end to end 3.82853 ms)
[08/24/2021-12:13:46] [I] median: 3.82004 ms (end to end 3.82956 ms)
[08/24/2021-12:13:46] [I] percentile: 3.86523 ms at 99% (end to end 3.87598 ms at 99%)
[08/24/2021-12:13:46] [I] throughput: 261.193 qps
[08/24/2021-12:13:46] [I] walltime: 3.00927 s
[08/24/2021-12:13:46] [I] Enqueue Time
[08/24/2021-12:13:46] [I] min: 1.16895 ms
[08/24/2021-12:13:46] [I] max: 2.15942 ms
[08/24/2021-12:13:46] [I] median: 1.38501 ms
[08/24/2021-12:13:46] [I] GPU Compute
[08/24/2021-12:13:46] [I] min: 3.60962 ms
[08/24/2021-12:13:46] [I] max: 3.74109 ms
[08/24/2021-12:13:46] [I] mean: 3.67907 ms
[08/24/2021-12:13:46] [I] median: 3.68103 ms
[08/24/2021-12:13:46] [I] percentile: 3.72363 ms at 99%
[08/24/2021-12:13:46] [I] total compute time: 2.89175 s
&&&& PASSED TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --onnx=version-RFB-640.onnx
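Note the warning above: "Some tactics do not have sufficient workspace memory to run." If you want to experiment further, the commands below are a sketch (flags as found in the trtexec shipped with JetPack 4.5.1 / TensorRT 7.1; the 256 MB workspace size and the engine filename are just example values, adjust for your board):

```shell
# Rebuild with a larger workspace (in MB) so more tactics can be tried
/usr/src/tensorrt/bin/trtexec --onnx=version-RFB-640.onnx --workspace=256

# Optionally enable FP16 and serialize the engine for reuse
/usr/src/tensorrt/bin/trtexec --onnx=version-RFB-640.onnx --workspace=256 \
    --fp16 --saveEngine=version-RFB-640.engine

# Later runs can load the saved engine directly, skipping the build step
/usr/src/tensorrt/bin/trtexec --loadEngine=version-RFB-640.engine
```

A larger workspace may or may not change the measured latency; comparing the timing summary before and after is the easiest way to tell.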
Thanks.
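P.S. For reference, the reported throughput is consistent with the timing trace (786 queries over 3.00927 s); a quick check:

```python
# trtexec reports throughput as queries / wall time.
queries = 786          # from "Timing trace has 786 queries"
walltime_s = 3.00927   # from "walltime: 3.00927 s"

throughput_qps = queries / walltime_s
print(f"{throughput_qps:.3f} qps")  # matches the reported 261.193 qps
```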