I shared the ONNX model in my previous message.
- I ran check_model.py, but it did not produce any output.
- Here is the output of trtexec:
&&&& RUNNING TensorRT.trtexec [TensorRT v8503] # ./trtexec --onnx=./yolov5s.onnx --saveEngine=y.trt
[03/25/2023-18:55:58] [I] === Model Options ===
[03/25/2023-18:55:58] [I] Format: ONNX
[03/25/2023-18:55:58] [I] Model: ./yolov5s.onnx
[03/25/2023-18:55:58] [I] Output:
[03/25/2023-18:55:58] [I] === Build Options ===
[03/25/2023-18:55:58] [I] Max batch: explicit batch
[03/25/2023-18:55:58] [I] Memory Pools: workspace: default, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default
[03/25/2023-18:55:58] [I] minTiming: 1
[03/25/2023-18:55:58] [I] avgTiming: 8
[03/25/2023-18:55:58] [I] Precision: FP32
[03/25/2023-18:55:58] [I] LayerPrecisions:
[03/25/2023-18:55:58] [I] Calibration:
[03/25/2023-18:55:58] [I] Refit: Disabled
[03/25/2023-18:55:58] [I] Sparsity: Disabled
[03/25/2023-18:55:58] [I] Safe mode: Disabled
[03/25/2023-18:55:58] [I] DirectIO mode: Disabled
[03/25/2023-18:55:58] [I] Restricted mode: Disabled
[03/25/2023-18:55:58] [I] Build only: Disabled
[03/25/2023-18:55:58] [I] Save engine: y.trt
[03/25/2023-18:55:58] [I] Load engine:
[03/25/2023-18:55:58] [I] Profiling verbosity: 0
[03/25/2023-18:55:58] [I] Tactic sources: Using default tactic sources
[03/25/2023-18:55:58] [I] timingCacheMode: local
[03/25/2023-18:55:58] [I] timingCacheFile:
[03/25/2023-18:55:58] [I] Heuristic: Disabled
[03/25/2023-18:55:58] [I] Preview Features: Use default preview flags.
[03/25/2023-18:55:58] [I] Input(s)s format: fp32:CHW
[03/25/2023-18:55:58] [I] Output(s)s format: fp32:CHW
[03/25/2023-18:55:58] [I] Input build shapes: model
[03/25/2023-18:55:58] [I] Input calibration shapes: model
[03/25/2023-18:55:58] [I] === System Options ===
[03/25/2023-18:55:58] [I] Device: 0
[03/25/2023-18:55:58] [I] DLACore:
[03/25/2023-18:55:58] [I] Plugins:
[03/25/2023-18:55:58] [I] === Inference Options ===
[03/25/2023-18:55:58] [I] Batch: Explicit
[03/25/2023-18:55:58] [I] Input inference shapes: model
[03/25/2023-18:55:58] [I] Iterations: 10
[03/25/2023-18:55:58] [I] Duration: 3s (+ 200ms warm up)
[03/25/2023-18:55:58] [I] Sleep time: 0ms
[03/25/2023-18:55:58] [I] Idle time: 0ms
[03/25/2023-18:55:58] [I] Streams: 1
[03/25/2023-18:55:58] [I] ExposeDMA: Disabled
[03/25/2023-18:55:58] [I] Data transfers: Enabled
[03/25/2023-18:55:58] [I] Spin-wait: Disabled
[03/25/2023-18:55:58] [I] Multithreading: Disabled
[03/25/2023-18:55:58] [I] CUDA Graph: Disabled
[03/25/2023-18:55:58] [I] Separate profiling: Disabled
[03/25/2023-18:55:58] [I] Time Deserialize: Disabled
[03/25/2023-18:55:58] [I] Time Refit: Disabled
[03/25/2023-18:55:58] [I] NVTX verbosity: 0
[03/25/2023-18:55:58] [I] Persistent Cache Ratio: 0
[03/25/2023-18:55:58] [I] Inputs:
[03/25/2023-18:55:58] [I] === Reporting Options ===
[03/25/2023-18:55:58] [I] Verbose: Disabled
[03/25/2023-18:55:58] [I] Averages: 10 inferences
[03/25/2023-18:55:58] [I] Percentiles: 90,95,99
[03/25/2023-18:55:58] [I] Dump refittable layers:Disabled
[03/25/2023-18:55:58] [I] Dump output: Disabled
[03/25/2023-18:55:58] [I] Profile: Disabled
[03/25/2023-18:55:58] [I] Export timing to JSON file:
[03/25/2023-18:55:58] [I] Export output to JSON file:
[03/25/2023-18:55:58] [I] Export profile to JSON file:
[03/25/2023-18:55:58] [I]
[03/25/2023-18:55:58] [I] === Device Information ===
[03/25/2023-18:55:58] [I] Selected Device: NVIDIA GeForce RTX 3060 Ti
[03/25/2023-18:55:58] [I] Compute Capability: 8.6
[03/25/2023-18:55:58] [I] SMs: 38
[03/25/2023-18:55:58] [I] Compute Clock Rate: 1.755 GHz
[03/25/2023-18:55:58] [I] Device Global Memory: 7965 MiB
[03/25/2023-18:55:58] [I] Shared Memory per SM: 100 KiB
[03/25/2023-18:55:58] [I] Memory Bus Width: 256 bits (ECC disabled)
[03/25/2023-18:55:58] [I] Memory Clock Rate: 7.001 GHz
[03/25/2023-18:55:58] [I]
[03/25/2023-18:55:58] [I] TensorRT version: 8.5.3
[03/25/2023-18:55:58] [I] [TRT] [MemUsageChange] Init CUDA: CPU +571, GPU +0, now: CPU 584, GPU 669 (MiB)
[03/25/2023-18:56:00] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +542, GPU +116, now: CPU 1178, GPU 785 (MiB)
[03/25/2023-18:56:00] [W] [TRT] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
[03/25/2023-18:56:00] [I] Start parsing network model
[03/25/2023-18:56:00] [I] [TRT] ----------------------------------------------------------------
[03/25/2023-18:56:00] [I] [TRT] Input filename: ./yolov5s.onnx
[03/25/2023-18:56:00] [I] [TRT] ONNX IR version: 0.0.7
[03/25/2023-18:56:00] [I] [TRT] Opset version: 12
[03/25/2023-18:56:00] [I] [TRT] Producer name: pytorch
[03/25/2023-18:56:00] [I] [TRT] Producer version: 1.12.1
[03/25/2023-18:56:00] [I] [TRT] Domain:
[03/25/2023-18:56:00] [I] [TRT] Model version: 0
[03/25/2023-18:56:00] [I] [TRT] Doc string:
[03/25/2023-18:56:00] [I] [TRT] ----------------------------------------------------------------
[03/25/2023-18:56:00] [W] [TRT] onnx2trt_utils.cpp:377: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[03/25/2023-18:56:00] [I] Finish parsing network model
[03/25/2023-18:56:00] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +1287, GPU +362, now: CPU 2498, GPU 1147 (MiB)
[03/25/2023-18:56:00] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +246, GPU +58, now: CPU 2744, GPU 1205 (MiB)
[03/25/2023-18:56:00] [I] [TRT] Local timing cache in use. Profiling results in this builder pass will not be stored.
[03/25/2023-18:57:20] [I] [TRT] [GraphReduction] The approximate region cut reduction algorithm is called.
[03/25/2023-18:57:20] [I] [TRT] Total Activation Memory: 8605349888
[03/25/2023-18:57:20] [I] [TRT] Detected 1 inputs and 4 output network tensors.
[03/25/2023-18:57:20] [I] [TRT] Total Host Persistent Memory: 84176
[03/25/2023-18:57:20] [I] [TRT] Total Device Persistent Memory: 231936
[03/25/2023-18:57:20] [I] [TRT] Total Scratch Memory: 134217728
[03/25/2023-18:57:20] [I] [TRT] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 7 MiB, GPU 4367 MiB
[03/25/2023-18:57:20] [I] [TRT] [BlockAssignment] Started assigning block shifts. This will take 195 steps to complete.
[03/25/2023-18:57:20] [I] [TRT] [BlockAssignment] Algorithm ShiftNTopDown took 18.1328ms to assign 15 blocks to 195 nodes requiring 156228096 bytes.
[03/25/2023-18:57:20] [I] [TRT] Total Activation Memory: 156228096
[03/25/2023-18:57:20] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +10, now: CPU 3594, GPU 1545 (MiB)
[03/25/2023-18:57:20] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in building engine: CPU +3, GPU +31, now: CPU 3, GPU 31 (MiB)
[03/25/2023-18:57:20] [I] Engine built in 81.9838 sec.
[03/25/2023-18:57:20] [I] [TRT] Loaded engine size: 31 MiB
[03/25/2023-18:57:20] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +10, now: CPU 3067, GPU 1411 (MiB)
[03/25/2023-18:57:20] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +30, now: CPU 0, GPU 30 (MiB)
[03/25/2023-18:57:20] [I] Engine deserialized in 0.0126936 sec.
[03/25/2023-18:57:20] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 3067, GPU 1411 (MiB)
[03/25/2023-18:57:20] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +149, now: CPU 0, GPU 179 (MiB)
[03/25/2023-18:57:20] [W] [TRT] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
[03/25/2023-18:57:20] [I] Setting persistentCacheLimit to 0 bytes.
[03/25/2023-18:57:20] [I] Using random values for input onnx::Cast_0
[03/25/2023-18:57:20] [I] Created input binding for onnx::Cast_0 with dimensions 1x3x640x640
[03/25/2023-18:57:20] [I] Using random values for output 630
[03/25/2023-18:57:20] [I] Created output binding for 630 with dimensions 1x25200x85
[03/25/2023-18:57:20] [I] Starting inference
[03/25/2023-18:57:23] [I] Warmup completed 66 queries over 200 ms
[03/25/2023-18:57:23] [I] Timing trace has 1004 queries over 3.01039 s
[03/25/2023-18:57:23] [I]
[03/25/2023-18:57:23] [I] === Trace details ===
[03/25/2023-18:57:23] [I] Trace averages of 10 runs:
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 3.00268 ms - Host latency: 3.64388 ms (enqueue 1.35774 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 3.00237 ms - Host latency: 3.64576 ms (enqueue 1.36225 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 3.00421 ms - Host latency: 3.63826 ms (enqueue 1.35391 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 3.00594 ms - Host latency: 3.62881 ms (enqueue 1.33792 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 3.00575 ms - Host latency: 3.63828 ms (enqueue 1.36564 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98281 ms - Host latency: 3.6037 ms (enqueue 1.35683 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98025 ms - Host latency: 3.61531 ms (enqueue 1.34596 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 3.03319 ms - Host latency: 3.67264 ms (enqueue 1.34447 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 3.12074 ms - Host latency: 3.74781 ms (enqueue 1.39609 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98557 ms - Host latency: 3.62251 ms (enqueue 1.35918 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.97882 ms - Host latency: 3.60628 ms (enqueue 1.35126 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98005 ms - Host latency: 3.60111 ms (enqueue 1.34577 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98351 ms - Host latency: 3.61732 ms (enqueue 1.34944 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.9821 ms - Host latency: 3.61712 ms (enqueue 1.34304 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98332 ms - Host latency: 3.63346 ms (enqueue 1.35293 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98221 ms - Host latency: 3.61263 ms (enqueue 1.33514 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98353 ms - Host latency: 3.61492 ms (enqueue 1.33793 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98065 ms - Host latency: 3.61238 ms (enqueue 1.35081 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.9823 ms - Host latency: 3.61409 ms (enqueue 1.34128 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98219 ms - Host latency: 3.60547 ms (enqueue 1.34345 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98014 ms - Host latency: 3.61153 ms (enqueue 1.34828 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98393 ms - Host latency: 3.62723 ms (enqueue 1.34292 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98301 ms - Host latency: 3.61342 ms (enqueue 1.34311 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98271 ms - Host latency: 3.60798 ms (enqueue 1.3453 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98066 ms - Host latency: 3.61592 ms (enqueue 1.35807 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.9826 ms - Host latency: 3.59905 ms (enqueue 1.35049 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98168 ms - Host latency: 3.60786 ms (enqueue 1.34774 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 3.02327 ms - Host latency: 3.65791 ms (enqueue 1.34291 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 3.13108 ms - Host latency: 3.76853 ms (enqueue 1.39824 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98169 ms - Host latency: 3.60955 ms (enqueue 1.34359 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98199 ms - Host latency: 3.63452 ms (enqueue 1.34982 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98077 ms - Host latency: 3.60051 ms (enqueue 1.34521 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98163 ms - Host latency: 3.60713 ms (enqueue 1.34738 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98363 ms - Host latency: 3.61511 ms (enqueue 1.34919 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98423 ms - Host latency: 3.6106 ms (enqueue 1.34349 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98262 ms - Host latency: 3.61565 ms (enqueue 1.33793 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98373 ms - Host latency: 3.61611 ms (enqueue 1.34597 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.9823 ms - Host latency: 3.61178 ms (enqueue 1.35188 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.97985 ms - Host latency: 3.5972 ms (enqueue 1.35369 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98279 ms - Host latency: 3.61038 ms (enqueue 1.34532 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98251 ms - Host latency: 3.60774 ms (enqueue 1.3387 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98199 ms - Host latency: 3.60171 ms (enqueue 1.32628 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98361 ms - Host latency: 3.62836 ms (enqueue 1.32795 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98353 ms - Host latency: 3.61538 ms (enqueue 1.33948 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.97931 ms - Host latency: 3.61067 ms (enqueue 1.34053 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.97964 ms - Host latency: 3.59666 ms (enqueue 1.34465 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98278 ms - Host latency: 3.61497 ms (enqueue 1.34336 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 3.0295 ms - Host latency: 3.65892 ms (enqueue 1.34763 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 3.09833 ms - Host latency: 3.72833 ms (enqueue 1.39119 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98231 ms - Host latency: 3.60315 ms (enqueue 1.35411 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98446 ms - Host latency: 3.61978 ms (enqueue 1.34379 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98309 ms - Host latency: 3.62393 ms (enqueue 1.34669 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98147 ms - Host latency: 3.61221 ms (enqueue 1.3424 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98341 ms - Host latency: 3.61759 ms (enqueue 1.34286 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98251 ms - Host latency: 3.61531 ms (enqueue 1.3425 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.97858 ms - Host latency: 3.6001 ms (enqueue 1.35293 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98273 ms - Host latency: 3.6028 ms (enqueue 1.34374 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.9835 ms - Host latency: 3.6109 ms (enqueue 1.34844 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98135 ms - Host latency: 3.61459 ms (enqueue 1.3537 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98126 ms - Host latency: 3.62198 ms (enqueue 1.34771 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98289 ms - Host latency: 3.61235 ms (enqueue 1.33959 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98165 ms - Host latency: 3.60614 ms (enqueue 1.34415 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98135 ms - Host latency: 3.6084 ms (enqueue 1.35081 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98179 ms - Host latency: 3.62878 ms (enqueue 1.31062 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.9811 ms - Host latency: 3.60444 ms (enqueue 1.27759 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98188 ms - Host latency: 3.62107 ms (enqueue 1.28445 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98389 ms - Host latency: 3.61699 ms (enqueue 1.27905 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 3.02979 ms - Host latency: 3.65498 ms (enqueue 1.27449 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 3.11414 ms - Host latency: 3.74292 ms (enqueue 1.29788 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98218 ms - Host latency: 3.60635 ms (enqueue 1.36624 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98347 ms - Host latency: 3.60325 ms (enqueue 1.33906 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98401 ms - Host latency: 3.61831 ms (enqueue 1.34968 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98474 ms - Host latency: 3.61946 ms (enqueue 1.35396 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98296 ms - Host latency: 3.6145 ms (enqueue 1.34971 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98213 ms - Host latency: 3.60479 ms (enqueue 1.35981 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.9842 ms - Host latency: 3.60222 ms (enqueue 1.43701 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98311 ms - Host latency: 3.62256 ms (enqueue 1.3583 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98306 ms - Host latency: 3.60178 ms (enqueue 1.34956 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98225 ms - Host latency: 3.61638 ms (enqueue 1.34988 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.97942 ms - Host latency: 3.60867 ms (enqueue 1.34309 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98086 ms - Host latency: 3.61577 ms (enqueue 1.34182 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98215 ms - Host latency: 3.60823 ms (enqueue 1.34751 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98311 ms - Host latency: 3.60562 ms (enqueue 1.35125 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98088 ms - Host latency: 3.61096 ms (enqueue 1.3593 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98198 ms - Host latency: 3.60869 ms (enqueue 1.2707 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98159 ms - Host latency: 3.60981 ms (enqueue 1.34072 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98 ms - Host latency: 3.60945 ms (enqueue 1.3375 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 3.02385 ms - Host latency: 3.64678 ms (enqueue 1.34099 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 3.11975 ms - Host latency: 3.74973 ms (enqueue 1.39148 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98125 ms - Host latency: 3.61636 ms (enqueue 1.37041 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98145 ms - Host latency: 3.61848 ms (enqueue 1.35737 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98228 ms - Host latency: 3.60535 ms (enqueue 1.34041 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98037 ms - Host latency: 3.61711 ms (enqueue 1.34119 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.9802 ms - Host latency: 3.60479 ms (enqueue 1.34941 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98218 ms - Host latency: 3.62153 ms (enqueue 1.34263 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98218 ms - Host latency: 3.61616 ms (enqueue 1.34993 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98235 ms - Host latency: 3.61348 ms (enqueue 1.35571 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.98098 ms - Host latency: 3.60767 ms (enqueue 1.33418 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.9832 ms - Host latency: 3.61348 ms (enqueue 1.33994 ms)
[03/25/2023-18:57:23] [I] Average on 10 runs - GPU latency: 2.97966 ms - Host latency: 3.6061 ms (enqueue 1.34731 ms)
[03/25/2023-18:57:23] [I]
[03/25/2023-18:57:23] [I] === Performance summary ===
[03/25/2023-18:57:23] [I] Throughput: 333.512 qps
[03/25/2023-18:57:23] [I] Latency: min = 3.55615 ms, max = 4.5824 ms, mean = 3.6227 ms, median = 3.61157 ms, percentile(90%) = 3.63989 ms, percentile(95%) = 3.65918 ms, percentile(99%) = 4.00616 ms
[03/25/2023-18:57:23] [I] Enqueue Time: min = 1.16992 ms, max = 1.65308 ms, mean = 1.34567 ms, median = 1.34741 ms, percentile(90%) = 1.37201 ms, percentile(95%) = 1.38245 ms, percentile(99%) = 1.44165 ms
[03/25/2023-18:57:23] [I] H2D Latency: min = 0.203857 ms, max = 0.229248 ms, mean = 0.209415 ms, median = 0.209229 ms, percentile(90%) = 0.212158 ms, percentile(95%) = 0.213135 ms, percentile(99%) = 0.215088 ms
[03/25/2023-18:57:23] [I] GPU Compute Time: min = 2.96533 ms, max = 3.95679 ms, mean = 2.99217 ms, median = 2.98169 ms, percentile(90%) = 2.99829 ms, percentile(95%) = 3.00543 ms, percentile(99%) = 3.38525 ms
[03/25/2023-18:57:23] [I] D2H Latency: min = 0.368896 ms, max = 0.645508 ms, mean = 0.421114 ms, median = 0.418671 ms, percentile(90%) = 0.439575 ms, percentile(95%) = 0.447998 ms, percentile(99%) = 0.525146 ms
[03/25/2023-18:57:23] [I] Total Host Walltime: 3.01039 s
[03/25/2023-18:57:23] [I] Total GPU Compute Time: 3.00414 s
[03/25/2023-18:57:23] [W] * GPU compute time is unstable, with coefficient of variance = 2.26049%.
[03/25/2023-18:57:23] [W] If not already in use, locking GPU clock frequency or adding --useSpinWait may improve the stability.
[03/25/2023-18:57:23] [I] Explanations of the performance metrics are printed in the verbose logs.
[03/25/2023-18:57:23] [I]
&&&& PASSED TensorRT.trtexec [TensorRT v8503] # ./trtexec --onnx=./yolov5s.onnx --saveEngine=y.trt
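
In case it helps when comparing runs, here is a minimal sketch for pulling the headline numbers out of a trtexec log like the one above. The function name `parse_trtexec_summary` and the dictionary keys are my own naming, not part of trtexec, and the regular expressions assume the "Performance summary" line format printed by this TensorRT 8.5.3 build; other versions may format these lines differently.

```python
import re

# Three lines copied from the "Performance summary" section of the log above
# (the latency lines are shortened after the mean value).
SAMPLE_LOG = """\
[03/25/2023-18:57:23] [I] Throughput: 333.512 qps
[03/25/2023-18:57:23] [I] Latency: min = 3.55615 ms, max = 4.5824 ms, mean = 3.6227 ms
[03/25/2023-18:57:23] [I] GPU Compute Time: min = 2.96533 ms, max = 3.95679 ms, mean = 2.99217 ms
"""

# Hypothetical key names; each pattern captures one number from one log line.
# "[I] Latency:" is anchored on the "[I]" tag so that the "H2D Latency:" and
# "D2H Latency:" lines are not picked up by mistake.
PATTERNS = {
    "throughput_qps": r"Throughput:\s*([\d.]+)\s*qps",
    "latency_mean_ms": r"\[I\] Latency:.*?mean = ([\d.]+) ms",
    "gpu_compute_mean_ms": r"GPU Compute Time:.*?mean = ([\d.]+) ms",
}


def parse_trtexec_summary(log_text):
    """Return the summary metrics found in log_text as a dict of floats.

    Metrics whose line is missing from the log are simply left out.
    """
    summary = {}
    for key, pattern in PATTERNS.items():
        match = re.search(pattern, log_text)
        if match:
            summary[key] = float(match.group(1))
    return summary


if __name__ == "__main__":
    for name, value in parse_trtexec_summary(SAMPLE_LOG).items():
        print(f"{name}: {value}")
```

To use it on a real run, read the saved trtexec output into a string and pass that string to `parse_trtexec_summary` instead of `SAMPLE_LOG`.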