YOLOv5 QAT model inference result is empty (pytorch-quantization toolkit)

Description

I used the pytorch-quantization toolkit to fine-tune YOLOv5 with QAT for one epoch and successfully generated a Q/DQ ONNX model. I also added a YoloLayer_TRT custom operator (plugin), and then ran ./trtexec --onnx=yolov5s-5.0-pre-yolo-op.onnx --workspace=10240 --int8 --saveEngine=yolov5s-5.0-pre-fp16.engine --plugins=libyolo.so to generate an engine that TensorRT can run. The engine builds, but the TensorRT inference result is empty.

Environment

TensorRT Version: 8.0
GPU Type: T4
Nvidia Driver Version: 460.91.03
CUDA Version: 11.4
CUDNN Version: 8.0.4
Operating System + Version: Ubuntu 18.04
Python Version (if applicable): python 3.8
TensorFlow Version (if applicable): None
PyTorch Version (if applicable): 1.10.0
Baremetal or Container (if container which image + tag):


My ONNX model: [image]

My engine: [image]

The engine now runs inference successfully, but the inference result is empty.
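One quick way to narrow this down is to check whether the raw output buffer itself reports zero detections. The trtexec log later in this thread shows an output binding of 1x6001x1x1. If the plugin follows the layout commonly used by YoloLayer_TRT implementations (an assumption, not confirmed in this thread), that is 1 count value followed by 1000 slots of 6 floats each: center x, center y, width, height, confidence, class id. The sketch below decodes such a buffer; `decode_yolo_output` and `Detection` are hypothetical names introduced for illustration.

```python
from typing import List, NamedTuple, Sequence


class Detection(NamedTuple):
    cx: float
    cy: float
    w: float
    h: float
    conf: float
    class_id: int


# Assumed slot size: 4 box values + confidence + class id.
FLOATS_PER_DET = 6


def decode_yolo_output(flat: Sequence[float],
                       conf_thresh: float = 0.0) -> List[Detection]:
    """Decode a flat buffer assumed to be laid out as [count, det0, det1, ...]."""
    if len(flat) < 1:
        return []
    # Never read past the end of the buffer, whatever the count claims.
    capacity = (len(flat) - 1) // FLOATS_PER_DET
    count = min(max(int(flat[0]), 0), capacity)
    dets = []
    for i in range(count):
        start = 1 + i * FLOATS_PER_DET
        cx, cy, w, h, conf, cls = (
            float(v) for v in flat[start:start + FLOATS_PER_DET]
        )
        if conf >= conf_thresh:
            dets.append(Detection(cx, cy, w, h, conf, int(cls)))
    return dets


if __name__ == "__main__":
    # Synthetic 6001-float buffer: one detection, the rest zero padding.
    buf = [0.0] * 6001
    buf[0] = 1.0
    buf[1:7] = [320.0, 240.0, 50.0, 80.0, 0.9, 2.0]
    print(decode_yolo_output(buf))
```

If the first element of the real output is 0, the plugin itself found nothing above its threshold, which would point at the quantized backbone or the plugin inputs rather than at the post-processing code.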

Hi,
Please share the ONNX model and the script, if you have not already, so that we can assist you better.
In the meantime, you can try a few things:
https://docs.nvidia.com/deeplearning/tensorrt/quick-start-guide/index.html#onnx-export

1) Validate your model with the snippet below.

check_model.py

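import sys
import onnx

# Path to your ONNX model, passed on the command line
filename = sys.argv[1]
model = onnx.load(filename)
# Raises an exception if the model is structurally invalid
onnx.checker.check_model(model)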
2) Try running your model with the trtexec command.
https://github.com/NVIDIA/TensorRT/tree/master/samples/opensource/trtexec
If you are still facing issues, please share the trtexec --verbose log for further debugging.
Thanks!
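For anyone collecting the verbose log requested above, here is a minimal sketch that rebuilds the trtexec command from this thread with --verbose added and writes all output to a file. The helper names `build_trtexec_cmd` and `run_and_log`, the log file name, and the relative paths are assumptions for illustration; the flags are the ones already used in this thread plus --verbose.

```python
import subprocess
from typing import List, Optional


def build_trtexec_cmd(onnx_path: str,
                      engine_path: str,
                      plugin_path: Optional[str] = None,
                      trtexec: str = "./trtexec",
                      workspace_mib: int = 10240) -> List[str]:
    """Build the trtexec argument list with verbose logging enabled."""
    cmd = [
        trtexec,
        f"--onnx={onnx_path}",
        f"--workspace={workspace_mib}",
        "--int8",
        f"--saveEngine={engine_path}",
        "--verbose",
    ]
    if plugin_path:
        # Custom plugin library, e.g. the one providing YoloLayer_TRT.
        cmd.append(f"--plugins={plugin_path}")
    return cmd


def run_and_log(cmd: List[str], log_path: str) -> int:
    """Run the command, sending stdout and stderr to log_path."""
    with open(log_path, "w") as log:
        proc = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT,
                              check=False)
    return proc.returncode


if __name__ == "__main__":
    cmd = build_trtexec_cmd(
        "yolov5s-6.0-qat-yolo-op.onnx",
        "yolov5s-6.0-qat-int8.engine",
        plugin_path="libyolo.so",
    )
    print(" ".join(cmd))
    # Uncomment on a machine where trtexec is available:
    # run_and_log(cmd, "trtexec_verbose.log")
```

Redirecting to a file keeps the full verbose output intact, which is usually too long to paste into a post directly.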

Thank you for your reply.
This is the ONNX model (backbone + neck):
yolov5s-6.0-qat.onnx (27.7 MB)

This is the ONNX model (backbone + neck + 3x detection head (YoloLayer_TRT plugin)):
yolov5s-6.0-qat-yolo-op.onnx (27.7 MB)

And this is the log from building the TRT engine:

root@d741691190a8:/workspace/tensorrt/bin# ./trtexec --onnx=/root/workspace/onnx/yolov5s-6.0-qat-yolo-op.onnx --workspace=10240 --int8 --saveEngine=/root/yolov5s-6.0-qat-int8.engine --plugins=/root/workspace/plugins/YoloLayer_TRT_v6.0/build/libyolo.so
&&&& RUNNING TensorRT.trtexec [TensorRT v8003] # ./trtexec --onnx=/root/workspace/onnx/yolov5s-6.0-qat.onnx --workspace=10240 --int8 --saveEngine=/root/yolov5s-6.0-qat-int8.engine --plugins=/root/workspace/plugins/YoloLayer_TRT_v6.0/build/libyolo.so
[11/15/2021-10:37:08] [I] === Model Options ===
[11/15/2021-10:37:08] [I] Format: ONNX
[11/15/2021-10:37:08] [I] Model: /root/workspace/onnx/yolov5s-6.0-qat-yolo-op.onnx
[11/15/2021-10:37:08] [I] Output:
[11/15/2021-10:37:08] [I] === Build Options ===
[11/15/2021-10:37:08] [I] Max batch: explicit
[11/15/2021-10:37:08] [I] Workspace: 10240 MiB
[11/15/2021-10:37:08] [I] minTiming: 1
[11/15/2021-10:37:08] [I] avgTiming: 8
[11/15/2021-10:37:08] [I] Precision: FP32+INT8
[11/15/2021-10:37:08] [I] Calibration: Dynamic
[11/15/2021-10:37:08] [I] Refit: Disabled
[11/15/2021-10:37:08] [I] Sparsity: Disabled
[11/15/2021-10:37:08] [I] Safe mode: Disabled
[11/15/2021-10:37:08] [I] Restricted mode: Disabled
[11/15/2021-10:37:08] [I] Save engine: /root/yolov5s-6.0-qat-int8.engine
[11/15/2021-10:37:08] [I] Load engine:
[11/15/2021-10:37:08] [I] NVTX verbosity: 0
[11/15/2021-10:37:08] [I] Tactic sources: Using default tactic sources
[11/15/2021-10:37:08] [I] timingCacheMode: local
[11/15/2021-10:37:08] [I] timingCacheFile:
[11/15/2021-10:37:08] [I] Input(s)s format: fp32:CHW
[11/15/2021-10:37:08] [I] Output(s)s format: fp32:CHW
[11/15/2021-10:37:08] [I] Input build shapes: model
[11/15/2021-10:37:08] [I] Input calibration shapes: model
[11/15/2021-10:37:08] [I] === System Options ===
[11/15/2021-10:37:08] [I] Device: 0
[11/15/2021-10:37:08] [I] DLACore:
[11/15/2021-10:37:08] [I] Plugins: /root/workspace/plugins/YoloLayer_TRT_v6.0/build/libyolo.so
[11/15/2021-10:37:08] [I] === Inference Options ===
[11/15/2021-10:37:08] [I] Batch: Explicit
[11/15/2021-10:37:08] [I] Input inference shapes: model
[11/15/2021-10:37:08] [I] Iterations: 10
[11/15/2021-10:37:08] [I] Duration: 3s (+ 200ms warm up)
[11/15/2021-10:37:08] [I] Sleep time: 0ms
[11/15/2021-10:37:08] [I] Streams: 1
[11/15/2021-10:37:08] [I] ExposeDMA: Disabled
[11/15/2021-10:37:08] [I] Data transfers: Enabled
[11/15/2021-10:37:08] [I] Spin-wait: Disabled
[11/15/2021-10:37:08] [I] Multithreading: Disabled
[11/15/2021-10:37:08] [I] CUDA Graph: Disabled
[11/15/2021-10:37:08] [I] Separate profiling: Disabled
[11/15/2021-10:37:08] [I] Time Deserialize: Disabled
[11/15/2021-10:37:08] [I] Time Refit: Disabled
[11/15/2021-10:37:08] [I] Skip inference: Disabled
[11/15/2021-10:37:08] [I] Inputs:
[11/15/2021-10:37:08] [I] === Reporting Options ===
[11/15/2021-10:37:08] [I] Verbose: Disabled
[11/15/2021-10:37:08] [I] Averages: 10 inferences
[11/15/2021-10:37:08] [I] Percentile: 99
[11/15/2021-10:37:08] [I] Dump refittable layers:Disabled
[11/15/2021-10:37:08] [I] Dump output: Disabled
[11/15/2021-10:37:08] [I] Profile: Disabled
[11/15/2021-10:37:08] [I] Export timing to JSON file:
[11/15/2021-10:37:08] [I] Export output to JSON file:
[11/15/2021-10:37:08] [I] Export profile to JSON file:
[11/15/2021-10:37:08] [I]
[11/15/2021-10:37:08] [I] === Device Information ===
[11/15/2021-10:37:08] [I] Selected Device: Tesla T4
[11/15/2021-10:37:08] [I] Compute Capability: 7.5
[11/15/2021-10:37:08] [I] SMs: 40
[11/15/2021-10:37:08] [I] Compute Clock Rate: 1.59 GHz
[11/15/2021-10:37:08] [I] Device Global Memory: 15109 MiB
[11/15/2021-10:37:08] [I] Shared Memory per SM: 64 KiB
[11/15/2021-10:37:08] [I] Memory Bus Width: 256 bits (ECC enabled)
[11/15/2021-10:37:08] [I] Memory Clock Rate: 5.001 GHz
[11/15/2021-10:37:08] [I]
[11/15/2021-10:37:08] [I] TensorRT version: 8003
[11/15/2021-10:37:08] [I] Loading supplied plugin library: /root/workspace/plugins/YoloLayer_TRT_v6.0/build/libyolo.so
[11/15/2021-10:37:08] [I] [TRT] [MemUsageChange] Init CUDA: CPU +328, GPU +0, now: CPU 335, GPU 1083 (MiB)
[11/15/2021-10:37:08] [I] Start parsing network model
[11/15/2021-10:37:08] [I] [TRT] ----------------------------------------------------------------
[11/15/2021-10:37:08] [I] [TRT] Input filename:   /root/workspace/onnx/yolov5s-6.0-qat-yolo-op.onnx
[11/15/2021-10:37:08] [I] [TRT] ONNX IR version:  0.0.7
[11/15/2021-10:37:08] [I] [TRT] Opset version:    12
[11/15/2021-10:37:08] [I] [TRT] Producer name:    pytorch
[11/15/2021-10:37:08] [I] [TRT] Producer version: 1.10
[11/15/2021-10:37:08] [I] [TRT] Domain:
[11/15/2021-10:37:08] [I] [TRT] Model version:    0
[11/15/2021-10:37:08] [I] [TRT] Doc string:
[11/15/2021-10:37:08] [I] [TRT] ----------------------------------------------------------------
[11/15/2021-10:37:09] [I] [TRT] No importer registered for op: YoloLayer_TRT. Attempting to import as plugin.
[11/15/2021-10:37:09] [I] [TRT] Searching for plugin: YoloLayer_TRT, plugin_version: 1, plugin_namespace:
[11/15/2021-10:37:09] [I] [TRT] Successfully created plugin: YoloLayer_TRT
[11/15/2021-10:37:09] [I] Finish parsing network model
[11/15/2021-10:37:09] [I] [TRT] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 367, GPU 1085 (MiB)
[11/15/2021-10:37:09] [I] FP32 and INT8 precisions have been specified - more performance might be enabled by additionally specifying --fp16 or --best
[11/15/2021-10:37:09] [I] [TRT] [MemUsageSnapshot] Builder begin: CPU 367 MiB, GPU 1091 MiB
[11/15/2021-10:37:09] [W] [TRT] Calibrator won't be used in explicit precision mode. Use quantization aware training to generate network with Quantize/Dequantize nodes.
[11/15/2021-10:37:11] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +496, GPU +212, now: CPU 891, GPU 1303 (MiB)
[11/15/2021-10:37:11] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +169, GPU +204, now: CPU 1060, GPU 1507 (MiB)
[11/15/2021-10:37:11] [W] [TRT] Detected invalid timing cache, setup a local cache instead
[11/15/2021-10:38:51] [I] [TRT] Detected 1 inputs and 1 output network tensors.
[11/15/2021-10:38:53] [I] [TRT] Total Host Persistent Memory: 131200
[11/15/2021-10:38:53] [I] [TRT] Total Device Persistent Memory: 9769472
[11/15/2021-10:38:53] [I] [TRT] Total Scratch Memory: 0
[11/15/2021-10:38:53] [I] [TRT] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 34 MiB, GPU 4 MiB
[11/15/2021-10:38:53] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 1078, GPU 1529 (MiB)
[11/15/2021-10:38:53] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +10, now: CPU 1078, GPU 1539 (MiB)
[11/15/2021-10:38:53] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 1078, GPU 1523 (MiB)
[11/15/2021-10:38:53] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 1078, GPU 1507 (MiB)
[11/15/2021-10:38:53] [I] [TRT] [MemUsageSnapshot] Builder end: CPU 1050 MiB, GPU 1507 MiB
[11/15/2021-10:38:53] [I] [TRT] Loaded engine size: 18 MB
[11/15/2021-10:38:53] [I] [TRT] [MemUsageSnapshot] deserializeCudaEngine begin: CPU 1054 MiB, GPU 1495 MiB
[11/15/2021-10:38:54] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 1063, GPU 1515 (MiB)
[11/15/2021-10:38:54] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 1063, GPU 1523 (MiB)
[11/15/2021-10:38:54] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 1063, GPU 1507 (MiB)
[11/15/2021-10:38:54] [I] [TRT] [MemUsageSnapshot] deserializeCudaEngine end: CPU 1063 MiB, GPU 1507 MiB
[11/15/2021-10:38:54] [I] Engine built in 105.872 sec.
[11/15/2021-10:38:54] [I] [TRT] [MemUsageSnapshot] ExecutionContext creation begin: CPU 1013 MiB, GPU 1501 MiB
[11/15/2021-10:38:54] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 1013, GPU 1509 (MiB)
[11/15/2021-10:38:54] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 1013, GPU 1517 (MiB)
[11/15/2021-10:38:54] [I] [TRT] [MemUsageSnapshot] ExecutionContext creation end: CPU 1013 MiB, GPU 1541 MiB
[11/15/2021-10:38:54] [I] Created input binding for inputs.1 with dimensions 1x3x640x640
[11/15/2021-10:38:54] [I] Created output binding for output with dimensions 1x6001x1x1
[11/15/2021-10:38:54] [I] Starting inference
[11/15/2021-10:38:57] [I] Warmup completed 61 queries over 200 ms
[11/15/2021-10:38:57] [I] Timing trace has 1685 queries over 3.00539 s
[11/15/2021-10:38:57] [I]
[11/15/2021-10:38:57] [I] === Trace details ===
[11/15/2021-10:38:57] [I] Trace averages of 10 runs:
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 2.46257 ms - Host latency: 2.88812 ms (end to end 4.68096 ms, enqueue 0.908342 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 2.47009 ms - Host latency: 2.89539 ms (end to end 4.70007 ms, enqueue 0.904158 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 2.45307 ms - Host latency: 2.87979 ms (end to end 4.67335 ms, enqueue 0.906158 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 2.46927 ms - Host latency: 2.89522 ms (end to end 4.68937 ms, enqueue 0.922183 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 2.39604 ms - Host latency: 2.81891 ms (end to end 4.27952 ms, enqueue 0.926205 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 2.47107 ms - Host latency: 2.90086 ms (end to end 4.70127 ms, enqueue 0.913327 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 2.45857 ms - Host latency: 2.88517 ms (end to end 4.67025 ms, enqueue 0.924789 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 2.43994 ms - Host latency: 2.86543 ms (end to end 4.44682 ms, enqueue 0.915842 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.75262 ms - Host latency: 2.17934 ms (end to end 3.32601 ms, enqueue 0.896097 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.70791 ms - Host latency: 2.13787 ms (end to end 3.18519 ms, enqueue 0.808716 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.70157 ms - Host latency: 2.12982 ms (end to end 3.16102 ms, enqueue 0.800952 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.72601 ms - Host latency: 2.14966 ms (end to end 3.23792 ms, enqueue 0.754947 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.71359 ms - Host latency: 2.14054 ms (end to end 3.18638 ms, enqueue 0.813907 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.72821 ms - Host latency: 2.15646 ms (end to end 3.21668 ms, enqueue 0.799905 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.71794 ms - Host latency: 2.14929 ms (end to end 3.19819 ms, enqueue 0.804492 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.72185 ms - Host latency: 2.15359 ms (end to end 3.19664 ms, enqueue 0.791705 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.73903 ms - Host latency: 2.17161 ms (end to end 3.2324 ms, enqueue 0.828564 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.7116 ms - Host latency: 2.13887 ms (end to end 3.21057 ms, enqueue 0.761945 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.71721 ms - Host latency: 2.1476 ms (end to end 3.21354 ms, enqueue 0.78523 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.73155 ms - Host latency: 2.16293 ms (end to end 3.25398 ms, enqueue 0.76814 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.72757 ms - Host latency: 2.16324 ms (end to end 3.23115 ms, enqueue 0.80824 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.73378 ms - Host latency: 2.16926 ms (end to end 3.22949 ms, enqueue 0.767999 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.72326 ms - Host latency: 2.16119 ms (end to end 3.22219 ms, enqueue 0.796423 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.70734 ms - Host latency: 2.14218 ms (end to end 3.23209 ms, enqueue 0.808447 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.71291 ms - Host latency: 2.14811 ms (end to end 3.26492 ms, enqueue 0.760291 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.73419 ms - Host latency: 2.17388 ms (end to end 3.25743 ms, enqueue 0.788818 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.74154 ms - Host latency: 2.18034 ms (end to end 3.25358 ms, enqueue 0.785052 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.73814 ms - Host latency: 2.18118 ms (end to end 3.24785 ms, enqueue 0.827289 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.73383 ms - Host latency: 2.17649 ms (end to end 3.22078 ms, enqueue 0.808478 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.73724 ms - Host latency: 2.17607 ms (end to end 3.24731 ms, enqueue 0.791174 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.73226 ms - Host latency: 2.17265 ms (end to end 3.22481 ms, enqueue 0.811847 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.72568 ms - Host latency: 2.16622 ms (end to end 3.1043 ms, enqueue 0.809778 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.72687 ms - Host latency: 2.16203 ms (end to end 3.19713 ms, enqueue 0.794348 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.70698 ms - Host latency: 2.15123 ms (end to end 3.18903 ms, enqueue 0.783514 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.7009 ms - Host latency: 2.13716 ms (end to end 3.18851 ms, enqueue 0.771063 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.71765 ms - Host latency: 2.15759 ms (end to end 3.20422 ms, enqueue 0.764685 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.73049 ms - Host latency: 2.17151 ms (end to end 3.2319 ms, enqueue 0.809314 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.72497 ms - Host latency: 2.16329 ms (end to end 3.22203 ms, enqueue 0.781958 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.74196 ms - Host latency: 2.17874 ms (end to end 3.25093 ms, enqueue 0.782599 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.74083 ms - Host latency: 2.17872 ms (end to end 3.25724 ms, enqueue 0.793689 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.71788 ms - Host latency: 2.15322 ms (end to end 3.2288 ms, enqueue 0.777338 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.7041 ms - Host latency: 2.13721 ms (end to end 3.22368 ms, enqueue 0.776337 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.70598 ms - Host latency: 2.13944 ms (end to end 3.2342 ms, enqueue 0.772162 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.7194 ms - Host latency: 2.15726 ms (end to end 3.21677 ms, enqueue 0.799792 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.71606 ms - Host latency: 2.15634 ms (end to end 3.19388 ms, enqueue 0.781934 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.72277 ms - Host latency: 2.15841 ms (end to end 3.213 ms, enqueue 0.791626 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.71848 ms - Host latency: 2.15723 ms (end to end 3.20485 ms, enqueue 0.809058 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.74012 ms - Host latency: 2.17684 ms (end to end 3.24817 ms, enqueue 0.805762 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.73181 ms - Host latency: 2.17139 ms (end to end 3.14062 ms, enqueue 0.802454 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.72764 ms - Host latency: 2.16294 ms (end to end 3.19863 ms, enqueue 0.776746 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.74413 ms - Host latency: 2.18441 ms (end to end 3.26802 ms, enqueue 0.798413 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.70585 ms - Host latency: 2.14016 ms (end to end 3.03787 ms, enqueue 0.78728 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.7248 ms - Host latency: 2.1608 ms (end to end 3.22479 ms, enqueue 0.783289 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.721 ms - Host latency: 2.1605 ms (end to end 3.20961 ms, enqueue 0.805579 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.71895 ms - Host latency: 2.15479 ms (end to end 3.09076 ms, enqueue 0.762048 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.72096 ms - Host latency: 2.15941 ms (end to end 3.213 ms, enqueue 0.812122 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.73877 ms - Host latency: 2.17378 ms (end to end 3.25221 ms, enqueue 0.725769 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.71537 ms - Host latency: 2.15378 ms (end to end 3.19872 ms, enqueue 0.80658 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.73531 ms - Host latency: 2.17061 ms (end to end 3.22424 ms, enqueue 0.776575 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.78342 ms - Host latency: 2.22501 ms (end to end 3.35361 ms, enqueue 0.787805 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.77834 ms - Host latency: 2.21953 ms (end to end 3.325 ms, enqueue 0.817163 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.78391 ms - Host latency: 2.21887 ms (end to end 3.33635 ms, enqueue 0.739929 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.75836 ms - Host latency: 2.19734 ms (end to end 3.28196 ms, enqueue 0.795691 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.75923 ms - Host latency: 2.19419 ms (end to end 3.30419 ms, enqueue 0.780859 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.74238 ms - Host latency: 2.18334 ms (end to end 3.25237 ms, enqueue 0.780078 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.7377 ms - Host latency: 2.17584 ms (end to end 3.24358 ms, enqueue 0.813245 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.74385 ms - Host latency: 2.18247 ms (end to end 3.25457 ms, enqueue 0.786707 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.74905 ms - Host latency: 2.18354 ms (end to end 3.26614 ms, enqueue 0.762488 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.75084 ms - Host latency: 2.1896 ms (end to end 3.25221 ms, enqueue 0.788098 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.72946 ms - Host latency: 2.1696 ms (end to end 3.23962 ms, enqueue 0.814478 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.73198 ms - Host latency: 2.17209 ms (end to end 3.2285 ms, enqueue 0.78291 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.73817 ms - Host latency: 2.17472 ms (end to end 3.24117 ms, enqueue 0.782507 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.76046 ms - Host latency: 2.19462 ms (end to end 3.30406 ms, enqueue 0.756299 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.74453 ms - Host latency: 2.18051 ms (end to end 3.25906 ms, enqueue 0.795117 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.75411 ms - Host latency: 2.19076 ms (end to end 3.29264 ms, enqueue 0.782178 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.72218 ms - Host latency: 2.15886 ms (end to end 3.22402 ms, enqueue 0.777478 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.71416 ms - Host latency: 2.1495 ms (end to end 3.21195 ms, enqueue 0.774658 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.74603 ms - Host latency: 2.18376 ms (end to end 3.27072 ms, enqueue 0.774426 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.73496 ms - Host latency: 2.17218 ms (end to end 3.23713 ms, enqueue 0.760938 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.73494 ms - Host latency: 2.17576 ms (end to end 3.24827 ms, enqueue 0.796875 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.73944 ms - Host latency: 2.18206 ms (end to end 3.24739 ms, enqueue 0.818616 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.73345 ms - Host latency: 2.17292 ms (end to end 3.22999 ms, enqueue 0.805249 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.7668 ms - Host latency: 2.20939 ms (end to end 3.28459 ms, enqueue 0.816943 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.7912 ms - Host latency: 2.23387 ms (end to end 3.34916 ms, enqueue 0.79812 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.77559 ms - Host latency: 2.21742 ms (end to end 3.32604 ms, enqueue 0.826233 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.77119 ms - Host latency: 2.21051 ms (end to end 3.30624 ms, enqueue 0.791223 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.76114 ms - Host latency: 2.19991 ms (end to end 3.29858 ms, enqueue 0.790479 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.75033 ms - Host latency: 2.18738 ms (end to end 3.25836 ms, enqueue 0.800879 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.70786 ms - Host latency: 2.14562 ms (end to end 3.21227 ms, enqueue 0.776501 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.70946 ms - Host latency: 2.15382 ms (end to end 3.18446 ms, enqueue 0.801501 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.72653 ms - Host latency: 2.16436 ms (end to end 3.22648 ms, enqueue 0.77179 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.74288 ms - Host latency: 2.18427 ms (end to end 3.24706 ms, enqueue 0.795288 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.73682 ms - Host latency: 2.17751 ms (end to end 3.23599 ms, enqueue 0.818542 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.72874 ms - Host latency: 2.16743 ms (end to end 3.22871 ms, enqueue 0.787537 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.71444 ms - Host latency: 2.15433 ms (end to end 3.18534 ms, enqueue 0.813879 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.71544 ms - Host latency: 2.15676 ms (end to end 3.18699 ms, enqueue 0.834766 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.72136 ms - Host latency: 2.16114 ms (end to end 3.18674 ms, enqueue 0.788281 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.74988 ms - Host latency: 2.18982 ms (end to end 3.26364 ms, enqueue 0.790784 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.74419 ms - Host latency: 2.18225 ms (end to end 3.26066 ms, enqueue 0.814709 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.71771 ms - Host latency: 2.15283 ms (end to end 3.2293 ms, enqueue 0.768884 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.72081 ms - Host latency: 2.15642 ms (end to end 3.24896 ms, enqueue 0.777563 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.70541 ms - Host latency: 2.13822 ms (end to end 3.20719 ms, enqueue 0.742102 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.70753 ms - Host latency: 2.14305 ms (end to end 3.22402 ms, enqueue 0.76969 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.73838 ms - Host latency: 2.17627 ms (end to end 3.26338 ms, enqueue 0.785083 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.7406 ms - Host latency: 2.18411 ms (end to end 3.25 ms, enqueue 0.795068 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.74661 ms - Host latency: 2.18625 ms (end to end 3.26609 ms, enqueue 0.776318 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.7292 ms - Host latency: 2.17002 ms (end to end 3.21924 ms, enqueue 0.830469 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.7363 ms - Host latency: 2.17627 ms (end to end 3.23652 ms, enqueue 0.773511 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.71931 ms - Host latency: 2.15735 ms (end to end 3.21343 ms, enqueue 0.79646 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.73037 ms - Host latency: 2.17222 ms (end to end 3.2408 ms, enqueue 0.787695 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.75481 ms - Host latency: 2.1916 ms (end to end 3.28757 ms, enqueue 0.74873 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.75125 ms - Host latency: 2.19097 ms (end to end 3.25464 ms, enqueue 0.811353 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.76099 ms - Host latency: 2.19775 ms (end to end 3.30613 ms, enqueue 0.753589 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.75195 ms - Host latency: 2.1896 ms (end to end 3.28555 ms, enqueue 0.803369 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.75168 ms - Host latency: 2.18955 ms (end to end 3.27085 ms, enqueue 0.778052 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.73992 ms - Host latency: 2.1772 ms (end to end 3.24905 ms, enqueue 0.794873 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.74214 ms - Host latency: 2.18108 ms (end to end 3.25603 ms, enqueue 0.768677 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.76194 ms - Host latency: 2.20027 ms (end to end 3.28638 ms, enqueue 0.798462 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.75933 ms - Host latency: 2.20205 ms (end to end 3.27732 ms, enqueue 0.800757 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.76143 ms - Host latency: 2.19922 ms (end to end 3.2865 ms, enqueue 0.770215 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.75527 ms - Host latency: 2.19399 ms (end to end 3.27375 ms, enqueue 0.791772 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.73728 ms - Host latency: 2.18071 ms (end to end 3.23457 ms, enqueue 0.800806 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.74785 ms - Host latency: 2.18401 ms (end to end 3.26929 ms, enqueue 0.76062 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.73442 ms - Host latency: 2.17095 ms (end to end 3.2355 ms, enqueue 0.797681 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.70359 ms - Host latency: 2.14182 ms (end to end 3.19155 ms, enqueue 0.771143 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.72373 ms - Host latency: 2.16013 ms (end to end 3.22339 ms, enqueue 0.755078 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.75598 ms - Host latency: 2.19402 ms (end to end 3.28696 ms, enqueue 0.796362 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.71399 ms - Host latency: 2.14856 ms (end to end 3.22329 ms, enqueue 0.78064 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.70833 ms - Host latency: 2.14067 ms (end to end 3.05164 ms, enqueue 0.75398 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.75129 ms - Host latency: 2.18652 ms (end to end 3.27588 ms, enqueue 0.743359 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.72737 ms - Host latency: 2.16621 ms (end to end 3.10933 ms, enqueue 0.787988 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.74927 ms - Host latency: 2.18555 ms (end to end 3.29324 ms, enqueue 0.764258 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.72827 ms - Host latency: 2.16946 ms (end to end 3.23789 ms, enqueue 0.812793 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.73057 ms - Host latency: 2.16917 ms (end to end 3.23389 ms, enqueue 0.784375 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.68862 ms - Host latency: 2.12178 ms (end to end 3.19524 ms, enqueue 0.75105 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.72349 ms - Host latency: 2.15669 ms (end to end 3.23311 ms, enqueue 0.770264 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.75339 ms - Host latency: 2.19128 ms (end to end 3.31284 ms, enqueue 0.782617 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.75596 ms - Host latency: 2.19648 ms (end to end 3.28557 ms, enqueue 0.824048 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.76411 ms - Host latency: 2.20298 ms (end to end 3.2925 ms, enqueue 0.788916 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.77209 ms - Host latency: 2.20862 ms (end to end 3.3147 ms, enqueue 0.792188 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.74829 ms - Host latency: 2.18196 ms (end to end 3.18284 ms, enqueue 0.792749 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.76697 ms - Host latency: 2.20659 ms (end to end 3.30898 ms, enqueue 0.791064 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.74839 ms - Host latency: 2.18662 ms (end to end 3.25935 ms, enqueue 0.800342 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.76477 ms - Host latency: 2.20327 ms (end to end 3.299 ms, enqueue 0.768286 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.75398 ms - Host latency: 2.19626 ms (end to end 3.27725 ms, enqueue 0.824731 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.74929 ms - Host latency: 2.1887 ms (end to end 3.2613 ms, enqueue 0.794849 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.7519 ms - Host latency: 2.18904 ms (end to end 3.27183 ms, enqueue 0.768384 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.74189 ms - Host latency: 2.18342 ms (end to end 3.23733 ms, enqueue 0.786353 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.73767 ms - Host latency: 2.17776 ms (end to end 3.25283 ms, enqueue 0.788623 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.74531 ms - Host latency: 2.18618 ms (end to end 3.25496 ms, enqueue 0.79104 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.74377 ms - Host latency: 2.18049 ms (end to end 3.25278 ms, enqueue 0.775903 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.72844 ms - Host latency: 2.16897 ms (end to end 3.22505 ms, enqueue 0.804248 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.73003 ms - Host latency: 2.17207 ms (end to end 3.22673 ms, enqueue 0.798242 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.73367 ms - Host latency: 2.17188 ms (end to end 3.24482 ms, enqueue 0.767603 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.7269 ms - Host latency: 2.16604 ms (end to end 3.22195 ms, enqueue 0.804297 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.73933 ms - Host latency: 2.17673 ms (end to end 3.26089 ms, enqueue 0.775415 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.7394 ms - Host latency: 2.17527 ms (end to end 3.25562 ms, enqueue 0.780859 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.72424 ms - Host latency: 2.15996 ms (end to end 3.24014 ms, enqueue 0.766113 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.7238 ms - Host latency: 2.15823 ms (end to end 3.25098 ms, enqueue 0.774194 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.74309 ms - Host latency: 2.18008 ms (end to end 3.26423 ms, enqueue 0.783545 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.70872 ms - Host latency: 2.1468 ms (end to end 3.20657 ms, enqueue 0.772803 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.6877 ms - Host latency: 2.12146 ms (end to end 3.1748 ms, enqueue 0.771533 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.70217 ms - Host latency: 2.13796 ms (end to end 3.19954 ms, enqueue 0.778247 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.7468 ms - Host latency: 2.18518 ms (end to end 3.27385 ms, enqueue 0.783008 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.74905 ms - Host latency: 2.1905 ms (end to end 3.27144 ms, enqueue 0.78606 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.76587 ms - Host latency: 2.20437 ms (end to end 3.30181 ms, enqueue 0.807739 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.75076 ms - Host latency: 2.18816 ms (end to end 3.26775 ms, enqueue 0.781592 ms)
[11/15/2021-10:38:57] [I] Average on 10 runs - GPU latency: 1.73657 ms - Host latency: 2.17939 ms (end to end 3.23716 ms, enqueue 0.788892 ms)
[11/15/2021-10:38:57] [I]
[11/15/2021-10:38:57] [I] === Performance summary ===
[11/15/2021-10:38:57] [I] Throughput: 560.659 qps
[11/15/2021-10:38:57] [I] Latency: min = 2.0842 ms, max = 2.94159 ms, mean = 2.20648 ms, median = 2.17383 ms, percentile(99%) = 2.90884 ms
[11/15/2021-10:38:57] [I] End-to-End Host Latency: min = 2.12744 ms, max = 4.814 ms, mean = 3.30478 ms, median = 3.24341 ms, percentile(99%) = 4.7103 ms
[11/15/2021-10:38:57] [I] Enqueue Time: min = 0.670349 ms, max = 1.08557 ms, mean = 0.79436 ms, median = 0.784424 ms, percentile(99%) = 0.959076 ms
[11/15/2021-10:38:57] [I] H2D Latency: min = 0.407349 ms, max = 0.458923 ms, mean = 0.429892 ms, median = 0.428467 ms, percentile(99%) = 0.446045 ms
[11/15/2021-10:38:57] [I] GPU Compute Time: min = 1.65759 ms, max = 2.50467 ms, mean = 1.76951 ms, median = 1.73621 ms, percentile(99%) = 2.47987 ms
[11/15/2021-10:38:57] [I] D2H Latency: min = 0.00439453 ms, max = 0.0147705 ms, mean = 0.0070766 ms, median = 0.00689697 ms, percentile(99%) = 0.0098877 ms
[11/15/2021-10:38:57] [I] Total Host Walltime: 3.00539 s
[11/15/2021-10:38:57] [I] Total GPU Compute Time: 2.98163 s
[11/15/2021-10:38:57] [I] Explanations of the performance metrics are printed in the verbose logs.
[11/15/2021-10:38:57] [I]
&&&& PASSED TensorRT.trtexec [TensorRT v8003] # ./trtexec --onnx=/root/workspace/onnx/yolov5s-6.0-qat-yolo-op.onnx --workspace=10240 --int8 --saveEngine=/root/yolov5s-6.0-qat-int8.engine --plugins=/root/workspace/plugins/YoloLayer_TRT_v6.0/build/libyolo.so
[11/15/2021-10:38:57] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 1013, GPU 1515 (MiB)

My problem is that inference with this engine runs without errors, but the output is empty. When I build the engine from the pretrained model without QAT fine-tuning, it works and the inference results are correct.
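
To narrow down what "empty" means here, it may help to look at the raw output buffer before any post-processing. The sketch below is only an illustration: `summarize_output`, the flat list of floats, and the 0.25 threshold are hypothetical and not taken from my inference script or from the YoloLayer plugin. It separates three cases: the buffer is all zeros, the buffer contains NaN/Inf, or the values are valid but all fall below the confidence threshold.

```python
import math


def summarize_output(values, conf_threshold=0.25):
    """Summarize a flat list of raw floats copied back from the engine.

    Assumption: `values` holds score-like numbers; the real layout of the
    plugin output may differ and should be checked against the plugin code.
    """
    finite = [v for v in values if math.isfinite(v)]
    return {
        "count": len(values),
        "non_finite": len(values) - len(finite),        # NaN or Inf entries
        "nonzero": sum(1 for v in finite if v != 0.0),  # 0 means an all-zero buffer
        "max": max(finite) if finite else None,
        "above_threshold": sum(1 for v in finite if v > conf_threshold),
    }


# Hypothetical buffer, for illustration only.
raw = [0.0, 0.02, 0.01, 0.0]
print(summarize_output(raw))
```

If `nonzero` is 0 or `non_finite` is large, the problem is probably in the engine or the Q/DQ scales; if the values are finite but `above_threshold` is 0, the scores are just too low after fine-tuning.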

Hi, could you provide a reference or a repo on how you used pytorch-quantization for QAT with YOLOv5?