Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU) - Jetson Xavier NX
• DeepStream Version - 6.3
• JetPack Version (valid for Jetson only) - 5.1.2
• TensorRT Version - 8.5.2
• NVIDIA GPU Driver Version (valid for GPU only)
• Issue Type (questions, new requirements, bugs) - Question
• How to reproduce the issue? (This is for bugs. Include which sample app is used, the configuration file content, the command line used, and other details for reproducing.)
• Requirement details (This is for new requirements. Include the module name, i.e. for which plugin or sample application, and the function description.) - deepstream-test5-app
Hi,
1- I followed this repo: https://github.com/WongKinYiu/yolov7 (YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors) and trained a custom yolov7-tiny model.
- Train command: python train.py --workers 8 --device 0 --batch-size 16 --data data/custom.yaml --img 640 480 --cfg cfg/training/yolov7-tiny.yaml --weights 'yolov7-tiny.pt' --name yolov7-tiny-custom --hyp data/hyp.scratch.tiny.yaml
2- I reparameterized the custom model as recommended by the repository.
3- I followed this repo: https://github.com/NVIDIA-AI-IOT/yolo_deepstream/tree/main/yolov7_qat
I converted the QAT ONNX model to an INT8 TensorRT engine with the following commands:
- sudo /usr/src/tensorrt/bin/trtexec --onnx=qat.onnx --int8 --saveEngine=yolov7_tiny_qat_3.engine --workspace=1024000
- sudo /usr/src/tensorrt/bin/trtexec --onnx=qat.onnx --int8 --fp16 --saveEngine=yolov7_tiny_qat_2.engine --workspace=1024000 --minShapes=images:4x3x640x640 --optShapes=images:4x3x640x640 --maxShapes=images:4x3x640x640
- sudo /usr/src/tensorrt/bin/trtexec --onnx=qat.onnx --int8 --saveEngine=yolov7_tiny_qat.engine --workspace=1024000 --minShapes=images:4x3x640x640 --optShapes=images:4x3x640x640 --maxShapes=images:4x3x640x640
The engine file was created successfully every time.
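(A note on the shape flags: with --minShapes, --optShapes, and --maxShapes all pinned to images:4x3x640x640, the second and third engines accept a batch of exactly 4 and nothing else. For comparison, here is a sketch of a build that spans a range of batch sizes instead; the 1/8/16 values and the engine name are placeholders, not something from my setup:)
- sudo /usr/src/tensorrt/bin/trtexec --onnx=qat.onnx --int8 --saveEngine=yolov7_tiny_qat_dyn.engine --workspace=1024000 --minShapes=images:1x3x640x640 --optShapes=images:8x3x640x640 --maxShapes=images:16x3x640x640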
When I check the engine with: sudo /usr/src/tensorrt/bin/trtexec --loadEngine=/opt/nvidia/deepstream/deepstream-6.3/sources/objectDetector_Yolo/yolov7/yolov7_tiny_qat.engine --plugins=./libmyplugins.so I get this:
[07/05/2024-14:58:30] [I] === Performance summary ===
[07/05/2024-14:58:30] [I] Throughput: 34.7856 qps
[07/05/2024-14:58:30] [I] Latency: min = 24.8691 ms, max = 42.3534 ms, mean = 28.7271 ms, median = 25.0518 ms, percentile(90%) = 39.1527 ms, percentile(95%) = 39.1918 ms, percentile(99%) = 39.2491 ms
[07/05/2024-14:58:30] [I] Enqueue Time: min = 3.00171 ms, max = 5.73328 ms, mean = 3.9051 ms, median = 3.82019 ms, percentile(90%) = 4.84863 ms, percentile(95%) = 5.13629 ms, percentile(99%) = 5.36453 ms
[07/05/2024-14:58:30] [I] H2D Latency: min = 0.703857 ms, max = 1.16614 ms, mean = 0.820026 ms, median = 0.720337 ms, percentile(90%) = 1.16443 ms, percentile(95%) = 1.16507 ms, percentile(99%) = 1.16565 ms
[07/05/2024-14:58:30] [I] GPU Compute Time: min = 24.0393 ms, max = 40.9948 ms, mean = 27.7691 ms, median = 24.2098 ms, percentile(90%) = 37.7932 ms, percentile(95%) = 37.8299 ms, percentile(99%) = 37.8911 ms
[07/05/2024-14:58:30] [I] D2H Latency: min = 0.11731 ms, max = 0.19989 ms, mean = 0.137928 ms, median = 0.123291 ms, percentile(90%) = 0.195007 ms, percentile(95%) = 0.196533 ms, percentile(99%) = 0.198792 ms
[07/05/2024-14:58:30] [I] Total Host Walltime: 3.07598 s
[07/05/2024-14:58:30] [I] Total GPU Compute Time: 2.97129 s
[07/05/2024-14:58:30] [W] * GPU compute time is unstable, with coefficient of variance = 19.9549%.
[07/05/2024-14:58:30] [W] If not already in use, locking GPU clock frequency or adding --useSpinWait may improve the stability.
[07/05/2024-14:58:30] [I] Explanations of the performance metrics are printed in the verbose logs.
[07/05/2024-14:58:30] [I]
&&&& PASSED TensorRT.trtexec [TensorRT v8502] # /usr/src/tensorrt/bin/trtexec --loadEngine=/opt/nvidia/deepstream/deepstream-6.3/sources/objectDetector_Yolo/yolov7/yolov7_tiny_qat.engine --plugins=./libmyplugins.so
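(Regarding the "GPU compute time is unstable" warning above: on Jetson the clocks can be pinned before benchmarking. A sketch of the usual commands; the correct nvpmodel mode ID depends on the board and power budget, so <mode-id> is a placeholder:)
- sudo nvpmodel -q # show the current power mode
- sudo nvpmodel -m <mode-id> # switch to the max-performance mode for the board
- sudo jetson_clocks # lock clocks to the maximum for that mode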
4- When I test the generated engine file with the deepstream-test5-app application, I get the following error:
Unknown or legacy key specified 'is-classifier' for group [property]
Unknown or legacy key specified 'disable-output-host-copy' for group [property]
gstnvtracker: Loading low-level lib at /opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so
[NvMultiObjectTracker] Initialized
0:00:08.013987301 8575 0xaaaafff84580 INFO nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1988> [UID = 1]: deserialized trt engine from :/opt/nvidia/deepstream/deepstream-6.3/sources/objectDetector_Yolo/yolov7/yolov7_tiny_qat_3.engine
WARNING: [TRT]: The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
INFO: [Implicit Engine Info]: layers num: 2
0 INPUT kFLOAT images 3x640x640
1 OUTPUT kFLOAT outputs 25200x7
0:00:08.093343861 8575 0xaaaafff84580 WARN nvinfer gstnvinfer.cpp:679:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::checkBackendParams() <nvdsinfer_context_impl.cpp:1920> [UID = 1]: Backend has maxBatchSize 1 whereas 16 has been requested
0:00:08.093459190 8575 0xaaaafff84580 WARN nvinfer gstnvinfer.cpp:679:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2097> [UID = 1]: deserialized backend context :/opt/nvidia/deepstream/deepstream-6.3/sources/objectDetector_Yolo/yolov7/yolov7_tiny_qat_3.engine failed to match config params, trying rebuild
0:00:08.122447330 8575 0xaaaafff84580 INFO nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2002> [UID = 1]: Trying to create engine from model files
ERROR: failed to build network since there is no model file matched.
ERROR: failed to build network.
0:00:09.446987646 8575 0xaaaafff84580 ERROR nvinfer gstnvinfer.cpp:676:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2022> [UID = 1]: build engine file failed
0:00:09.520248994 8575 0xaaaafff84580 ERROR nvinfer gstnvinfer.cpp:676:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2108> [UID = 1]: build backend context failed
0:00:09.520404612 8575 0xaaaafff84580 ERROR nvinfer gstnvinfer.cpp:676:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::initialize() <nvdsinfer_context_impl.cpp:1282> [UID = 1]: generate backend failed, check config file settings
0:00:09.520527044 8575 0xaaaafff84580 WARN nvinfer gstnvinfer.cpp:898:gst_nvinfer_start:<primary_gie> error: Failed to create NvDsInferContext instance
0:00:09.520577637 8575 0xaaaafff84580 WARN nvinfer gstnvinfer.cpp:898:gst_nvinfer_start:<primary_gie> error: Config file path: /opt/nvidia/deepstream/deepstream-6.3/sources/objectDetector_Yolo/yolov7/config_infer_primary_yoloV7_tiny.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED
[NvMultiObjectTracker] De-initialized
** ERROR: main:1534: Failed to set pipeline to PAUSED
Quitting
ERROR from primary_gie: Failed to create NvDsInferContext instance
Debug info: /dvs/git/dirty/git-master_linux/deepstream/sdk/src/gst-plugins/gst-nvinfer/gstnvinfer.cpp(898): gst_nvinfer_start (): /GstPipeline:pipeline/GstBin:primary_gie_bin/GstNvInfer:primary_gie:
Config file path: /opt/nvidia/deepstream/deepstream-6.3/sources/objectDetector_Yolo/yolov7/config_infer_primary_yoloV7_tiny.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED
App run failed
(For reference, the .pt → .onnx → FP16 flow, where DeepStream builds the engine itself, works without any problems.)
config_infer_primary_yoloV7_tiny.txt (4.1 KB)
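(For context, a minimal sketch of the [property] keys relevant when pointing nvinfer at a pre-built engine; this is not the attached file, and the path and values are placeholders:)
[property]
gpu-id=0
# absolute path to the serialized TensorRT engine
model-engine-file=/opt/nvidia/deepstream/deepstream-6.3/sources/objectDetector_Yolo/yolov7/yolov7_tiny_qat_3.engine
# network-mode: 0=FP32, 1=INT8, 2=FP16
network-mode=1
# must match a batch size the engine actually supports, otherwise nvinfer tries to rebuild
batch-size=1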
I can't understand why the INT8 engine doesn't work.
Thanks for any help.