INT8 Calibration with DS 6.3 worse than with DS 6.0

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) Jetson Orin NX (8GB)

• DeepStream Version 6.3

• JetPack Version (valid for Jetson only) 5.1.2

• TensorRT Version JP 5.1.2 standard (8.5?)

• NVIDIA GPU Driver Version (valid for GPU only) JP 5.1.2 standard

• Issue Type( questions, new requirements, bugs) Question

• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)

Trying to create an INT8-optimized TRT engine from my YOLOv4 ONNX model for use with DeepStream-Yolo / nvinfer. I am using the same model file and the same calibration images as with DeepStream 6.0.0. I am running the INT8 calibration inside deepstream-app as described here: DeepStream-Yolo/docs/YOLOv5.md at master · marcoslucianops/DeepStream-Yolo · GitHub (without the Ultralytics part at the beginning).
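Roughly, my calibration image list is prepared like this. This is only a sketch: I am assuming the DeepStream-Yolo convention of a plain text file with one image path per line, and the directory name is a placeholder.

import glob
import random

# Collect calibration images (placeholder directory) and write one path per line,
# which is the list the calibration step then reads.
images = sorted(glob.glob("calibration_images/*.jpg"))
random.seed(0)
random.shuffle(images)

with open("calibration.txt", "w") as f:
    f.write("\n".join(images[:1000]))  # a representative subset is usually enough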

With DS 6.0 on a Jetson NX (not Orin), this resulted in a model with 94% recall over my test dataset. With DS 6.3 and the Orin NX I only get 87% (tested many, many calibration runs with different parameters).

• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)

config_nvinfer_primary.txt

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
model-color-format=0
custom-network-config=yolov4.cfg
model-file=yolov4.weights
model-engine-file=model_b1_gpu0_int8.engine
int8-calib-file=calib.table
labelfile-path=dc_vehicles.training.names
batch-size=1
network-mode=1
num-detected-classes=14
interval=0
gie-unique-id=1
process-mode=1
network-type=0
cluster-mode=2
maintain-aspect-ratio=0
symmetric-padding=1
force-implicit-batch-dim=0
workspace-size=6000
#parse-bbox-func-name=NvDsInferParseYolo
parse-bbox-func-name=NvDsInferParseYoloCuda
custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
engine-create-func-name=NvDsInferYoloCudaEngineGet
# DLA does not make sense for us with JP 5.1.2
# because our model requires a lot of GPU fallbacks, reducing speed by 40%.
#enable-dla=1
#use-dla-core=0
#gpu-fallback=1

[class-attrs-all]
nms-iou-threshold=0.3
pre-cluster-threshold=0.3
topk=300

Is this something others also experience? This reduction in detection rate makes it a problem for us to upgrade to JetPack 5.

DeepStream 6.0.1 is based on TensorRT 8.0.1; DeepStream 6.3 is based on TensorRT 8.5.2.2.
A calibration file generated with TensorRT 8.0.1 cannot be used with TensorRT 8.5.2.

Please make sure your calibration file is generated with the same TensorRT version you are using. For how to generate a calibration file, you may refer to NVIDIA-AI-IOT/yolo_deepstream: yolo model qat and deploy with deepstream&tensorrt.
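For reference, the cache is produced by an INT8 calibrator while the engine is built. Below is a minimal sketch with the TensorRT Python API; it is not the exact code from either repo, and the batch source, preprocessing and file names are placeholders. The file it writes is what nvinfer consumes as int8-calib-file.

import numpy as np
import pycuda.autoinit  # creates a CUDA context
import pycuda.driver as cuda
import tensorrt as trt

class YoloCalibrator(trt.IInt8EntropyCalibrator2):
    # Feeds preprocessed calibration batches to TensorRT and caches the result.
    def __init__(self, image_batches, cache_file="calib.table"):
        super().__init__()
        self.batches = iter(image_batches)      # iterable of NCHW float32 arrays
        self.cache_file = cache_file
        self.batch_size = image_batches[0].shape[0]
        self.device_input = cuda.mem_alloc(image_batches[0].nbytes)

    def get_batch_size(self):
        return self.batch_size

    def get_batch(self, names):
        try:
            batch = np.ascontiguousarray(next(self.batches), dtype=np.float32)
        except StopIteration:
            return None                          # no more data: calibration is done
        cuda.memcpy_htod(self.device_input, batch)
        return [int(self.device_input)]

    def read_calibration_cache(self):
        try:
            with open(self.cache_file, "rb") as f:
                return f.read()                  # reuse an existing cache if present
        except FileNotFoundError:
            return None

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)                       # this file becomes int8-calib-file

The important point is that this cache is tied to the TensorRT version that produced it, which is why the DS 6.0 file cannot simply be carried over.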

This is exactly what I am doing: I am using my model to generate a new INT8 calibration for the new TensorRT version on the new device. However, I am using GitHub - marcoslucianops/DeepStream-Yolo: NVIDIA DeepStream SDK 7.0 / 6.4 / 6.3 / 6.2 / 6.1.1 / 6.1 / 6.0.1 / 6.0 / 5.1 implementation for YOLO models. Still, the resulting model is not as good as the INT8-calibrated model on the old version.

Can you try the QAT method in yolo_deepstream/yolov7_qat at main · NVIDIA-AI-IOT/yolo_deepstream?

I tried GitHub - NVIDIA-AI-IOT/yolo_deepstream: yolo model qat and deploy with deepstream&tensorrt, but I get an error when trying to use the ONNX version of my model to generate a TRT engine:
WARNING: [TRT]: onnx2trt_utils.cpp:375: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
WARNING: [TRT]: onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
WARNING: [TRT]: onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
WARNING: [TRT]: onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
WARNING: [TRT]: onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
WARNING: [TRT]: onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
WARNING: [TRT]: onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
WARNING: INT8 calibration file not specified. Trying FP16 mode.

The yolo_deepstream/yolov7_qat at main · NVIDIA-AI-IOT/yolo_deepstream · GitHub works with YOLOv7 .pt files. I am using YOLOv5/darknet. Is it possible to use .cfg and .weights files with this, too?

Which TensorRT version are you using? And please show me the convert command.

Please use this repo as your code base: GitHub - ultralytics/yolov5: YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite, which is based on PyTorch. Darknet is not friendly for QAT training.

TensorRT version is 8.5.2.2-1+cuda11.4 (JP 5.1.2).
I tried the instructions given in the DeepStream-Yolo repository (there it works, but the result is worse than with JP 4.6.2/NX before) and the repo at GitHub - NVIDIA-AI-IOT/yolo_deepstream: yolo model qat and deploy with deepstream&tensorrt. With the latter I could not do an INT8 conversion, as described in the earlier post; it falls back to FP16 instead. I will now try PyTorch and GitHub - ultralytics/yolov5: YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite. As far as I understand this, the engine needs to be prepared on the device it is used on, right? This would be a Jetson Orin NX in my case.

So I converted my YOLOv4 model to .pt format and tried loading and exporting it like this:

from ultralytics import YOLO

# Load the converted YOLOv4 PyTorch model
model = YOLO("../yolov4.pt")

# Export the model to TensorRT
model.export(format="engine")  # creates 'yolov4.engine'

# Load the exported TensorRT model
trt_model = YOLO("yolov4.engine")

# Run inference
results = trt_model("https://ultralytics.com/images/bus.jpg")

It runs for a moment, then bails out with:

Traceback (most recent call last):
  File "convert.py", line 4, in <module>
    model = YOLO("../yolov4.pt")
  File "/home/nvidia/.local/lib/python3.8/site-packages/ultralytics/models/yolo/model.py", line 23, in __init__
    super().__init__(model=model, task=task, verbose=verbose)
  File "/home/nvidia/.local/lib/python3.8/site-packages/ultralytics/engine/model.py", line 145, in __init__
    self._load(model, task=task)
  File "/home/nvidia/.local/lib/python3.8/site-packages/ultralytics/engine/model.py", line 285, in _load
    self.model, self.ckpt = attempt_load_one_weight(weights)
  File "/home/nvidia/.local/lib/python3.8/site-packages/ultralytics/nn/tasks.py", line 912, in attempt_load_one_weight
    model = (ckpt.get("ema") or ckpt["model"]).to(device).float()  # FP32 model
KeyError: 'model'

Hi, as far as I know, you cannot directly load your yolov4.pt via ultralytics.YOLO, because the official YOLOv4 is trained via darknet and ultralytics.YOLO does not support it. You can find some help on the GitHub - ultralytics/yolov5: YOLOv5 issues page.

Back to your original question: you are seeing an accuracy regression with your YOLOv4 ONNX model when deploying it to DeepStream, between DS 6.0 and DS 6.3.
My questions are:

  1. Did you get the ONNX model?
  2. Can you show the command you used to get model_b1_gpu0_int8.engine? (Did you get it via trtexec?)

As I said in my original post, I was using this repo: DeepStream-Yolo/docs/YOLOv5.md at master · marcoslucianops/DeepStream-Yolo · GitHub

We have been using this plugin in DS 6.0 on Jetson NX and now we use the same in DS 6.3 on Jetson Orin NX.

It contains a wrapper for the nvinfer plugin from TensorRT. When you start the pipeline with deepstream-app and there is no INT8 engine yet, it will create one. It has the option to load an ONNX model or to load a YOLOv4 darknet model directly.

When it creates the INT8 model, it uses our validation images from training to do the INT8 calibration.

We do exactly the same steps in DS 6.0 and DS 6.3. But in DS 6.0 on Jetson NX and JP 4.6.2, if I start it about 3 times and use the best engine, I get a much better result than on DS 6.3 on the Orin and JP 5.1.2.

I used Ultralytics only because you told me to. For that, I first converted the model from darknet to .pt and then tried to load it with Ultralytics to do the conversion, but this did not work.

Sorry, I am not familiar with DeepStream-Yolo/docs/YOLOv5.md at master · marcoslucianops/DeepStream-Yolo · GitHub, as it is not an NVIDIA-owned repo. So, can you get the ONNX and the calibration file? If you can, I can show you a TensorRT command to help you get almost the same accuracy across multiple DS versions.

ONNX export was done on Colab with:


from ultralytics import YOLO

# Load the trained model weights
model = YOLO('/content/runs/detect/train/weights/best.pt')

# Export the model to ONNX format with INT8 quantization
#exported_model_path = model.export(format='onnx', dynamic=True, simplify=True, opset=17, int8=True, data='/content/data.yaml')
exported_model_path = model.export(
    format='onnx',
    imgsz=(640, 640),  # Set your desired input size here, e.g., (640, 640)
    simplify=True,
    opset=17,
    data='/content/data.yaml',
    half=True
)

INT8 calibration was done by starting the DS pipeline from DeepStream-Yolo with the calibration config above. This uses TensorRT to perform the INT8 calibration.
The output showed that all images were used. It produces a calibration table and the INT8 engine.

You can use:

trtexec --onnx=yolov4.onnx --int8 --fp16 --calib=calibrationfilegotfromds60.calib --saveEngine=yolov4.engine

Then use the engine-file to start DeepStream. The same calibration file and the same ONNX should give the same accuracy.
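If trtexec is awkward to script, a rough Python-API equivalent would look like the sketch below. This assumes TensorRT 8.5's Python bindings; the file names are placeholders, and the calibrator only replays the existing cache instead of running a new calibration.

import tensorrt as trt

class CacheOnlyCalibrator(trt.IInt8EntropyCalibrator2):
    # Replays an existing calibration cache instead of running a new calibration.
    def __init__(self, cache_path):
        super().__init__()
        self.cache_path = cache_path

    def get_batch_size(self):
        return 1

    def get_batch(self, names):
        return None  # no calibration data: force TensorRT to use the cache

    def read_calibration_cache(self):
        with open(self.cache_path, "rb") as f:
            return f.read()

    def write_calibration_cache(self, cache):
        pass

logger = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(logger)
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open("yolov4.onnx", "rb") as f:
    if not parser.parse(f.read()):
        raise RuntimeError(str(parser.get_error(0)))

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.INT8)
config.set_flag(trt.BuilderFlag.FP16)  # mirrors the --int8 --fp16 combination
config.int8_calibrator = CacheOnlyCalibrator("calibrationfilegotfromds60.calib")

# Serialized engine, referenced as model-engine-file in the nvinfer config
serialized = builder.build_serialized_network(network, config)
with open("yolov4.engine", "wb") as f:
    f.write(serialized)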

Where would I get this from?

Ah, you mean the calib.table generated by the DS 6.0 calibration?
Interesting idea.

Seems like the calibration table from DS 6.0 cannot be used in DS 6.3:

Building the TensorRT Engine

ERROR: [TRT]: 4: [standardEngineBuilder.cpp::initCalibrationParams::1460] Error Code 4: Internal Error (Calibration failure occurred with no scaling factors detected. This could be due to no int8 calibrator or insufficient custom scales for network layers. Please see int8 sample to setup calibration correctly.)
Building engine failed

Can you upload your ONNX model and calibration file which reproduce the error?
I will check on my side.

There has been no update from you for a while, so we assume this is no longer an issue. Hence we are closing this topic. If you need further support, please open a new one. Thanks.