Converting a YOLOv4 model to an engine file in DeepStream

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) Orin
• DeepStream Version 7.0
• JetPack Version (valid for Jetson only) 6.0+b87
• TensorRT Version 8.6.2.3-1+cuda12.2
• NVIDIA GPU Driver Version (valid for GPU only)
The YOLOv4 model was trained with the TAO toolkit. The .hdf5 file is exported to an INT8 ONNX file using the following command.

yolo_v4 export -m /workspace/tao_tutorials/data/experiment_dir_unpruned/resnet18/weights/yolov4_resnet18_epoch_080.hdf5 -k nvidia_tlt -e /workspace/tao_tutorials/notebooks/tao_launcher_starter_kit/yolo_v4/specs/yolo_v4_train_resnet18_kitti.txt --batch_size 3 --data_type int8 --cal_image_dir /workspace/tao_tutorials/data/images --batches 10 --cal_cache_file /workspace/tao_tutorials/data/experiment_dir_unpruned/resnet18/weights/yolov4_resnet18.cal --cal_data_file /workspace/tao_tutorials/data/experiment_dir_unpruned/resnet18/weights/yolov4_resnet18.tensorfile
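
For reference, my understanding of how the calibration flags above map to output files (a sketch of a quick directory check; file names are taken from the command above):

ls -lh /workspace/tao_tutorials/data/experiment_dir_unpruned/resnet18/weights/
# yolov4_resnet18_epoch_080_int8.onnx   <- exported model
# yolov4_resnet18.tensorfile            <- raw calibration tensors (--cal_data_file)
# yolov4_resnet18.cal                   <- calibration cache (--cal_cache_file); the file DeepStream's int8-calib-file expects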

I have yolov4_resnet18.tensorfile and yolov4_resnet18_epoch_080_int8.onnx.
To convert the ONNX file to an engine in DeepStream, the following config is used.

[property]
gpu-id=0
net-scale-factor=1.0
offsets=103.939;116.779;123.68
model-color-format=1
onnx-file=../../models/yolo4_resnet18/yolov4_resnet18_epoch_080_int8.onnx
#int8-calib-file=../../models/yolo4_resnet18/yolov4_resnet18.tensorfile
labelfile-path=../../models/yolo4_resnet18/labels.txt
model-engine-file=../../models/yolo4_resnet18/yolov4_resnet18_epoch_080_int8.onnx_b3_gpu0_int8.engine
infer-dims=3;384;1248
maintain-aspect-ratio=1
uff-input-order=0
uff-input-blob-name=Input
batch-size=3
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=1
num-detected-classes=3
interval=0
gie-unique-id=1
is-classifier=0
#network-type=0
#no cluster
cluster-mode=3
output-blob-names=BatchedNMS
parse-bbox-func-name=NvDsInferParseCustomBatchedNMSTLT
custom-lib-path=/opt/nvidia/deepstream/deepstream-7.0/sources/apps/sample_apps/deepstream_tlt_apps/post_processor/libnvds_infercustomparser_tlt.so

I suspect that the generated engine file is not INT8, because the file name contains fp16: yolov4_resnet18_epoch_080_int8.onnx_b3_gpu0_fp16.engine.
I know that INT8 engine conversion needs a calibration file.
But if int8-calib-file=../../models/yolo4_resnet18/yolov4_resnet18.tensorfile is uncommented in the config file, the app crashes.
What is the correct way to convert a YOLOv4 model to an INT8 engine?

Could you attach the log from when DeepStream generated the engine file?
You can also attach your ONNX and cal file. We can try it on our side.
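
The engine-build messages are printed to the console while deepstream-app runs; a sketch of how to capture a more verbose log, assuming the standard GStreamer debug mechanism (the config path is a placeholder):

GST_DEBUG=nvinfer:5 ./deepstream-app -c <your_config.txt> 2>&1 | tee engine_build.log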

The model was sent via URL in a private message. Please let me know if you can't download it.
I don't have a cal file; the export produced only the tensorfile.
Where do I find the log file for the engine file generation?

The messages that came out during the conversion are as follows.

atic@ubuntu:/opt/nvidia/deepstream/deepstream-7.0/sources/apps/sample_apps/st_andrew$ ./deepstream-app -c ../../../../samples/configs/deepstream-app/standrew_main.txt
gstnvtracker: Loading low-level lib at /opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so
[NvMultiObjectTracker] Initialized
Setting min object dimensions as 16x16 instead of 1x1 to support VIC compute mode.
WARNING: Deserialize engine failed because file path: /opt/nvidia/deepstream/deepstream-7.0/samples/configs/deepstream-app/../../models/yolo4_resnet18/yolov4_resnet18_epoch_080_int8.onnx_b3_gpu0_int8.engine open error
0:00:07.992059145  4904 0xaaab00757760 WARN                 nvinfer gstnvinfer.cpp:679:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:2083> [UID = 1]: deserialize engine from file :/opt/nvidia/deepstream/deepstream-7.0/samples/configs/deepstream-app/../../models/yolo4_resnet18/yolov4_resnet18_epoch_080_int8.onnx_b3_gpu0_int8.engine failed
0:00:08.364638463  4904 0xaaab00757760 WARN                 nvinfer gstnvinfer.cpp:679:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2188> [UID = 1]: deserialize backend context from engine from file :/opt/nvidia/deepstream/deepstream-7.0/samples/configs/deepstream-app/../../models/yolo4_resnet18/yolov4_resnet18_epoch_080_int8.onnx_b3_gpu0_int8.engine failed, try rebuild
0:00:08.370567363  4904 0xaaab00757760 INFO                 nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2109> [UID = 1]: Trying to create engine from model files
WARNING: [TRT]: onnx2trt_utils.cpp:372: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
WARNING: [TRT]: onnx2trt_utils.cpp:400: One or more weights outside the range of INT32 was clamped
WARNING: [TRT]: builtin_op_importers.cpp:5219: Attribute caffeSemantics not found in plugin node! Ensure that the plugin creator has a default value defined or the engine may fail to build.
WARNING: INT8 calibration file not specified. Trying FP16 mode.
WARNING: [TRT]: DLA requests all profiles have same min, max, and opt value. All dla layers are falling back to GPU
WARNING: [TRT]: TensorRT encountered issues when converting weights between types and that could affect accuracy.
WARNING: [TRT]: If this is not the desired behavior, please modify the weights or retrain with regularization to adjust the magnitude of the weights.
WARNING: [TRT]: Check verbose logs for the list of affected weights.
WARNING: [TRT]: - 66 weights are affected by this issue: Detected subnormal FP16 values.
WARNING: [TRT]: - 33 weights are affected by this issue: Detected values less than smallest positive FP16 subnormal value and converted them to the FP16 minimum subnormalized value.
0:24:32.996011443  4904 0xaaab00757760 INFO                 nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2141> [UID = 1]: serialize cuda engine to file: /opt/nvidia/deepstream/deepstream-7.0/samples/models/yolo4_resnet18/yolov4_resnet18_epoch_080_int8.onnx_b3_gpu0_fp16.engine successfully
INFO: [FullDims Engine Info]: layers num: 5
0   INPUT  kFLOAT Input           3x384x1248      min: 1x3x384x1248    opt: 3x3x384x1248    Max: 3x3x384x1248    
1   OUTPUT kINT32 BatchedNMS      1               min: 0               opt: 0               Max: 0               
2   OUTPUT kFLOAT BatchedNMS_1    200x4           min: 0               opt: 0               Max: 0               
3   OUTPUT kFLOAT BatchedNMS_2    200             min: 0               opt: 0               Max: 0               
4   OUTPUT kFLOAT BatchedNMS_3    200             min: 0               opt: 0               Max: 0               

0:24:33.524291723  4904 0xaaab00757760 INFO                 nvinfer gstnvinfer_impl.cpp:343:notifyLoadModelStatus:<primary_gie> [UID 1]: Load new model:/opt/nvidia/deepstream/deepstream-7.0/samples/configs/deepstream-app/config_infer_standrew.txt sucessfully

Runtime commands:
	h: Print this help
	q: Quit

	p: Pause
	r: Resume

NOTE: To expand a source in the 2D tiled display and view object details, left-click on the source.
      To go back to the tiled display, right-click anywhere on the window.


**PERF:  FPS 0 (Avg)	FPS 1 (Avg)	FPS 2 (Avg)	
**PERF:  0.00 (0.00)	0.00 (0.00)	0.00 (0.00)	
** INFO: <bus_callback:291>: Pipeline ready

Opening in BLOCKING MODE 
NvMMLiteOpen : Block : BlockType = 279 
NvMMLiteBlockCreate : Block : BlockType = 279 
Opening in BLOCKING MODE 
Opening in BLOCKING MODE 
NvMMLiteOpen : Block : BlockType = 279 
NvMMLiteOpen : Block : BlockType = 279 
NvMMLiteBlockCreate : Block : BlockType = 279 
NvMMLiteBlockCreate : Block : BlockType = 279 
** INFO: <bus_callback:277>: Pipeline running

If you don’t have a cal file, you cannot use INT8 mode. As you can see from your log, when the cal file is not set, we fall back to FP16 mode. But when a wrong cal file is set, it may cause issues.

WARNING: INT8 calibration file not specified. Trying FP16 mode.
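
Once a real calibration cache exists, the precision-related part of the nvinfer config would look roughly like this (a sketch; yolov4_resnet18.cal is a hypothetical file name here — the .tensorfile holds raw calibration data, not a calibration cache, so it cannot be used for this key):

int8-calib-file=../../models/yolo4_resnet18/yolov4_resnet18.cal
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=1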

So how can I run the export so that it produces a cal file?
You can see the command for producing the INT8 ONNX file with TAO in my earlier post. That command does not produce a cal file.

We suggest that you file this topic on the TAO forum, where you can get a more authoritative answer about how to generate a cal file. Once the cal file can be generated normally, you can discuss the DeepStream-related issues here.
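
Once you do have a .cal file, one way to sanity-check it outside of DeepStream is to build the engine directly with trtexec (a sketch; the paths and input shape are taken from your config and log and may need adjusting for your setup):

/usr/src/tensorrt/bin/trtexec \
  --onnx=yolov4_resnet18_epoch_080_int8.onnx \
  --int8 \
  --calib=yolov4_resnet18.cal \
  --shapes=Input:3x3x384x1248 \
  --saveEngine=yolov4_resnet18_int8.engine

If that build succeeds, the same .cal file should also work with int8-calib-file in the nvinfer config.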