Please provide the following information when requesting support.
• Hardware (T4/V100/Xavier/Nano/etc): Jetson Orin AGX
• Network Type (Detectnet_v2/Faster_rcnn/Yolo_v4/LPRnet/Mask_rcnn/Classification/etc) Yolo_v4
• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here)
toolkit version: 5.2.0
• Training spec file(If have, please share here)
yolo_v4_train_resnet18.txt (3.3 KB)
Here is my situation: I trained a YOLOv4 model using the tao-getting-started notebook (version 5.0.0). The training went well and I exported the ONNX file.
My application runs on a Jetson Orin AGX and uses the DeepStream SDK to do real-time inference on a 1080p stream.
The model has an input size of 1280x736, but when I run it on the Jetson it struggles to keep up (GPU usage is at 100% and the output video stream quality is very poor), a problem I don't have with other models I've trained with smaller input sizes. Since I'm currently running the model in FP32 mode, I want to see whether I can get an improvement in INT8 mode.
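In case the raw numbers help, I can benchmark the current FP32 engine directly on the Orin with trtexec, something like this (the engine filename is just a placeholder for whatever DeepStream generated):
# Rough throughput check of the current FP32 engine on the Orin
# (the engine filename below is a placeholder)
/usr/src/tensorrt/bin/trtexec --loadEngine=model_b1_gpu0_fp32.engine --iterations=200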
So here are a few questions:
1 - Is it normal that the Jetson struggles to run this model? It seems to run fine on my dGPU (RTX 3070).
2 - Following the notebook, after the training I used this code to export the model:
# tao <task> export will fail if .onnx already exists. So we clear the export folder before tao <task> export
!rm -rf $LOCAL_EXPERIMENT_DIR/export
!mkdir -p $LOCAL_EXPERIMENT_DIR/export
# Generate .onnx file using tao container
!tao model yolo_v4 export -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/yolov4_resnet18_epoch_$EPOCH.hdf5 \
-o $USER_EXPERIMENT_DIR/export/yolov4_resnet18_epoch_$EPOCH.onnx \
-e $SPECS_DIR/yolo_v4_train_resnet18_kitti.txt \
--target_opset 12 \
--gen_ds_config
and then this code to generate the calibration file for running in INT8 mode:
# To export in INT8 mode (generate calibration cache file).
!tao deploy yolo_v4 gen_trt_engine -m $USER_EXPERIMENT_DIR/export/yolov4_resnet18_epoch_$EPOCH.onnx \
-e $SPECS_DIR/yolo_v4_train_resnet18_kitti.txt \
--cal_image_dir $DATA_DOWNLOAD_DIR/training/image_2 \
--data_type int8 \
--batch_size 8 \
--batches 100 \
--cal_cache_file $USER_EXPERIMENT_DIR/export/cal.bin \
--cal_data_file $USER_EXPERIMENT_DIR/export/cal.tensorfile \
--engine_file $USER_EXPERIMENT_DIR/export/trt.engine.int8 \
--results_dir $USER_EXPERIMENT_DIR/export
Is this correct?
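Once that engine is generated, I was also planning to sanity-check the INT8 engine against the validation set before moving to the Jetson, roughly like this (adapted from the notebook, so the exact flags may be off):
# Sanity-check the INT8 engine with the validation set (adapted from the notebook)
!tao deploy yolo_v4 evaluate -m $USER_EXPERIMENT_DIR/export/trt.engine.int8 \
                             -e $SPECS_DIR/yolo_v4_train_resnet18_kitti.txt \
                             -r $USER_EXPERIMENT_DIR/export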
3 - How do I use this in DeepStream? All the examples I've seen use a .etlt model and a .txt calibration file, but I have a .onnx model and the following files: cal.bin, cal.tensorfile, trt.engine.int8.
I know that the engine file is supposed to be generated on the platform that will run the inference (for example, I used TrafficCamNet from the NVIDIA model zoo with its .etlt and INT8 calibration files to generate the engine file on my Jetson, and it works fine).
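For reference, here is roughly the nvinfer config I was planning to try, pieced together from the deepstream_tao_apps samples; the file names, class count, and parser settings below are my guesses and not something I have verified:
# Sketch of the [property] group I had in mind (file names and class count are placeholders)
[property]
onnx-file=yolov4_resnet18_epoch_NNN.onnx
int8-calib-file=cal.bin
labelfile-path=labels.txt
# 0=FP32, 1=INT8, 2=FP16
network-mode=1
batch-size=1
# placeholder, should match my label file
num-detected-classes=3
# parser for TAO YOLOv4 (BatchedNMS), taken from the deepstream_tao_apps samples
parse-bbox-func-name=NvDsInferParseCustomBatchedNMSTLT
custom-lib-path=post_processor/libnvds_infercustomparser_tao.so
My main doubt is whether the cal.bin produced by tao deploy gen_trt_engine can be used directly as int8-calib-file here, or whether I need a different calibration file.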
Any help would be welcome, thank you.