Inference YOLO_v4 int8 mode doesn't show any bounding box

thuan169993 · October 18, 2021, 2:58am

• Hardware: RTX 2080Ti
• Network Type: Yolo_v4
• TLT Version: v3.21.08-py3
• Training spec file: yolo_v4_train_resnet18_kitti.txt (2.3 KB)
yolo_v4_retrain_resnet18_kitti.txt (2.3 KB)

Hi everyone, I followed the YOLOv4 page of the TAO Toolkit 3.22.05 documentation to retrain my custom model. When I export the model in INT8 mode and run inference with it, no bounding boxes are shown. However, when I run it in FP16 mode, it works.

Here are my command lines in the Jupyter script:
Export:

!tao yolo_v4 export -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/yolov4_resnet18_epoch_$EPOCH.tlt  \
                    -o $USER_EXPERIMENT_DIR/export/yolov4_resnet18_epoch_$EPOCH.etlt \
                    -e $SPECS_DIR/yolo_v4_retrain_resnet18_kitti.txt \
                    -k $KEY \
                    --cal_image_dir  $USER_EXPERIMENT_DIR/data/training/image_2 \
                    --data_type int8 \
                    --batch_size 8 \
                    --batches 10 \
                    --cal_cache_file $USER_EXPERIMENT_DIR/export/cal.bin  \
                    --cal_data_file $USER_EXPERIMENT_DIR/export/cal.tensorfile \
                    --verbose \
                    --gen_ds_config

Convert:

tao converter -k $KEY  \
                   -p Input,1x3x384x1248,8x3x384x1248,16x3x384x1248 \
                   -c $USER_EXPERIMENT_DIR/export/cal.bin \
                   -e $USER_EXPERIMENT_DIR/export/trt.engine \
                   -b 2 \
                   -o BatchedNMS \
                   -m 8 \
                   -t int8 \
                   $USER_EXPERIMENT_DIR/export/yolov4_resnet18_epoch_$EPOCH.etlt

Inference:

!tao yolo_v4 inference -m $USER_EXPERIMENT_DIR/export/trt.engine \
                       -e $SPECS_DIR/yolo_v4_retrain_resnet18_kitti.txt \
                       -i $DATA_DOWNLOAD_DIR/test_samples \
                       -o $USER_EXPERIMENT_DIR/yolo_infer_images \
                       -t 0.6

I also followed @Morganh's advice in "Error when running custom YOLOv4 on deepstream_python_apps - #9 by thuan169993" to adjust the annotation format, but I still get the same issue.

Could you guys help me with this problem? If you guys need anything else, please just let me know. Thanks in advance!
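
Editorial aside on the export command above: as the flags are usually described, `--batch_size 8` together with `--batches 10` asks the calibrator for 8 × 10 = 80 images from `--cal_image_dir`. The sketch below is a hypothetical pre-flight check, not part of TAO: the helper names and the list of image suffixes are assumptions made for illustration. It only counts files and does nothing to the model.

```python
from pathlib import Path

# Assumption: calibration images use one of these extensions.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def calibration_images_needed(batch_size, batches):
    """Number of images consumed by `batches` batches of `batch_size` images."""
    return batch_size * batches


def count_images(directory):
    """Count image files directly inside `directory` (not recursive)."""
    return sum(
        1
        for path in Path(directory).iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )


def has_enough_calibration_images(directory, batch_size, batches):
    """True if `directory` holds at least batch_size * batches images."""
    return count_images(directory) >= calibration_images_needed(batch_size, batches)
```

For the command in this post one would call `has_enough_calibration_images(cal_image_dir, 8, 10)` with the directory passed to `--cal_image_dir`.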

Morganh · October 18, 2021, 7:43am

Hi,
To narrow this down, could you please try with the KITTI dataset?

thuan169993 · October 18, 2021, 8:15am

Yeah. I’ll try it.

thuan169993 · October 19, 2021, 8:44am

Hi @Morganh, here is my update. I trained my YOLOv4 model on part of the KITTI dataset and I still have the same problem. Could you help me, please?

Morganh · October 19, 2021, 9:17am

See "TLT YOLOv4 (CSPDakrnet53) - TensorRT INT8 model gives wrong predictions (0 mAP) - #23 by Morganh". Actually, I cannot reproduce the INT8 issue with the CSPDarknet53 backbone.
Also, the user in that topic has no issue with yolo_v4 INT8 using the resnet18 backbone.

Please check whether my test steps in that topic are useful.

virsg · October 20, 2021, 2:15am

Hi @thuan169993, I had a similar problem with INT8 mode, and it was related to the calibration file (cal.bin) generated when you export the model with tao yolo_v4 export.
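
Editorial aside: one way to look inside a cal.bin file is sketched below. It assumes the cache is in the plain-text layout that TensorRT calibrators commonly write: a header line starting with `TRT-`, followed by `tensor_name: hex` lines where the eight hex digits are a big-endian float32 scale. The function names are hypothetical, and flagging zero or non-finite scales is only a heuristic. The thread does not confirm what was wrong with the cache.

```python
import math
import struct


def parse_calibration_cache(text):
    """Return (header, {tensor_name: scale}) from calibration cache text.

    Assumes 'name: hhhhhhhh' lines with a big-endian float32 in hex.
    Lines that do not match that shape are skipped.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("calibration cache is empty")
    header, scales = lines[0], {}
    for line in lines[1:]:
        # rpartition keeps tensor names that themselves contain ':' intact.
        name, _, hex_value = line.rpartition(":")
        hex_value = hex_value.strip()
        if not name or len(hex_value) != 8:
            continue
        try:
            raw = bytes.fromhex(hex_value)
        except ValueError:
            continue
        scales[name.strip()] = struct.unpack("!f", raw)[0]
    return header, scales


def suspicious_scales(scales):
    """Names whose scale is zero or non-finite (heuristic only)."""
    return [
        name
        for name, scale in scales.items()
        if scale == 0.0 or not math.isfinite(scale)
    ]
```

A typical use would be `header, scales = parse_calibration_cache(open("cal.bin").read())`, then checking that `header` names the TensorRT version that will build the engine and that `suspicious_scales(scales)` is empty.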

thuan169993 · October 20, 2021, 2:18am

Hi @virsg, did you fix it?

Morganh · October 25, 2021, 3:25am

@thuan169993
Which version of TensorRT is it?
Can you share the output of $ dpkg -l |grep cuda?

thuan169993 · October 29, 2021, 7:34am

Sorry for the late response. I use 7.2.3.
Here is the log inside my container.

Morganh · November 1, 2021, 2:17pm

May I know which container you are using? Is it TLT/TAO?

thuan169993 · November 2, 2021, 1:30am

I use nvcr.io/nvidia/deepstream:5.1-21.02-samples and then followed "GitHub - NVIDIA-AI-IOT/deepstream_tao_apps: Sample apps to demonstrate how to deploy models trained with TAO on DeepStream" to set up the environment for running YOLOv4 inside the docker container.

The image above is the $ dpkg -l |grep cuda log from inside that container.

Morganh · November 2, 2021, 7:32am

OK. To narrow this down, please do not use the DeepStream container.
Please launch the TAO docker directly, then export the model and generate the cal.bin file. Then generate the TensorRT engine and run inference with it.

renlifeng · November 10, 2021, 1:35am

I use the TAO docker, but still get no boxes. I used the Jupyter notebook in cv_samples_vv1.2.0, uncommented the INT8 export/convert cells, and executed them. FP32 works.

Did you solve your problem? How?

renlifeng · November 11, 2021, 1:35am

Simply using the TAO docker directly does not help. Changing the batch size from 8 to 1, both in the tao converter command line and in the eval_config section of specs/yolo_v4_retrain_resnet18_kitti.txt, did the trick.

But I don’t know why these changes matter. Could you please shed some light on this?

tao converter … -p Input,1x3x384x1248,1x3x384x1248,1x3x384x1248
eval_config {
batch_size: 1
…
}

Morganh · November 11, 2021, 1:47am

@renlifeng
Can you share your latest spec file? Thanks.

renlifeng · November 11, 2021, 2:32am

Sure. But I don’t know how to upload the file as an attachment, even after reading Attaching Files to Forum Topics/Posts.

Sorry for pasting the file here.

--- begin
random_seed: 42
yolov4_config {
  big_anchor_shape: "[(114.94, 60.67), (159.06, 114.59), (297.59, 176.38)]"
  mid_anchor_shape: "[(42.99, 31.91), (79.57, 31.75), (56.80, 56.93)]"
  small_anchor_shape: "[(15.60, 13.88), (30.25, 20.25), (20.67, 49.63)]"
  box_matching_iou: 0.25
  matching_neutral_box_iou: 0.5
  arch: "resnet"
  nlayers: 18
  arch_conv_blocks: 2
  loss_loc_weight: 0.8
  loss_neg_obj_weights: 100.0
  loss_class_weights: 0.5
  label_smoothing: 0.0
  big_grid_xy_extend: 0.05
  mid_grid_xy_extend: 0.1
  small_grid_xy_extend: 0.2
  freeze_bn: false
  #freeze_blocks: 0
  force_relu: false
}
training_config {
  batch_size_per_gpu: 8
  num_epochs: 80
  enable_qat: false
  checkpoint_interval: 10
  learning_rate {
    soft_start_cosine_annealing_schedule {
      min_learning_rate: 1e-7
      max_learning_rate: 1e-4
      soft_start: 0.3
    }
  }
  regularizer {
    type: NO_REG
    weight: 3e-9
  }
  optimizer {
    adam {
      epsilon: 1e-7
      beta1: 0.9
      beta2: 0.999
      amsgrad: false
    }
  }
  pruned_model_path: "/workspace/tao-experiments/yolo_v4/experiment_dir_pruned/yolov4_resnet18_pruned.tlt"
}
eval_config {
  average_precision_mode: SAMPLE
  batch_size: 1
  matching_iou_threshold: 0.5
}
nms_config {
  confidence_threshold: 0.001
  clustering_iou_threshold: 0.5
  top_k: 200
  force_on_cpu: true
}
augmentation_config {
  hue: 0.1
  saturation: 1.5
  exposure: 1.5
  vertical_flip: 0
  horizontal_flip: 0.5
  jitter: 0.3
  output_width: 1248
  output_height: 384
  output_channel: 3
  randomize_input_shape_period: 0
  mosaic_prob: 0.5
  mosaic_min_ratio: 0.2
}
dataset_config {
  data_sources: {
    tfrecords_path: "/workspace/tao-experiments/data/training/tfrecords/train*"
    image_directory_path: "/workspace/tao-experiments/data/training"
  }
  include_difficult_in_training: true
  image_extension: "png"
  target_class_mapping {
    key: "car"
    value: "car"
  }
  target_class_mapping {
    key: "pedestrian"
    value: "pedestrian"
  }
  target_class_mapping {
    key: "cyclist"
    value: "cyclist"
  }
  target_class_mapping {
    key: "van"
    value: "car"
  }
  target_class_mapping {
    key: "person_sitting"
    value: "pedestrian"
  }
  validation_data_sources: {
    tfrecords_path: "/workspace/tao-experiments/data/val/tfrecords/val*"
    image_directory_path: "/workspace/tao-experiments/data/val"
  }
}
--- end

Morganh · November 11, 2021, 2:49am

To upload a file, please click the “upload” button when you write your reply.

The spec contains both a training batch size (bs) and an eval batch size:
training_config {
batch_size_per_gpu: 8

eval_config {
average_precision_mode: SAMPLE
batch_size: 1

Do you mean that after you changed the eval bs from 8 to 1, the issue went away? Am I correct?

renlifeng · November 11, 2021, 6:46am

Thanks for the tip.

Yes. I only changed the eval bs, and only did that after training was done.

I also changed the min/opt/max shapes to 1x3x384x1248 when converting the engine.
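
Editorial aside: the `-p` argument in this thread packs an input name and three shapes (min, opt, max) into one comma-separated string, so it is easy to mistype. The sketch below is a hypothetical helper, not part of tao converter. It splits such a string and checks whether a given batch size falls inside the profile's batch range, on the assumption that the first dimension of each shape is the batch dimension.

```python
def parse_shape(text):
    """Turn '8x3x384x1248' into (8, 3, 384, 1248)."""
    return tuple(int(dim) for dim in text.split("x"))


def parse_profile(profile):
    """Split '<input_name>,<min>,<opt>,<max>' into a name and three shapes."""
    parts = profile.split(",")
    if len(parts) != 4:
        raise ValueError(f"expected name,min,opt,max but got {profile!r}")
    name = parts[0]
    min_shape, opt_shape, max_shape = (parse_shape(p) for p in parts[1:])
    if not (len(min_shape) == len(opt_shape) == len(max_shape)):
        raise ValueError("min/opt/max shapes must have the same rank")
    for low, mid, high in zip(min_shape, opt_shape, max_shape):
        if not low <= mid <= high:
            raise ValueError("each dimension must satisfy min <= opt <= max")
    return name, min_shape, opt_shape, max_shape


def batch_in_profile(profile, batch_size):
    """True if batch_size lies between the min and max batch dimensions."""
    _, min_shape, _, max_shape = parse_profile(profile)
    return min_shape[0] <= batch_size <= max_shape[0]
```

With the original profile `Input,1x3x384x1248,8x3x384x1248,16x3x384x1248`, batch sizes 1 through 16 are in range. With the all-`1x3x384x1248` profile described here, only batch size 1 is.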

Morganh · November 11, 2021, 6:53am

Thanks for the info.

renlifeng · November 11, 2021, 8:36am

I am really sorry. In fact, I never got INT8 to work. Changing the batch size or the input shape has no effect on this.

I tweaked the command line many times and got myself confused. By mistake, I specified the fp16 type but put the engine file in the int8 directory and thought it was an INT8 engine. The boxes were actually produced by an FP16 engine.

Sorry.
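
Editorial aside: one way to avoid this kind of mix-up is to derive both the `-t` value and the engine file name from a single precision variable. The sketch below is a hypothetical helper, not part of TAO. It only assembles an argument list from flags that already appear in this thread (`-k`, `-p`, `-c`, `-e`, `-o`, `-t`), and the `trt.<precision>.engine` naming scheme is an assumption made for illustration.

```python
from pathlib import Path

VALID_PRECISIONS = ("fp32", "fp16", "int8")


def converter_command(key, etlt_path, export_dir, precision, profile, cal_cache=None):
    """Build a `tao converter` argument list with a precision-tagged engine name."""
    if precision not in VALID_PRECISIONS:
        raise ValueError(f"unknown precision {precision!r}")
    if precision == "int8" and cal_cache is None:
        raise ValueError("INT8 needs a calibration cache (-c)")
    # The engine name carries the precision, so it cannot disagree with -t.
    engine = str(Path(export_dir) / f"trt.{precision}.engine")
    cmd = ["tao", "converter", "-k", key, "-p", profile]
    if precision == "int8":
        cmd += ["-c", cal_cache]
    cmd += ["-e", engine, "-o", "BatchedNMS", "-t", precision, etlt_path]
    return cmd
```

The returned list can be printed for review or passed to `subprocess.run`. Because the engine path is generated rather than typed, an FP16 engine cannot end up under an INT8 name.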
