YOLO_v4 inference in INT8 mode doesn't show any bounding boxes

• Hardware: RTX 2080Ti
• Network Type: Yolo_v4
• TLT Version: v3.21.08-py3
• Training spec file: yolo_v4_train_resnet18_kitti.txt (2.3 KB)
yolo_v4_retrain_resnet18_kitti.txt (2.3 KB)

Hi everyone, I followed the YOLOv4 page of the TAO Toolkit 3.0 documentation to retrain my custom model. When I export the model in INT8 mode and run inference with it, it doesn’t show any bounding boxes. However, when I run with FP16, it works.

Here are my command lines in the Jupyter script:
Export:

!tao yolo_v4 export -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/yolov4_resnet18_epoch_$EPOCH.tlt  \
                    -o $USER_EXPERIMENT_DIR/export/yolov4_resnet18_epoch_$EPOCH.etlt \
                    -e $SPECS_DIR/yolo_v4_retrain_resnet18_kitti.txt \
                    -k $KEY \
                    --cal_image_dir  $USER_EXPERIMENT_DIR/data/training/image_2 \
                    --data_type int8 \
                    --batch_size 8 \
                    --batches 10 \
                    --cal_cache_file $USER_EXPERIMENT_DIR/export/cal.bin  \
                    --cal_data_file $USER_EXPERIMENT_DIR/export/cal.tensorfile \
                    --verbose \
                    --gen_ds_config

Convert:

!tao converter -k $KEY  \
                   -p Input,1x3x384x1248,8x3x384x1248,16x3x384x1248 \
                   -c $USER_EXPERIMENT_DIR/export/cal.bin \
                   -e $USER_EXPERIMENT_DIR/export/trt.engine \
                   -b 2 \
                   -o BatchedNMS \
                   -m 8 \
                   -t int8 \
                   $USER_EXPERIMENT_DIR/export/yolov4_resnet18_epoch_$EPOCH.etlt
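Several batch sizes appear in the commands above: `--batch_size 8` at export, `-b 2` and `-m 8` at convert, and a `-p` profile that ranges from 1 to 16. A minimal Python sketch for checking that the `-p` profile is at least internally consistent is shown here. `parse_profile` and `check_profile` are hypothetical helper names, not part of TAO, and this thread does not establish that a batch-size mismatch is what breaks INT8.

```python
def parse_profile(profile: str):
    """Split 'Name,min,opt,max' into the tensor name and three shape tuples."""
    name, *shapes = profile.split(",")
    if len(shapes) != 3:
        raise ValueError("expected <name>,<min>,<opt>,<max>")
    dims = [tuple(int(d) for d in s.split("x")) for s in shapes]
    return name, dims


def check_profile(profile: str):
    """Return a list of human-readable problems found in the profile."""
    _, (mn, opt, mx) = parse_profile(profile)
    problems = []
    if not (len(mn) == len(opt) == len(mx)):
        problems.append("shapes have different ranks")
        return problems
    if any(not (a <= b <= c) for a, b, c in zip(mn, opt, mx)):
        problems.append("min <= opt <= max does not hold for every dimension")
    if not (mn[1:] == opt[1:] == mx[1:]):
        problems.append("C/H/W differ between min, opt and max")
    return problems


# The profile string passed to -p in the convert command above.
profile = "Input,1x3x384x1248,8x3x384x1248,16x3x384x1248"
name, (mn, opt, mx) = parse_profile(profile)
print(name, "batch range:", mn[0], opt[0], mx[0])
print(check_profile(profile) or "profile is internally consistent")
```

This only validates the shape string itself. It says nothing about how tao converter treats `-b` and `-m` when a dynamic-shape profile is given.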

Inference:

!tao yolo_v4 inference -m $USER_EXPERIMENT_DIR/export/trt.engine \
                       -e $SPECS_DIR/yolo_v4_retrain_resnet18_kitti.txt \
                       -i $DATA_DOWNLOAD_DIR/test_samples \
                       -o $USER_EXPERIMENT_DIR/yolo_infer_images \
                       -t 0.6

I also followed @Morganh's advice in "Error when running custom YOLOv4 on deepstream_python_apps - #9 by thuan169993" to adjust the annotation format, but I still get the same issue.

Could you guys help me with this problem? If you guys need anything else, please just let me know. Thanks in advance!

Hi,
To narrow down, could you please try with KITTI dataset?

Yeah. I’ll try it.

Hi @Morganh, here is my update. I trained my YOLOv4 model on part of the KITTI dataset and I still have the same problem. Could you guys help me please?

See "TLT YOLOv4 (CSPDakrnet53) - TensorRT INT8 model gives wrong predictions (0 mAP) - #23 by Morganh". I actually cannot reproduce the INT8 issue with the CSPDarknet53 backbone.
Also, the user in that topic has no issue with yolo_v4 INT8 using the resnet18 backbone.

Please check whether my test steps there are useful.

Hi @thuan169993, I had a similar problem with INT8 mode, and it was related to the calibration file (cal.bin) generated when you export the model with tao yolo_v4 export.
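A rough way to look inside cal.bin is sketched below in Python. It assumes the usual TensorRT calibration cache layout: a text file whose first line is a header and whose remaining lines are `tensor_name: <8 hex digits>`, with the hex digits taken to be a big-endian float32 scale. The function names are hypothetical and the sample contents are made up for illustration, so treat this as a sketch rather than a description of what TAO guarantees to write.

```python
import struct


def read_calibration_cache(text: str):
    """Parse a calibration cache into (header, {tensor_name: scale})."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("calibration cache is empty")
    header, entries = lines[0], lines[1:]
    scales = {}
    for line in entries:
        # Tensor names may contain ':' themselves, so split on the last one.
        name, _, hex_value = line.rpartition(":")
        # Assumed encoding: big-endian IEEE-754 float32 written as hex digits.
        raw = bytes.fromhex(hex_value.strip())
        scales[name.strip()] = struct.unpack("!f", raw)[0]
    return header, scales


def suspicious_scales(scales):
    """Return tensors whose scale is zero, negative, NaN, or infinite."""
    return [n for n, s in scales.items() if not (s > 0.0) or s == float("inf")]


# Made-up sample contents; for a real file use open("cal.bin").read().
sample = """TRT-7203-EntropyCalibration2
Input: 3f800000
conv1/Relu:0: 3c010a14
bad_tensor: 00000000
"""
header, scales = read_calibration_cache(sample)
print(header)
print(suspicious_scales(scales))
```

A cache with many zero or non-finite scales, or one produced by a different TensorRT version than the one building the engine, would be a reason to regenerate it.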

Hi @virsg, did you fix it?

@thuan169993
Which version of TensorRT is it?
Can you share the output of $ dpkg -l |grep cuda?

Sorry for the late response. I use TensorRT 7.2.3.
Here is the log inside my container

May I know which container you are using? Is it TLT/TAO?

I use nvcr.io/nvidia/deepstream:5.1-21.02-samples and then follow "GitHub - NVIDIA-AI-IOT/deepstream_tao_apps: Sample apps to demonstrate how to deploy models trained with TAO on DeepStream" to set up the environment for running YOLOv4 inside the Docker container.

And the image above is the $ dpkg -l |grep cuda log inside that container

OK. To narrow down, please do not use the DeepStream container.
Please launch the TAO docker directly, then export the model and generate the cal.bin file. Then generate the TensorRT engine and run inference with it.

I use the TAO docker, but still get no boxes. I used the Jupyter notebook in cv_samples_vv1.2.0, uncommented the INT8 export/convert section, and executed it. FP32 works.

Did you solve your problem? How?

Simply using the TAO docker directly does not help. Changing the batch size from 8 to 1, both on the tao converter command line and in the eval_config section of specs/yolo_v4_retrain_resnet18_kitti.txt, did the trick.

But I don’t know why these changes matter. Could you please shed some light on this?

tao converter … -p Input,1x3x384x1248,1x3x384x1248,1x3x384x1248
eval_config {
batch_size: 1

}

@renlifeng
Can you share your latest spec file? Thanks.

Sure. But I don’t know how to upload the file as an attachment, even after reading "Attaching Files to Forum Topics/Posts".

Sorry for pasting the file here.

-- begin
random_seed: 42
yolov4_config {
  big_anchor_shape: "[(114.94, 60.67), (159.06, 114.59), (297.59, 176.38)]"
  mid_anchor_shape: "[(42.99, 31.91), (79.57, 31.75), (56.80, 56.93)]"
  small_anchor_shape: "[(15.60, 13.88), (30.25, 20.25), (20.67, 49.63)]"
  box_matching_iou: 0.25
  matching_neutral_box_iou: 0.5
  arch: "resnet"
  nlayers: 18
  arch_conv_blocks: 2
  loss_loc_weight: 0.8
  loss_neg_obj_weights: 100.0
  loss_class_weights: 0.5
  label_smoothing: 0.0
  big_grid_xy_extend: 0.05
  mid_grid_xy_extend: 0.1
  small_grid_xy_extend: 0.2
  freeze_bn: false
  #freeze_blocks: 0
  force_relu: false
}
training_config {
  batch_size_per_gpu: 8
  num_epochs: 80
  enable_qat: false
  checkpoint_interval: 10
  learning_rate {
    soft_start_cosine_annealing_schedule {
      min_learning_rate: 1e-7
      max_learning_rate: 1e-4
      soft_start: 0.3
    }
  }
  regularizer {
    type: NO_REG
    weight: 3e-9
  }
  optimizer {
    adam {
      epsilon: 1e-7
      beta1: 0.9
      beta2: 0.999
      amsgrad: false
    }
  }
  pruned_model_path: "/workspace/tao-experiments/yolo_v4/experiment_dir_pruned/yolov4_resnet18_pruned.tlt"
}
eval_config {
  average_precision_mode: SAMPLE
  batch_size: 1
  matching_iou_threshold: 0.5
}
nms_config {
  confidence_threshold: 0.001
  clustering_iou_threshold: 0.5
  top_k: 200
  force_on_cpu: true
}
augmentation_config {
  hue: 0.1
  saturation: 1.5
  exposure: 1.5
  vertical_flip: 0
  horizontal_flip: 0.5
  jitter: 0.3
  output_width: 1248
  output_height: 384
  output_channel: 3
  randomize_input_shape_period: 0
  mosaic_prob: 0.5
  mosaic_min_ratio: 0.2
}
dataset_config {
  data_sources: {
    tfrecords_path: "/workspace/tao-experiments/data/training/tfrecords/train*"
    image_directory_path: "/workspace/tao-experiments/data/training"
  }
  include_difficult_in_training: true
  image_extension: "png"
  target_class_mapping {
    key: "car"
    value: "car"
  }
  target_class_mapping {
    key: "pedestrian"
    value: "pedestrian"
  }
  target_class_mapping {
    key: "cyclist"
    value: "cyclist"
  }
  target_class_mapping {
    key: "van"
    value: "car"
  }
  target_class_mapping {
    key: "person_sitting"
    value: "pedestrian"
  }
  validation_data_sources: {
    tfrecords_path: "/workspace/tao-experiments/data/val/tfrecords/val*"
    image_directory_path: "/workspace/tao-experiments/data/val"
  }
}
-- end

To upload a file, please click the “upload” button when you write your reply.

The spec has both a training batch size and an eval batch size:
training_config {
batch_size_per_gpu: 8

eval_config {
average_precision_mode: SAMPLE
batch_size: 1

Do you mean that after you changed the eval batch size from 8 to 1, the issue went away? Am I correct?

Thanks for the tip.

Yes. I only changed the eval batch size, and only did that after training was done.

I also changed the min/opt/max shapes to 1x3x384x1248 when converting the engine.

Thanks for the info.

I am really sorry. In fact, I never got INT8 to work. Changing the batch size or the input size has no effect on this.

I tweaked the command line many times and got myself confused. By mistake, I specified the fp16 type but put the engine file in the int8 directory and thought it was an INT8 engine. The boxes were actually produced by the FP16 engine.

Sorry.
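For anyone who wants to avoid the same mix-up, one option is to derive both the `-t` value and the engine file name from a single precision setting, so an FP16 engine cannot end up under an INT8 name. A minimal Python sketch follows; `engine_path` and `converter_args` are hypothetical helpers, and the `trt_<precision>.engine` naming scheme is just an assumption for illustration.

```python
from pathlib import Path

VALID_PRECISIONS = ("fp32", "fp16", "int8")


def engine_path(export_dir: str, precision: str) -> Path:
    """Build an engine path that embeds the precision in the file name."""
    precision = precision.lower()
    if precision not in VALID_PRECISIONS:
        raise ValueError(f"unknown precision: {precision}")
    return Path(export_dir) / f"trt_{precision}.engine"


def converter_args(export_dir: str, precision: str):
    """Return the -t/-e argument pair derived from one precision value."""
    path = engine_path(export_dir, precision)
    return ["-t", precision.lower(), "-e", path.as_posix()]


# The resulting list can be spliced into the tao converter command line.
print(converter_args("export", "int8"))
```

Because the data type and the file name come from the same variable, changing one without the other is no longer possible.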