YOLOv3 workflow or incorrect calibration file for INT8 inference

• Hardware (T4/V100/Xavier/Nano/etc) I train on an RTX 4090 and run inference on an AGX Xavier
• Network Type (Detectnet_v2/Faster_rcnn/Yolo_v4/LPRnet/Mask_rcnn/Classification/etc) Yolov3
random_seed: 42
yolov3_config {
  big_anchor_shape: "[(114.94, 60.67), (159.06, 114.59), (297.59, 176.38)]"
  mid_anchor_shape: "[(42.99, 31.91), (79.57, 31.75), (56.80, 56.93)]"
  small_anchor_shape: "[(15.60, 13.88), (30.25, 20.25), (20.67, 49.63)]"
  matching_neutral_box_iou: 0.7
  arch: "resnet"
  nlayers: 18
  arch_conv_blocks: 2
  loss_loc_weight: 0.8
  loss_neg_obj_weights: 100.0
  loss_class_weights: 1.0
  freeze_bn: false
  #freeze_blocks: 0
  force_relu: false
}
training_config {
  batch_size_per_gpu: 8
  num_epochs: 80
  enable_qat: true
  checkpoint_interval: 10
  learning_rate {
    soft_start_annealing_schedule {
      min_learning_rate: 1e-6
      max_learning_rate: 1e-4
      soft_start: 0.1
      annealing: 0.5
    }
  }
  regularizer {
    type: L1
    weight: 3e-9
  }
  optimizer {
    adam {
      epsilon: 0.001
      beta1: 0.9
      beta2: 0.999
      amsgrad: false
    }
  }
  pretrain_model_path: "/workspace/tao-experiments/yolov3/pretrained_resnet18/pretrained_object_detection_vresnet18/resnet_18.hdf5"
}
eval_config {
  average_precision_mode: SAMPLE
  batch_size: 8
  matching_iou_threshold: 0.5
}
nms_config {
  confidence_threshold: 0.001
  clustering_iou_threshold: 0.5
  top_k: 200
  force_on_cpu: True
}
augmentation_config {
  hue: 0.1
  saturation: 1.5
  horizontal_flip: 0.5
  jitter: 0.3
  output_width: 1280
  output_height: 1280
  output_channel: 3
  randomize_input_shape_period: 0
}
dataset_config {
  data_sources: {
    tfrecords_path: "/workspace/tao-experiments/data/tfrecords/kitti_trainval/*"
    image_directory_path: "/workspace/tao-experiments/try-6"
  }
  include_difficult_in_training: true
  image_extension: "jpg"
  target_class_mapping {
    key: "cachalot"
    value: "cachalot"
  }
  target_class_mapping {
    key: "jet"
    value: "jet"
  }
  target_class_mapping {
    key: "rorqual"
    value: "rorqual"
  }
  target_class_mapping {
    key: "bateau"
    value: "bateau"
  }
  target_class_mapping {
    key: "globicephale"
    value: "globicephale"
  }
  target_class_mapping {
    key: "queue_cachalot"
    value: "queue_cachalot"
  }
  validation_fold: 0
}

I think I have a problem with how I generate my calibration file in TAO. I want to run INT8 inference in DeepStream with this YOLOv3 model (key = tlt_encode). Inference works well in both FP32 and FP16, but in INT8 the results are pretty bad.

I'm pretty sure it comes from the calibration file. To generate it, I export the trained model using tao-converter and then use tao-deploy to generate the calibration file, with both steps run on my RTX 4090.

Can you tell me whether my workflow is correct and, if so, check my calibration file and tell me what's wrong?

Thank you for your answer

To export the model, use yolo_v3 export xxx. It generates an .etlt model from the .tlt model.

Please note that tao-converter generates a TensorRT engine from an .etlt model. It does not generate the .etlt model itself.

In the latest docker, you can use tao-deploy instead of tao-converter to generate the TensorRT engine.

For the incorrect calibration, please try using the entire training dataset when you run tao-deploy yolo_v3 gen_trt_engine to generate a new cal.bin.

I also see that you are using an old version of TAO. In that case, you can use tao yolo_v3 export to generate a new cal.bin.

Thank you for the very quick answer!

Sorry, I mixed things up: I used tao export to generate the .etlt and then tao-deploy for the cal.bin.

Here are my commands for both (I used containers to make sure everything is up to date).

For the export
docker run -it --rm --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 \
  -v .:/workspace nvcr.io/nvidia/tao/tao-toolkit:4.0.1-tf1.15.5 \
  yolo_v3 export -e /workspace/specs/experiment_spec.json \
  -m /workspace/yolo_v3/experiment_dir_final_2/weights/yolov3_resnet18_epoch_080.tlt \
  -o /workspace/yolo_v3/experiment_dir_final_2/resnet18_detector_qat.etlt \
  -k tlt_encode \
  --static_batch_size 1

For the cal.bin generation

docker run -it --rm --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 \
  -v .:/workspace nvcr.io/nvidia/tao/tao-toolkit:4.0.0-deploy \
  yolo_v3 gen_trt_engine \
  -e /workspace/specs/experiment_spec.json \
  -m /workspace/yolo_v3/experiment_dir_final_2/resnet18_detector_qat.etlt \
  -k tlt_encode \
  --cal_image_dir /workspace/try-6/train \
  --data_type int8 \
  --cal_cache_file /workspace/yolo_v3/experiment_dir_final_2/cal.bin \
  --engine_file /workspace/yolo_v3/experiment_dir_final_2/trt.engine.int8

As you can see, I'm using the train directory as the source for the calibration file.

Please add the options below and retry. Here xxx is the total number of images in the training images folder.
--batches xxx
--batch_size 1
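To get the number to put in place of xxx, a small helper along these lines can count the images in the calibration folder. This is only a sketch: the function name count_calibration_batches and the list of extensions are illustrative assumptions, not part of TAO, and it only looks at files directly inside the folder.

```python
from pathlib import Path


def count_calibration_batches(image_dir, batch_size=1,
                              extensions=(".jpg", ".jpeg", ".png")):
    """Return (n_images, n_batches) for image files directly inside image_dir.

    Hypothetical helper: the extension list is an assumption, adjust it to
    match the dataset.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    wanted = {ext.lower() for ext in extensions}
    n_images = sum(
        1
        for path in Path(image_dir).iterdir()
        if path.is_file() and path.suffix.lower() in wanted
    )
    # Round down so that n_batches * batch_size never exceeds n_images.
    return n_images, n_images // batch_size
```

Calling count_calibration_batches("/workspace/try-6/train", batch_size=1) inside the container would return the image count and the matching value for --batches.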

Thanks for the answer,

I tried it and the results are even worse: I now get a single detection in the corner of my image, and it is the same detection for every image.

Here’s the obtained calibration file
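For anyone who wants to look inside a cal.bin themselves, here is a minimal sketch for decoding it. It assumes the layout commonly seen in TensorRT calibration caches: one header line, then one "tensor_name: xxxxxxxx" line per tensor, where the 8 hex digits are the big-endian bits of a float32 scale. That layout is an assumption and not something confirmed in this thread, and the function names are illustrative.

```python
import math
import struct


def parse_calibration_cache(text):
    """Parse cache text into (header, {tensor_name: scale}).

    Assumed layout: first non-empty line is a header, every other line is
    "name: 8 hex digits" encoding a big-endian float32.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty calibration cache")
    header, scales = lines[0], {}
    for ln in lines[1:]:
        # Split on the last colon, since tensor names may contain colons.
        name, sep, hex_value = ln.rpartition(":")
        hex_value = hex_value.strip()
        if not sep or len(hex_value) != 8:
            continue  # not a "name: hex" line under the assumed layout
        try:
            raw = bytes.fromhex(hex_value)
        except ValueError:
            continue
        scales[name.strip()] = struct.unpack(">f", raw)[0]
    return header, scales


def suspicious_scales(scales):
    """Return entries whose scale is zero, negative, NaN or infinite."""
    return {n: s for n, s in scales.items()
            if not math.isfinite(s) or s <= 0.0}
```

Reading the file with open(path).read() and passing the text to parse_calibration_cache gives a dictionary that can be compared between two cal.bin files, for example one generated by export and one generated by gen_trt_engine. A scale flagged by suspicious_scales is only a hint that something went wrong during calibration, not proof.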

Can you run the official yolo_v3 notebook to check whether it works? It trains against the KITTI dataset.

