Tao-deploy yolo_v4 inference KITTI .txt result mismatch with image plot

Hi, I am evaluating a trained YOLOv4 model outside of TAO. What I need are the predicted labels in KITTI .txt format.

When running inference I get the annotated images (images_annotated) and the labels. However, the labels do not contain the correct coordinates. This looks like a bug to me.

Command I run:

!tao-deploy yolo_v4 inference \
 --gpu_index=0 \
 -i /workspace/tao-experiments/test/image \
 -e $YOLO4_BASEDIR/yolo_v4_eval_spec.txt \
 -m $YOLO4_BASEDIR/yolo4_rtx5000_fp32_bs1.engine \
 -r $YOLO4_BASEDIR/inference \
 --batch_size 1

Example: COCO image 000000415109 from images_annotated (I read the image coordinates of the (x1, y1) and (x2, y2) corners off the plot):
[image: annotated 000000415109 with the measured box corners]

The 000000415109.txt output:
person 0.00 0 0.00 208.127 79.538 478.843 415.359 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.993
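
For reference, here is a minimal sketch (my own helper, not TAO code) of how I pull the box and score out of such a line, assuming the standard KITTI field order where the 5th to 8th fields are the bbox corners and the last field is the confidence:

def parse_kitti_line(line):
    # KITTI order: class, truncation, occlusion, alpha, x1, y1, x2, y2,
    # 3 x dimension, 3 x location, rotation_y, and here a trailing score.
    fields = line.split()
    cls = fields[0]
    x1, y1, x2, y2 = map(float, fields[4:8])
    score = float(fields[15]) if len(fields) > 15 else None
    return cls, (x1, y1, x2, y2), score

line = "person 0.00 0 0.00 208.127 79.538 478.843 415.359 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.993"
print(parse_kitti_line(line))  # ('person', (208.127, 79.538, 478.843, 415.359), 0.993)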

Analysis of the difference:

Notice the mismatch between the bbox coordinates in the .txt and the coordinates measured on the annotated image.

What is going on? It looks like the coordinates in the .txt files are expressed in the model resolution (480x480 in my case). There also seems to be letterboxing: the image is scaled to fit and center-padded (for example, a 640x480 image scaled into 480x480 gets a factor of 0.75 and 60 px of padding top and bottom), and this transformation is not undone before the coordinates are written to the .txt file.

Can you take a look at the source code?

Please note that this issue seems specific to yolo_v4. It does not occur when using detectnet_v2.

Sorry for the late reply. How about tao yolo_v4 inference?

UPDATE 22/JUN/2023 - The test below is wrong (see the next post)

Thank you for the reply. I have tested it:

!tao yolo_v4 inference \
 --gpu_index=0 \
 --threshold 0.2 \
 -i /workspace/tao-experiments/test_part/image \
 -e $YOLO4_BASEDIR/yolo_v4_train_resnet18_kitti.txt \
 -m $YOLO4_BASEDIR/yolo4_rtx5000_fp32_bs1.engine \
 -o $YOLO4_BASEDIR/inference_tao/images_annotated \
 -l $YOLO4_BASEDIR/inference_tao/labels

The KITTI label output is definitely better using tao instead of tao-deploy:

As you can see, with tao the labels are a better match for the coordinates measured on the image. However, there is still a difference that cannot be explained by rounding:

[image: comparison overlay]
In this image the yellow bounding box is the one plotted by tao inference; on top of it I plotted the detection based on the KITTI .txt label. You can see the difference.

Strangely, this difference is worse on some images than on others.

Findings / conclusion
I was able to transform the wrong KITTI labels output by tao-deploy so that they align perfectly with the boxes plotted on the annotated images, using the transformation sketched below.
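
Here is a sketch of the inverse mapping I mean, assuming a 480x480 model input and centered padding (the exact letterbox convention is my guess from the observed offsets, so verify it against your own images):

def undo_letterbox(box, img_w, img_h, model_w=480, model_h=480):
    # Map an (x1, y1, x2, y2) box from the letterboxed model space back to
    # original image coordinates: remove the centered padding, then undo the scaling.
    scale = min(model_w / img_w, model_h / img_h)
    pad_x = (model_w - img_w * scale) / 2.0
    pad_y = (model_h - img_h * scale) / 2.0
    x1, y1, x2, y2 = box
    return ((x1 - pad_x) / scale, (y1 - pad_y) / scale,
            (x2 - pad_x) / scale, (y2 - pad_y) / scale)

# Example with the person box from 000000415109.txt (original image size assumed to be 640x480):
print(undo_letterbox((208.127, 79.538, 478.843, 415.359), img_w=640, img_h=480))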

However, I am not sure what to make of the output from tao: it is much better, but why is there still a difference?

Could you share the training spec file? I need to check the input width and height.
Also, your test images have different resolutions, right?

UPDATE 22/JUN/2023
There was a bug in my evaluation code. The output of tao inference is correct: the annotated images align with the output KITTI .txt labels.

Conclusion - use tao inference and don’t use tao-deploy inference.

Below is the result, with perfect alignment between the annotation (yellow) and the box plotted from the KITTI .txt.

[image: annotated image with the KITTI label box perfectly aligned]
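
For reference, the overlay comparison is produced with something like this (my own code, not part of TAO; paths are placeholders):

import cv2

img = cv2.imread("000000415109.jpg")  # original image (placeholder path)
with open("000000415109.txt") as f:   # KITTI label written by tao inference
    for line in f:
        fields = line.split()
        x1, y1, x2, y2 = (int(float(v)) for v in fields[4:8])
        cv2.rectangle(img, (x1, y1), (x2, y2), (0, 0, 255), 2)  # box from the .txt
cv2.imwrite("overlay.jpg", img)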


Old answer (wrong)
Below is the training spec file. I use a 480x480 model resolution. The images are from the COCO dataset and have different resolutions (max_width=640, max_height=480, keep_aspect=True).

What I don't understand is why the annotated image seems to show the correct bounding box that the model predicted while the exported KITTI .txt label is different. Both represent the same predictions, so regardless of any config the annotated image and its KITTI .txt should be consistent.

random_seed: 42
yolov4_config {
  big_anchor_shape: "[(158.62, 354.31),(249.36, 231.91),(294.31, 383.34)]"
  mid_anchor_shape: "[(58.12, 193.38),(149.85, 170.82),(96.37, 272.58)]"
  small_anchor_shape: "[(39.01, 111.43),(66.44, 66.55),(90.07, 123.66)]"
  box_matching_iou: 0.25
  matching_neutral_box_iou: 0.5
  arch: "resnet"
  nlayers: 18
  arch_conv_blocks: 2
  loss_loc_weight: 1.0
  loss_neg_obj_weights: 1.0
  loss_class_weights: 1.0
  label_smoothing: 0.0
  big_grid_xy_extend: 0.05
  mid_grid_xy_extend: 0.1
  small_grid_xy_extend: 0.2
  freeze_bn: false
  #freeze_blocks: 0
  force_relu: false
}
training_config {
  batch_size_per_gpu: 1
  num_epochs: 80
  enable_qat: false
  checkpoint_interval: 1
  learning_rate {
    soft_start_cosine_annealing_schedule {
      min_learning_rate: 1e-7
      max_learning_rate: 1e-4
      soft_start: 0.3
    }
  }
  regularizer {
    type: L1
    weight: 3e-5
  }
  optimizer {
    adam {
      epsilon: 1e-7
      beta1: 0.9
      beta2: 0.999
      amsgrad: false
    }
  }
  visualizer {
    enabled: true
    num_images: 1
  }
  early_stopping {
    monitor: "loss"
    patience: 10
  }
  #pretrain_model_path: "/workspace/yolo_v4/pretrained_resnet18/pretrained_object_detection_vresnet18/resnet_18.hdf5"
  resume_model_path: "/workspace/yolo_v4/experiment_dir_unpruned/weights/yolov4_resnet18_epoch_040.tlt"
}
eval_config {
  average_precision_mode: INTEGRATE
  batch_size: 1
  matching_iou_threshold: 0.45
  visualize_pr_curve: true
}
nms_config {
  confidence_threshold: 0.2
  clustering_iou_threshold: 0.45
  force_on_cpu: true
  top_k: 200
}
augmentation_config {
  hue: 0.1
  saturation: 1.5
  exposure: 1.5
  vertical_flip: 0
  horizontal_flip: 0.5
  jitter: 0.3
  output_width: 480
  output_height: 480
  output_channel: 3
  randomize_input_shape_period: 0
  mosaic_prob: 0.5
  mosaic_min_ratio: 0.2
}
dataset_config {
  data_sources: {
      tfrecords_path: "/workspace/yolo_v4/train/tfrecords/*"
      image_directory_path: "/workspace/yolo_v4/train"
  }
  include_difficult_in_training: true
  image_extension: "jpg"
  #image_extension: "png"
  target_class_mapping {
    key: "bicycle"
    value: "bicycle"
  }
  target_class_mapping {
    key: "car"
    value: "car"
  }
  target_class_mapping {
    key: "motorbike"
    value: "motorbike"
  }
  target_class_mapping {
    key: "person"
    value: "person"
  }
  target_class_mapping {
    key: "truck"
    value: "truck"
  }
  validation_data_sources {
    tfrecords_path: "/workspace/yolo_v4/val/tfrecords/*"
    image_directory_path: "/workspace/yolo_v4/val/"
  }
}
