Faster RCNN: training

cbasavaraj · January 18, 2020, 1:04pm

Hi again, I'm moving my previous question to a new thread as it is specific to Faster RCNN.

In addition, I have one more question on training. I get a lot of “No GT bboxes found in …” in spite of moving images without annotations out of the training folder. I have checked and rechecked the corresponding labels/image_xx.txt file, and the annotations do exist.

No GT bboxes found in image/workspace/nvidia-tlt/data/KITTI/train/images/yyOdbcpCQ6s__000289.jpg
No GT bboxes found in image/workspace/nvidia-tlt/data/KITTI/train/images/YMgJt9yBLgY__001842.jpg
No GT bboxes found in image/workspace/nvidia-tlt/data/KITTI/train/images/vt0hblIsHiY__000120.jpg
No GT bboxes found in image/workspace/nvidia-tlt/data/KITTI/train/images/F2Bw4OLZHq8__001746.jpg
No GT bboxes found in image/workspace/nvidia-tlt/data/KITTI/train/images/vlo00n25e18__000709.jpg
No GT bboxes found in image/workspace/nvidia-tlt/data/KITTI/train/images/6hT5xOszncI__000889.jpg
338/15736 […] - ETA: 2:36:47 - rpn_cls: 0.2377 - rpn_regr: 0.0102 - detector_cls: 0.0515 - detector_regr: 0.0169
No GT bboxes found in image/workspace/nvidia-tlt/data/KITTI/train/images/SLL5ziDWc6k__001346.jpg
339/15736 […] - ETA: 2:36:40 - rpn_cls: 0.2373 - rpn_regr: 0.0102 - detector_cls: 0.0515 - detector_regr: 0.0169
No positive ROIs.

In spite of this, the model trains ok (losses go down) and so I tried to evaluate after a couple of epochs. What I get is:

2020-01-18 11:35:06,570 [INFO] /usr/local/lib/python2.7/dist-packages/iva/faster_rcnn/scripts/test.pyc: 640/642
2020-01-18 11:35:06,742 [INFO] /usr/local/lib/python2.7/dist-packages/iva/faster_rcnn/scripts/test.pyc: Elapsed time = 0.172304868698
2020-01-18 11:35:06,750 [INFO] /usr/local/lib/python2.7/dist-packages/iva/faster_rcnn/scripts/test.pyc: 641/642
2020-01-18 11:35:06,930 [INFO] /usr/local/lib/python2.7/dist-packages/iva/faster_rcnn/scripts/test.pyc: Elapsed time = 0.179259061813

Class AP precision recall

my_class 0.0000 0.0000 0.0000

mAP = 0.0000

Here’s my training spec file:

random_seed: 42
enc_key: 'my_key'
verbose: True
network_config {
  input_image_config {
    image_type: RGB
    image_channel_order: 'bgr'
    size_height_width {
      height: 576
      width: 1024
    }
    image_channel_mean {
      key: 'b'
      value: 103.939
    }
    image_channel_mean {
      key: 'g'
      value: 116.779
    }
    image_channel_mean {
      key: 'r'
      value: 123.68
    }
    image_scaling_factor: 1.0
  }
  feature_extractor: "resnet:18"
  anchor_box_config {
    scale: 64.0
    scale: 128.0
    scale: 256.0
    ratio: 1.0
    ratio: 0.5
    ratio: 2.0
  }
  freeze_bn: True
  freeze_blocks: 0
  freeze_blocks: 1
  roi_mini_batch: 256
  rpn_stride: 16
  conv_bn_share_bias: False
  roi_pooling_config {
    pool_size: 7
    pool_size_2x: False
  }
  all_projections: True
  use_pooling: False
}
training_config {
  kitti_data_config {
    images_dir: '/workspace/nvidia-tlt/data/KITTI/train/images'
    labels_dir: '/workspace/nvidia-tlt/data/KITTI/train/labels'
  }
  training_data_parser: 'raw_kitti'
  data_augmentation {
    use_augmentation: True
    spatial_augmentation {
      hflip_probability: 0.0
      vflip_probability: 0.0
      zoom_min: 1.0
      zoom_max: 1.0
      translate_max_x: 0
      translate_max_y: 0
    }
    color_augmentation {
      color_shift_stddev: 0.0
      hue_rotation_max: 0.0
      saturation_shift_max: 0.0
      contrast_scale_max: 0.0
      contrast_center: 0.5
    }
  }
  num_epochs: 6
  class_mapping {
    key: 'my_class'
    value: 0
  }
  class_mapping {
    key: 'background'
    value: 1
  }
  pretrained_weights: "/workspace/nvidia-tlt/data/faster_rcnn/resnet18.h5"
  pretrained_model: ""
  output_weights: "/workspace/nvidia-tlt/ckpts/my_frcnn.tltw"
  output_model: "/workspace/nvidia-tlt/ckpts/my_frcnn.tlt"
  rpn_min_overlap: 0.3
  rpn_max_overlap: 0.7
  classifier_min_overlap: 0.0
  classifier_max_overlap: 0.5
  gt_as_roi: False
  std_scaling: 1.0
  classifier_regr_std {
    key: 'x'
    value: 10.0
  }
  classifier_regr_std {
    key: 'y'
    value: 10.0
  }
  classifier_regr_std {
    key: 'w'
    value: 5.0
  }
  classifier_regr_std {
    key: 'h'
    value: 5.0
  }

  rpn_mini_batch: 256
  rpn_pre_nms_top_N: 12000
  rpn_nms_max_boxes: 2000
  rpn_nms_overlap_threshold: 0.7

  reg_config {
    reg_type: 'L2'
    weight_decay: 1e-4
  }

  optimizer {
    adam {
      lr: 0.00001
      beta_1: 0.9
      beta_2: 0.999
      decay: 0.0
    }
  }

  lr_scheduler {
    step {
      base_lr: 0.00001
      gamma: 1.0
      step_size: 30
    }
  }

  lambda_rpn_regr: 1.0
  lambda_rpn_class: 1.0
  lambda_cls_regr: 1.0
  lambda_cls_class: 1.0

  inference_config {
    images_dir: '/workspace/nvidia-tlt/data/KITTI/valid/images'
    model: '/workspace/nvidia-tlt/ckpts/my_frcnn.epoch6.tlt'
    detection_image_output_dir: '/workspace/nvidia-tlt/out/frcnn/images'
    labels_dump_dir: '/workspace/nvidia-tlt/out/frcnn/labels_inference'
    rpn_pre_nms_top_N: 6000
    rpn_nms_max_boxes: 300
    rpn_nms_overlap_threshold: 0.7
    bbox_visualize_threshold: 0.6
    classifier_nms_max_boxes: 300
    classifier_nms_overlap_threshold: 0.3
  }

  evaluation_config {
    dataset {
      images_dir: '/workspace/nvidia-tlt/data/KITTI/valid/images'
      labels_dir: '/workspace/nvidia-tlt/data/KITTI/valid/labels'
    }
    data_parser: 'raw_kitti'
    model: '/workspace/nvidia-tlt/ckpts/my_frcnn.epoch6.tlt'
    labels_dump_dir: '/workspace/nvidia-tlt/out/frcnn/labels_evaluation'
    rpn_pre_nms_top_N: 6000
    rpn_nms_max_boxes: 300
    rpn_nms_overlap_threshold: 0.7
    classifier_nms_max_boxes: 300
    classifier_nms_overlap_threshold: 0.3
    object_confidence_thres: 0.0001
    use_voc07_11point_metric: False
  }

}

Thanks for your help!

cbasavaraj · January 18, 2020, 1:10pm

For example, some of the label files reported as having no GT boxes:

root@82f3f056a10d:/workspace/nvidia-tlt# cat data/KITTI/train/labels/6hT5xOszncI__000889.txt
my_class 0.0 0 0.0 667.0 458.0 496.0 438.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
root@82f3f056a10d:/workspace/nvidia-tlt# cat data/KITTI/train/labels/F2Bw4OLZHq8__001746.txt
my_class 0.0 0 0.0 448.0 421.0 246.0 219.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
root@82f3f056a10d:/workspace/nvidia-tlt# cat data/KITTI/train/labels/vt0hblIsHiY__000120.txt
my_class 0.0 0 0.0 422.0 426.0 143.0 158.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
root@82f3f056a10d:/workspace/nvidia-tlt# cat data/KITTI/train/labels/YMgJt9yBLgY__001842.txt
my_class 0.0 0 0.0 959.0 401.0 215.0 247.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
root@82f3f056a10d:/workspace/nvidia-tlt# cat data/KITTI/train/labels/yyOdbcpCQ6s__000289.txt
my_class 0.0 0 0.0 661.0 427.0 371.0 455.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
root@82f3f056a10d:/workspace/nvidia-tlt# cat data/KITTI/train/labels/SLL5ziDWc6k__001346.txt
my_class 0.0 0 0.0 1163.0 786.0 107.0 155.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
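
A quick sanity check for label lines like the ones above, as a minimal sketch only. It assumes the standard 15-field KITTI object label layout, where fields 5 to 8 are the box's left, top, right and bottom pixel coordinates. The helper names `parse_kitti_bbox` and `is_valid_corner_box` are hypothetical and are not part of TLT.

```python
def parse_kitti_bbox(line):
    """Return (xmin, ymin, xmax, ymax) from one KITTI label line.

    Assumes the 15-field KITTI layout: type, truncated, occluded, alpha,
    4 bbox values, 3 dimensions, 3 location values, rotation_y.
    """
    fields = line.split()
    if len(fields) != 15:
        raise ValueError("expected 15 fields, got {}".format(len(fields)))
    xmin, ymin, xmax, ymax = (float(v) for v in fields[4:8])
    return xmin, ymin, xmax, ymax


def is_valid_corner_box(box):
    """A corner-format box needs right > left and bottom > top."""
    xmin, ymin, xmax, ymax = box
    return xmax > xmin and ymax > ymin


# First label line quoted in this post.
sample = "my_class 0.0 0 0.0 667.0 458.0 496.0 438.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0"
box = parse_kitti_bbox(sample)
print(box, "valid" if is_valid_corner_box(box) else "INVALID as corner-format box")
```

Read as corner coordinates, this line has a right edge (496.0) to the left of its left edge (667.0), so the check flags it. A degenerate box like that is one plausible reason for a loader to report no ground-truth boxes, although the thread does not show how the TLT data loader filters boxes.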

Morganh · January 18, 2020, 2:18pm

Hi,
Could you please cross-check with the KITTI dataset, following the Jupyter notebook?
You can create a new training spec, point its training dataset paths at KITTI, and run a test to see what happens.

I have not found any obvious error in your spec yet. I still suspect there is a mismatch somewhere in the image paths, the label paths, the images folder, etc.

Testing with another dataset is a quick way to isolate the problem.

cbasavaraj · January 18, 2020, 3:52pm

Training on KITTI in tlt-experiments works fine. I copied the spec files from tlt-experiments and changed only the paths and the class labels. It’s mystifying at the moment. Gonna take a break, will check on Monday, thanks.

cbasavaraj · January 21, 2020, 8:12pm

Hi, I figured out what was wrong. My bboxes were in [x, y, w, h] format. Once I translated them to [x1, y1, x2, y2] format, training started working fine. Thanks for the support!
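
For anyone hitting the same problem, the conversion described here can be sketched roughly as follows. This is an illustration, not the poster's actual script: it assumes that (x, y) is the top-left corner of the box (not its center), that each label line has the 15 KITTI fields, and the function name `xywh_to_xyxy_line` is hypothetical.

```python
def xywh_to_xyxy_line(line):
    """Rewrite the bbox fields of one KITTI label line.

    Input bbox fields (positions 5-8) are assumed to be x, y, w, h with
    (x, y) the top-left corner. Output fields are x1, y1, x2, y2.
    No clipping to the image size is done here.
    """
    fields = line.split()
    if len(fields) != 15:
        raise ValueError("expected 15 fields, got {}".format(len(fields)))
    x, y, w, h = (float(v) for v in fields[4:8])
    corners = (x, y, x + w, y + h)
    fields[4:8] = ["{:.1f}".format(v) for v in corners]
    return " ".join(fields)


# One of the label lines quoted earlier in the thread.
old = "my_class 0.0 0 0.0 667.0 458.0 496.0 438.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0"
print(xywh_to_xyxy_line(old))
```

Applied to every line of every label file, this gives boxes whose right and bottom edges are greater than their left and top edges. If the annotations store the box center instead of the top-left corner, subtract half the width and height first.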
