AP, precision, and recall remain zero on a custom dataset when training FasterRCNN with ResNet-18

Please provide the following information when requesting support.

• Hardware (T4/V100/Xavier/Nano/etc)
T4
• Network Type (Detectnet_v2/Faster_rcnn/Yolo_v4/LPRnet/Mask_rcnn/Classification/etc)
Faster_rcnn
• TLT Version (Please run "tlt info --verbose" and share "docker_tag" here)
docker_tag: v3.21.08-py3
• Training spec file (if you have one, please share it here)

Copyright (c) 2017-2020, NVIDIA CORPORATION. All rights reserved.

random_seed: 42
enc_key: 'NXVodTI0MXNnZGtzdXBic2o0cTIwbmp0bnA6N2IwZDEyMGYtMGZiOS00MDNlLTllOGMtOGMzOTJiYmRlMzk0'
verbose: True
model_config {
input_image_config {
image_type: RGB
image_channel_order: 'bgr'
size_height_width {
#height: 2160
#width: 3840
height: 384
width: 1248
}
image_channel_mean {
key: 'b'
value: 103.939
}
image_channel_mean {
key: 'g'
value: 116.779
}
image_channel_mean {
key: 'r'
value: 123.68
}
image_scaling_factor: 1.0
max_objects_num_per_image: 100
}
arch: "resnet:18"
anchor_box_config {
scale: 64.0
scale: 128.0
scale: 256.0
ratio: 1.0
ratio: 0.5
ratio: 2.0
}
freeze_bn: False
roi_mini_batch: 256
rpn_stride: 16
use_bias: False
roi_pooling_config {
pool_size: 7
pool_size_2x: False
}
all_projections: True
use_pooling: False
}
dataset_config {
data_sources: {
tfrecords_path: "/home/ubuntu/cv_samples_v1.2.0/data/tfrecords/kitti_trainval/kitti_trainval*"
image_directory_path: "/home/ubuntu/cv_samples_v1.2.0/data/training/"
}
image_extension: 'jpg'
target_class_mapping {
key: 'car'
value: 'car'
}
target_class_mapping {
key: 'hvac'
value: 'hvac'
}
target_class_mapping {
key: 'person'
value: 'person'
}
validation_fold: 0
}
augmentation_config {
preprocessing {
output_image_width: 1248
output_image_height: 384
output_image_channel: 3
min_bbox_width: 1.0
min_bbox_height: 1.0
enable_auto_resize: True
}
spatial_augmentation {
hflip_probability: 0.5
vflip_probability: 0.0
zoom_min: 1.0
zoom_max: 1.0
translate_max_x: 0
translate_max_y: 0
}
color_augmentation {
hue_rotation_max: 0.0
saturation_shift_max: 0.0
contrast_scale_max: 0.0
contrast_center: 0.5
}
}
training_config {
enable_augmentation: True
enable_qat: False
batch_size_per_gpu: 1
num_epochs: 50
pretrained_weights: "/home/ubuntu/cv_samples_v1.2.0/faster_rcnn/resnet_18.hdf5"
#resume_from_model: "/home/ubuntu/cv_samples_v1.2.0/faster_rcnn/frcnn_kitti_resnet18.epoch2.tlt"
output_model: "/home/ubuntu/cv_samples_v1.2.0/faster_rcnn/frcnn_kitti_resnet18.tlt"
rpn_min_overlap: 0.3
rpn_max_overlap: 0.7
classifier_min_overlap: 0.0
classifier_max_overlap: 0.5
gt_as_roi: False
std_scaling: 1.0
classifier_regr_std {
key: 'x'
value: 10.0
}
classifier_regr_std {
key: 'y'
value: 10.0
}
classifier_regr_std {
key: 'w'
value: 5.0
}
classifier_regr_std {
key: 'h'
value: 5.0
}

rpn_mini_batch: 256
rpn_pre_nms_top_N: 12000
rpn_nms_max_boxes: 2000
rpn_nms_overlap_threshold: 0.7

regularizer {
type: L2
weight: 1e-4
}

optimizer {
sgd {
lr: 0.002
momentum: 0.9
decay: 0.0
nesterov: False
}
}

learning_rate {
soft_start {
base_lr: 0.02
start_lr: 0.002
soft_start: 0.1
annealing_points: 0.8
annealing_points: 0.9
annealing_divider: 10.0
}
}

lambda_rpn_regr: 1.0
lambda_rpn_class: 1.0
lambda_cls_regr: 1.0
lambda_cls_class: 1.0
}
inference_config {
images_dir: '/home/ubuntu/cv_samples_v1.2.0/data/training/resize_images'
model: '/home/ubuntu/cv_samples_v1.2.0/faster_rcnn/frcnn_kitti_resnet18.epoch50.tlt'
batch_size: 1
detection_image_output_dir: '/home/ubuntu/cv_samples_v1.2.0/faster_rcnn/inference_results_imgs'
labels_dump_dir: '/home/ubuntu/cv_samples_v1.2.0/faster_rcnn/inference_dump_labels'
rpn_pre_nms_top_N: 6000
rpn_nms_max_boxes: 300
rpn_nms_overlap_threshold: 0.7
object_confidence_thres: 0.0001
bbox_visualize_threshold: 0.6
classifier_nms_max_boxes: 100
classifier_nms_overlap_threshold: 0.3
}
evaluation_config {
model: '/home/ubuntu/cv_samples_v1.2.0/faster_rcnn/frcnn_kitti_resnet18.epoch50.tlt'
batch_size: 1
validation_period_during_training: 1
rpn_pre_nms_top_N: 6000
rpn_nms_max_boxes: 300
rpn_nms_overlap_threshold: 0.7
classifier_nms_max_boxes: 100
classifier_nms_overlap_threshold: 0.3
object_confidence_thres: 0.0001
use_voc07_11point_metric: False
gt_matching_iou_threshold: 0.5
}

• How to reproduce the issue? (This is for errors. Please share the command line and the detailed log here.)
I cannot share the data.

Hi,

I am training FasterRCNN on a custom dataset with three classes: car, person, and hvac. I barely changed the configuration file, apart from the learning rate (from 0.02 to 0.2). All class mappings are lowercase, and I trained the model for 50 epochs, but in every epoch the AP, precision, and recall are zero. In total I have 120 images, and I only intend to fine-tune the model. The following is the format of the labels for one image:

car 0 0 0 621 98 638 91 0 0 0 0 0 0 0
car 0 0 0 473 33 477 30 0 0 0 0 0 0 0
car 0 0 0 493 51 498 47 0 0 0 0 0 0 0
The file names start from 0, and all images are resized to the model's default input size.

Am I missing something?
Also, I would like to know whether the model resizes images on the fly, or whether I can set a different image size (i.e., change the model's input and output size).
Regards

Refer to FasterRCNN — TAO Toolkit 3.0 documentation

Please follow the format of Data Annotation Format — TAO Toolkit 3.0 documentation.
For example,
cyclist 0.00 0 0.00 665.45 160.00 717.93 217.99 0.00 0.00 0.00 0.00 0.00 0.00 0.00
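As a quick sanity check along those lines: in the example labels posted earlier, the ymax values (91, 30, 47) are smaller than the ymin values (98, 33, 51), so those boxes are degenerate, which by itself can drive AP to zero. Below is a minimal Python sketch that flags such problems before generating TFRecords; the label_2 directory path is an assumption, and the class names are taken from the spec above:

import glob
import os

# Hypothetical label directory; adjust to where your KITTI .txt labels live.
LABEL_DIR = "/home/ubuntu/cv_samples_v1.2.0/data/training/label_2"
CLASSES = {"car", "person", "hvac"}  # class names from target_class_mapping

for path in sorted(glob.glob(os.path.join(LABEL_DIR, "*.txt"))):
    with open(path) as f:
        for lineno, line in enumerate(f, start=1):
            fields = line.split()
            if not fields:
                continue
            # KITTI requires exactly 15 space-separated fields per object.
            if len(fields) != 15:
                print(f"{path}:{lineno}: expected 15 fields, got {len(fields)}")
                continue
            if fields[0].lower() not in CLASSES:
                print(f"{path}:{lineno}: unmapped class '{fields[0]}'")
            # Fields 4-7 are xmin, ymin, xmax, ymax in pixels.
            xmin, ymin, xmax, ymax = map(float, fields[4:8])
            if xmax <= xmin or ymax <= ymin:
                print(f"{path}:{lineno}: degenerate bbox ({xmin}, {ymin}, {xmax}, {ymax})")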

Thanks for your response. However, the training stops when I add more data:

ss_loss: 0.4698 - dense_class_td_loss: 1.3495 - dense_regress_td_loss: 0.0609 - dense_class_td_acc: 0.7056Traceback (most recent call last):
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/faster_rcnn/scripts/train.py", line 94, in <module>
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/faster_rcnn/scripts/train.py", line 82, in <module>
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/faster_rcnn/scripts/train.py", line 77, in main
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/faster_rcnn/models/model_builder.py", line 747, in train
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/training.py", line 1039, in fit
    validation_steps=validation_steps)
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/training_arrays.py", line 154, in fit_loop
    outs = f(ins)
  File "/usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py", line 2715, in __call__
    return self._call(inputs)
  File "/usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py", line 2675, in _call
    fetched = self._callable_fn(*array_vals)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1472, in __call__
    run_metadata_ptr)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument: assertion failed: [Maximum number of objects in image exceeds the limit 100] [Condition x <= y did not hold element-wise:] [x (strided_slice_17:0) = ] [107] [y (assert_less_equal/y:0) = ] [100]
	 [[{{node assert_less_equal/Assert/Assert}}]]
	 [[proposal_target_1/cond_17/Min/Switch/_4487]]
  (1) Invalid argument: assertion failed: [Maximum number of objects in image exceeds the limit 100] [Condition x <= y did not hold element-wise:] [x (strided_slice_17:0) = ] [107] [y (assert_less_equal/y:0) = ] [100]
	 [[{{node assert_less_equal/Assert/Assert}}]]
0 successful operations.
0 derived errors ignored.

Regards

See FasterRCNN — TAO Toolkit 3.0 documentation

The maximum number of objects in an image depends on the dataset. It is important to set max_objects_num_per_image to a value no less than the largest per-image object count in your dataset; otherwise, training will fail with the assertion shown above.
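A quick way to choose a safe value is to scan the label files and take the largest per-image object count. A minimal Python sketch, assuming a KITTI-style label directory (the path is illustrative):

import glob
import os

# Hypothetical label directory; point this at your KITTI labels.
LABEL_DIR = "/home/ubuntu/cv_samples_v1.2.0/data/training/label_2"

max_objects, busiest = 0, None
for path in glob.glob(os.path.join(LABEL_DIR, "*.txt")):
    with open(path) as f:
        count = sum(1 for line in f if line.strip())  # one object per non-empty line
    if count > max_objects:
        max_objects, busiest = count, path

print(f"Max objects in a single image: {max_objects} ({busiest})")
# Set max_objects_num_per_image in model_config to at least this value;
# the traceback above shows an image with 107 objects vs. the configured limit of 100.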
