Hi,
Please have a look at the results of training my dataset with YOLOv3 in the Transfer Learning Toolkit (TLT). With detectnet_v2 I got a reasonably good mAP on the same dataset and was able to deploy the model to DeepStream, but with YOLOv3 the mAP stays at 0.
tlt-streamanalytics:v2.0_py3
Ubuntu 18.04
GPU: GeForce 1650, 4 GB
Dataset:
Image: JPG (480x288)
Label: KITTI, for example:
car 0.0 0 0.0 65.8 103.8 146.8 152.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
person 0.0 0 0.0 70.7 80.3 81.3 102.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
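For reference, a minimal sketch of how the sample labels break down into KITTI fields (class, truncation, occlusion, alpha, then the box as left, top, right, bottom). The helper `parse_kitti_line` is hypothetical and not part of TLT; it only reads the class name and the four box coordinates.

```python
from typing import NamedTuple


class KittiBox(NamedTuple):
    name: str
    left: float
    top: float
    right: float
    bottom: float

    @property
    def width(self) -> float:
        return self.right - self.left

    @property
    def height(self) -> float:
        return self.bottom - self.top


def parse_kitti_line(line: str) -> KittiBox:
    """Parse one KITTI label line (15 whitespace-separated fields)."""
    fields = line.split()
    if len(fields) != 15:
        raise ValueError(f"expected 15 fields, got {len(fields)}")
    # Fields 4..7 hold the 2D box: left, top, right, bottom (pixels).
    left, top, right, bottom = (float(v) for v in fields[4:8])
    return KittiBox(fields[0], left, top, right, bottom)


labels = [
    "car 0.0 0 0.0 65.8 103.8 146.8 152.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0",
    "person 0.0 0 0.0 70.7 80.3 81.3 102.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0",
]
boxes = [parse_kitti_line(line) for line in labels]
for box in boxes:
    print(f"{box.name}: {box.width:.1f} x {box.height:.1f} px")
```

The box sizes printed here are what the anchor shapes in the training spec need to cover.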
TF record config file:
kitti_config {
root_directory_path: "/workspace/dataset/training"
image_dir_name: "image_2"
label_dir_name: "label_2"
image_extension: ".jpg"
partition_mode: "random"
num_partitions: 2
val_split: 20
num_shards: 10
}
image_directory_path: "/workspace/dataset/training"
Train config file:
random_seed: 42
yolo_config {
big_anchor_shape: "[(84.60, 40.60), (97.70, 61.50), (131.90, 102.10)]"
mid_anchor_shape: "[(63.00, 25.80), (44.20, 39.30), (69.00, 31.50)]"
small_anchor_shape: "[(10.60, 21.00), (15.70, 28.30), (36.00, 26.70)]"
matching_neutral_box_iou: 0.5
arch: "resnet"
nlayers: 18
arch_conv_blocks: 0
loss_loc_weight: 5.0
loss_neg_obj_weights: 50.0
loss_class_weights: 1.0
freeze_bn: True
freeze_blocks: 0
freeze_blocks: 1
}
training_config {
batch_size_per_gpu: 5
num_epochs: 10
enable_qat: false
learning_rate {
soft_start_annealing_schedule {
min_learning_rate: 5e-5
max_learning_rate: 2e-2
soft_start: 0.15
annealing: 0.8
}
}
regularizer {
type: L1
weight: 3e-5
}
}
eval_config {
validation_period_during_training: 1
average_precision_mode: INTEGRATE
batch_size: 5
matching_iou_threshold: 0.3
}
nms_config {
confidence_threshold: 0.01
clustering_iou_threshold: 0.6
top_k: 200
}
augmentation_config {
preprocessing {
output_image_width: 480
output_image_height: 288
output_image_channel: 3
min_bbox_width: 1.0
min_bbox_height: 1.0
}
spatial_augmentation {
hflip_probability: 0.5
vflip_probability: 0.0
zoom_min: 0.7
zoom_max: 1.8
translate_max_x: 8.0
translate_max_y: 8.0
}
color_augmentation {
hue_rotation_max: 25.0
saturation_shift_max: 0.20000000298
contrast_scale_max: 0.10000000149
contrast_center: 0.5
}
}
dataset_config {
data_sources: {
tfrecords_path: "/workspace/tf_records/*"
image_directory_path: "/workspace/dataset/training"
}
image_extension: "jpg"
target_class_mapping {
key: "person"
value: "person"
}
target_class_mapping {
key: "car"
value: "car"
}
target_class_mapping {
key: "bus"
value: "bus"
}
target_class_mapping {
key: "truck"
value: "truck"
}
target_class_mapping {
key: "motorcycle"
value: "motorcycle"
}
target_class_mapping {
key: "bicycle"
value: "bicycle"
}
validation_fold: 0
}
TF records results:
2020-09-27 22:53:39.245952: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
Using TensorFlow backend.
2020-09-27 22:53:41,037 - iva.detectnet_v2.dataio.build_converter - INFO - Instantiating a kitti converter
2020-09-27 22:53:41,037 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Creating output directory /workspace/tf_records
2020-09-27 22:53:41,040 - iva.detectnet_v2.dataio.kitti_converter_lib - INFO - Num images in
Train: 1039 Val: 259
2020-09-27 22:53:41,040 - iva.detectnet_v2.dataio.kitti_converter_lib - INFO - Validation data in partition 0. Hence, while choosing the validationset during training choose validation_fold 0.
2020-09-27 22:53:41,041 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 0
WARNING:tensorflow:From /home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/dataio/dataset_converter_lib.py:142: The name tf.python_io.TFRecordWriter is deprecated. Please use tf.io.TFRecordWriter instead.
2020-09-27 22:53:41,041 - tensorflow - WARNING - From /home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/dataio/dataset_converter_lib.py:142: The name tf.python_io.TFRecordWriter is deprecated. Please use tf.io.TFRecordWriter instead.
/usr/local/lib/python3.6/dist-packages/iva/detectnet_v2/dataio/kitti_converter_lib.py:273: VisibleDeprecationWarning: Reading unicode strings without specifying the encoding argument is deprecated. Set the encoding, use None for the system default.
2020-09-27 22:53:41,063 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 1
2020-09-27 22:53:41,082 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 2
2020-09-27 22:53:41,101 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 3
2020-09-27 22:53:41,119 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 4
2020-09-27 22:53:41,137 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 5
2020-09-27 22:53:41,155 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 6
2020-09-27 22:53:41,172 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 7
2020-09-27 22:53:41,191 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 8
2020-09-27 22:53:41,209 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 9
2020-09-27 22:53:41,234 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO -
Wrote the following numbers of objects:
b'car': 266
b'person': 250
b'bus': 53
b'motorcycle': 3
b'truck': 14
b'bicycle': 7
2020-09-27 22:53:41,234 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 0
2020-09-27 22:53:41,306 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 1
2020-09-27 22:53:41,379 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 2
2020-09-27 22:53:41,451 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 3
2020-09-27 22:53:41,525 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 4
2020-09-27 22:53:41,599 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 5
2020-09-27 22:53:41,673 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 6
2020-09-27 22:53:41,752 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 7
2020-09-27 22:53:41,838 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 8
2020-09-27 22:53:41,922 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 9
2020-09-27 22:53:42,014 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO -
Wrote the following numbers of objects:
b'person': 1037
b'car': 935
b'truck': 46
b'bus': 206
b'motorcycle': 12
b'bicycle': 13
2020-09-27 22:53:42,014 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Cumulative object statistics
2020-09-27 22:53:42,014 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO -
Wrote the following numbers of objects:
b'car': 1201
b'person': 1287
b'bus': 259
b'motorcycle': 15
b'truck': 60
b'bicycle': 20
2020-09-27 22:53:42,014 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Class map.
Label in GT: Label in tfrecords file
b'car': b'car'
b'person': b'person'
b'bus': b'bus'
b'motorcycle': b'motorcycle'
b'truck': b'truck'
b'bicycle': b'bicycle'
For the dataset_config in the experiment_spec, please use labels in the tfrecords file, while writing the classmap.
2020-09-27 22:53:42,014 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Tfrecords generation complete.
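To put the converter output in proportion, here is a small sketch that turns the cumulative object statistics into per-class shares. The counts are copied from the log; nothing here calls TLT.

```python
# Cumulative object counts copied from the converter log.
counts = {
    "car": 1201,
    "person": 1287,
    "bus": 259,
    "motorcycle": 15,
    "truck": 60,
    "bicycle": 20,
}

total = sum(counts.values())
shares = {name: n / total for name, n in counts.items()}

# Print classes from most to least frequent.
for name, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<10} {n:>5}  {100 * shares[name]:5.1f}%")
```

The dataset is dominated by person and car; motorcycle and bicycle each make up under 1% of the labelled objects.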
Train result:
Epoch 8/10
208/208 [==============================] - 46s 220ms/step - loss: 10.3667
Epoch 00008: saving model to /workspace/trained_model_yolo/weights/yolo_resnet18_epoch_008.tlt
Number of images in the evaluation dataset: 259
Producing predictions: 100%|████████████████████| 52/52 [00:06<00:00, 7.55it/s]
Start multi-thread per-image matching
Start to calculate AP for each class
bicycle AP 0.0
bus AP 0.0
car AP 0.0
motorcycle AP 0.0
person AP 0.0
truck AP 0.0
mAP 0.0
Epoch 9/10
208/208 [==============================] - 45s 215ms/step - loss: 9.3539
Epoch 00009: saving model to /workspace/trained_model_yolo/weights/yolo_resnet18_epoch_009.tlt
Number of images in the evaluation dataset: 259
Producing predictions: 100%|████████████████████| 52/52 [00:06<00:00, 7.63it/s]
Start multi-thread per-image matching
Start to calculate AP for each class
bicycle AP 0.0
bus AP 0.0
car AP 0.0
motorcycle AP 0.0
person AP 0.0
truck AP 0.0
mAP 0.0
Epoch 10/10
208/208 [==============================] - 46s 220ms/step - loss: 8.7018
Epoch 00010: saving model to /workspace/trained_model_yolo/weights/yolo_resnet18_epoch_010.tlt
Number of images in the evaluation dataset: 259
Producing predictions: 100%|████████████████████| 52/52 [00:06<00:00, 7.52it/s]
Start multi-thread per-image matching
Start to calculate AP for each class
bicycle AP 0.0
bus AP 0.0
car AP 0.0
motorcycle AP 0.0
person AP 0.0
truck AP 0.0
mAP 0.0
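As a quick consistency check, the step counts in the training log can be reproduced from the split sizes and batch size. This is plain arithmetic, assuming a single GPU and that the last partial batch is kept.

```python
import math

train_images = 1039  # "Train: 1039" in the converter log
val_images = 259     # "Val: 259" in the converter log
batch_size = 5       # batch_size_per_gpu and eval batch_size in the spec

# Assumption: one GPU, and the final partial batch still counts as a step.
train_steps = math.ceil(train_images / batch_size)
val_steps = math.ceil(val_images / batch_size)

print(train_steps, val_steps)  # 208 52
```

Both values match the progress bars above (208/208 training steps, 52/52 prediction batches), so the training and validation folds are being read as expected.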
I tried lowering the IoU threshold to 0.5, 0.3, and 0.2, and also used kmeans.py to obtain the anchor shapes, but the mAP is still 0.
Training for 10, 50, and 120 epochs gives the same result (0 mAP).
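As a sanity check on the anchor shapes, here is a rough sketch that compares the two sample label boxes against the nine anchors from the spec, using width/height IoU (both boxes aligned at the same corner). It assumes each anchor tuple is (width, height) in pixels at the 480x288 training resolution; `shape_iou` is a hypothetical helper, not part of TLT or kmeans.py.

```python
# Anchors copied from yolo_config; assumed to be (width, height) in pixels.
anchors = [
    (84.60, 40.60), (97.70, 61.50), (131.90, 102.10),  # big
    (63.00, 25.80), (44.20, 39.30), (69.00, 31.50),    # mid
    (10.60, 21.00), (15.70, 28.30), (36.00, 26.70),    # small
]

# Sizes of the two sample KITTI boxes (right - left, bottom - top).
boxes = {"car": (81.0, 48.2), "person": (10.6, 21.7)}


def shape_iou(a, b):
    """IoU of two boxes that share the same top-left corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    union = a[0] * a[1] + b[0] * b[1] - inter
    return inter / union


best = {}
for name, wh in boxes.items():
    # Keep the (IoU, anchor) pair with the highest IoU for this box.
    best[name] = max((shape_iou(wh, anchor), anchor) for anchor in anchors)
    iou, anchor = best[name]
    print(f"{name}: best anchor {anchor}, IoU {iou:.2f}")
```

For these two samples the best-matching anchors overlap well, so at least on this small check the anchor shapes do not look like the obvious cause.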
Is there anything wrong in my config file?
Cheers!