Mix proprietary and public datasets for retraining

You can crop based on the bboxes.
You can modify your comments and delete some private images.

This approach can be an option.
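
If it helps, below is a minimal sketch of the crop-by-bbox idea, assuming KITTI-format labels (class name in the first column, bbox left/top/right/bottom in columns 5-8); the directory names are hypothetical:

import os
from PIL import Image

# Hypothetical paths; adjust to your dataset layout.
IMAGE_DIR = "training/image_2"
LABEL_DIR = "training/label_2"
CROP_DIR = "training/crops"

os.makedirs(CROP_DIR, exist_ok=True)

for label_file in os.listdir(LABEL_DIR):
    stem = os.path.splitext(label_file)[0]
    image_path = os.path.join(IMAGE_DIR, stem + ".png")
    if not os.path.exists(image_path):
        continue
    image = Image.open(image_path)
    with open(os.path.join(LABEL_DIR, label_file)) as f:
        for i, line in enumerate(f):
            fields = line.split()
            if len(fields) < 8:
                continue
            cls = fields[0]
            # KITTI bbox: left, top, right, bottom in columns 5-8
            left, top, right, bottom = (int(round(float(v))) for v in fields[4:8])
            crop = image.crop((left, top, right, bottom))
            crop.save(os.path.join(CROP_DIR, "{}_{}_{}.png".format(stem, i, cls)))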

Also, as mentioned earlier, please try to run more experiments.
You already have the result of:

  • detectnet_v2 network + resnet18 backbone
    bicycle 34.8536
    electric_bicycle 81.7761

Please try the following.

  • Detectnet_v2 network + resnet50 backbone, train on 2 classes (bicycle and electric_bicycle)
  • Yolov4_tiny network + resnet18 backbone, train on 2 classes (bicycle and electric_bicycle)

I'm training this now, and I'm aware the process takes much longer (about 8 hours on an RTX 3090 24G, batch size 16, 80 epochs) than training detectnet_v2 resnet18 on the same 2-class dataset, and it finally ran out of GPU memory:

ETA: 4:05 - loss: 159.9428
2022-03-09 09:07:09,735 [ERROR] iva.common.utils: Ran out of GPU memory, please lower the batch size, use a smaller input resolution, use a smaller backbone, or enable model parallelism for supported TLT architectures (see TLT documentation).

this is the training spec:

random_seed: 42
yolov4_config {
big_anchor_shape: "[(498.00, 489.00), (427.00, 326.00), (311.00, 417.00)]"
mid_anchor_shape: "[(210.00, 257.00), (101.00, 161.00), (60.00, 43.00)]"
box_matching_iou: 0.25
matching_neutral_box_iou: 0.5
arch: "cspdarknet_tiny"
loss_loc_weight: 1.0
loss_neg_obj_weights: 1.0
loss_class_weights: 1.0
label_smoothing: 0.0
big_grid_xy_extend: 0.05
mid_grid_xy_extend: 0.05
freeze_bn: false
#freeze_blocks: 0
force_relu: false
}
training_config {
batch_size_per_gpu: 16
num_epochs: 80
enable_qat: true
checkpoint_interval: 10
learning_rate {
soft_start_cosine_annealing_schedule {
min_learning_rate: 1e-7
max_learning_rate: 1e-4
soft_start: 0.3
}
}
regularizer {
type: L1
weight: 3e-5
}
optimizer {
adam {
epsilon: 1e-7
beta1: 0.9
beta2: 0.999
amsgrad: false
}
}
pretrain_model_path: "/workspace/tao-experiments/yolo_v4_tiny/pretrained_cspdarknet_tiny/pretrained_object_detection_vcspdarknet_tiny/cspdarknet_tiny.hdf5"
}
eval_config {
average_precision_mode: SAMPLE
batch_size: 16
matching_iou_threshold: 0.5
}
nms_config {
confidence_threshold: 0.001
clustering_iou_threshold: 0.5
force_on_cpu: true
top_k: 200
}
augmentation_config {
hue: 0.1
saturation: 1.5
exposure:1.5
vertical_flip:0
horizontal_flip: 0.5
jitter: 0.3
output_width: 960
output_height: 1280
output_channel: 3
randomize_input_shape_period: 10
mosaic_prob: 0.5
mosaic_min_ratio:0.2
}
dataset_config {
data_sources: {
tfrecords_path: "/workspace/tao-experiments/data/training/tfrecords/train*"
image_directory_path: "/workspace/tao-experiments/data/training"
}
include_difficult_in_training: true
image_extension: "png"
target_class_mapping {
key: "electric_bicycle"
value: "electric_bicycle"
}
target_class_mapping {
key: "bicycle"
value: "bicycle"
}
validation_data_sources: {
tfrecords_path: "/workspace/tao-experiments/data/val/tfrecords/val*"
image_directory_path: "/workspace/tao-experiments/data/val"
}
}

dataset statistics:

2022-03-09 06:09:18,812 [INFO] root: Cumulative object statistics
2022-03-09 06:09:18,812 [INFO] root: {
"bicycle": 1644,
"people_unused": 3918,
"electric_bicycle": 1853,
"another_unused_custom_object": 2034
}
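
(As a side note, cumulative counts like the ones above can be cross-checked against the KITTI label files with a few lines of Python; the label directory here is an assumption.)

import os
from collections import Counter

LABEL_DIR = "training/label_2"  # hypothetical path

counts = Counter()
for label_file in os.listdir(LABEL_DIR):
    with open(os.path.join(LABEL_DIR, label_file)) as f:
        for line in f:
            fields = line.split()
            if fields:
                counts[fields[0]] += 1  # first KITTI column is the class name
print(dict(counts))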

is this common?

It is not normal. Did you use kmeans to generate anchor shapes for your labels?

Now I've reduced the batch size to 12, since the out-of-GPU-memory error failed my last try, but the memory usage seems to be the same as with 16:

GPU memory usage by nvidia-smi: 19173MiB / 24576MiB

Yes, the command I used (I treat -x and -y as the image's width and height):

!tao yolo_v4_tiny kmeans -l $DATA_DOWNLOAD_DIR/training/label_2 \
                          -i $DATA_DOWNLOAD_DIR/training/image_2 \
                          -n 6 \
                          -x 960 \
                          -y 1280

the output:

2022-03-09 14:03:40,440 [INFO] root: Registry: ['nvcr.io']
2022-03-09 14:03:40,473 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit-tf:v3.21.11-tf1.15.5-py3
2022-03-09 14:03:40,487 [WARNING] tlt.components.docker_handler.docker_handler:
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/shao/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
Using TensorFlow backend.
Start optimization iteration: 1
Start optimization iteration: 11
Start optimization iteration: 21
Please use following anchor sizes in YOLO config:
(60.00, 43.00)
(101.00, 161.00)
(210.00, 257.00)
(311.00, 417.00)
(427.00, 326.00)
(498.00, 489.00)
2022-03-09 14:03:43,324 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.

I then put those 6 tuples into yolo_v4_tiny_train_kitti.txt:


yolov4_config {
big_anchor_shape: "[(498.00, 489.00), (427.00, 326.00), (311.00, 417.00)]"
mid_anchor_shape: "[(210.00, 257.00), (101.00, 161.00), (60.00, 43.00)]"
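
As a rough cross-check of those anchors (and of the -x/-y width/height mapping), the clustering can be approximated from the KITTI labels with plain k-means in Python. This sketch uses Euclidean k-means rather than whatever distance the TAO kmeans tool uses internally, and it assumes the images are already at the 960x1280 training resolution, so treat the numbers as approximate; the label path is hypothetical:

import os
import numpy as np
from sklearn.cluster import KMeans

LABEL_DIR = "training/label_2"  # hypothetical path

# Collect (width, height) for every box in the KITTI labels.
sizes = []
for label_file in os.listdir(LABEL_DIR):
    with open(os.path.join(LABEL_DIR, label_file)) as f:
        for line in f:
            fields = line.split()
            if len(fields) < 8:
                continue
            left, top, right, bottom = map(float, fields[4:8])
            sizes.append((right - left, bottom - top))

kmeans = KMeans(n_clusters=6, n_init=10, random_state=0).fit(np.array(sizes))
# Sort small to large by area, matching the ordering printed above.
for w, h in sorted(kmeans.cluster_centers_, key=lambda wh: wh[0] * wh[1]):
    print("({:.2f}, {:.2f})".format(w, h))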

From your previous result, the loss keeps decreasing. I am afraid it is normal.

Could you try setting a lower batch size (for example, 4) and train again? Please use a new results folder.

OK, I purged the experiment_dir_unpruned folder, set the batch_size to 4 for both train and val, and restarted the training.
The time for one epoch seems similar to before: around 370 seconds.
GPU memory usage: 10998MiB / 24576MiB.

Refer to the YOLOv4-tiny — TAO Toolkit 3.22.05 documentation: if mosaic augmentation is disabled (mosaic_prob=0), training with the TFRecords format is faster.

Either detectnet_v2 or yolov4_tiny can be deployed with DeepStream. For yolov4_tiny, you can refer to https://github.com/NVIDIA-AI-IOT/deepstream_tao_apps/tree/master/configs/yolov4-tiny_tao and YOLOv4-tiny - NVIDIA Docs

For detection, I suggest you check what the test scenario for your model is. Usually it is better to train on images that are similar to the actual test scenario. Your training images above cover a lot of different scenarios. To improve mAP, if possible, I suggest you add more training images of the actual test scenario where your model will be used.

Hi Morgan,
My business doesn't need to detect bicycles (and it won't be possible to collect enough data for them); the reason to include bicycle images from a public dataset is that the original model (based on detectnet_v2) could not distinguish bicycles from electric bicycles.

Since I've done the 2-class classification experiment and the 2-class yolo_v4_tiny experiment, and both showed that the 2 classes can be separated better, I just wonder: does this mean I should switch to yolo_v4_tiny, or is there anything else I can still improve with detectnet_v2?

OK, your business doesn't need to detect bicycles. So, as far as I know, you will train the classes below.

  • "electric_bicycle"
  • "people"
  • "another_custom_obj"

I suggest you reorganize the training images so that all of them come from your proprietary dataset.
Then train with yolo_v4_tiny and then detectnet_v2.
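
If it helps with the reorg, here is a minimal sketch (the class names and directory layout are assumptions) that keeps only the images containing your target classes and rewrites the label files without any other class:

import os
import shutil

# Hypothetical layout and class list; adjust to your proprietary dataset.
SRC_IMAGES = "proprietary/image_2"
SRC_LABELS = "proprietary/label_2"
DST_IMAGES = "reorg/image_2"
DST_LABELS = "reorg/label_2"
KEEP = {"electric_bicycle", "people", "another_custom_obj"}

os.makedirs(DST_IMAGES, exist_ok=True)
os.makedirs(DST_LABELS, exist_ok=True)

for label_file in os.listdir(SRC_LABELS):
    with open(os.path.join(SRC_LABELS, label_file)) as f:
        kept = [line for line in f if line.split() and line.split()[0] in KEEP]
    if not kept:
        continue  # skip images with none of the target classes
    stem = os.path.splitext(label_file)[0]
    with open(os.path.join(DST_LABELS, label_file), "w") as f:
        f.writelines(kept)
    shutil.copy(os.path.join(SRC_IMAGES, stem + ".png"), DST_IMAGES)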

I think I've already done the suggested job for detectnet_v2 with the full set of classes at the very beginning; the model does not perform well when a bicycle comes in front of the camera, it mistakes the bicycle for an electric bicycle, and this is not acceptable for my business, since the latter object triggers an alarm.

Do you mean the training that was run on your proprietary dataset plus some public images?
I am afraid the public images are quite different from your proprietary dataset (i.e. the target scenario). That will have a negative effect during training. That's why I suggest you reorganize the training dataset and then retry.

At the very beginning, I trained detectnet_v2 entirely on my proprietary dataset (2000 labels each for electric_bicycle, people, and another_custom_obj) with no public data; the result is not good when a bicycle comes in front of the camera, the model detects it as an electric bicycle.
Then, following the suggestion that adding one more dedicated bicycle class might help, and since I have no way to collect bicycle data from my production scenario, I had to turn to a public bicycle dataset; no luck, the trained model still performs poorly.

Do you have the AP and mAP for this training?

My previous post was actually made right after the full proprietary-dataset training. I didn't keep the mAP, but as I remember, the electric_bicycle AP was about 50%, obviously lower than the other classes, and the inference testing looked bad.

OK, please consider the following approaches.

  • If possible, add more training images of the target scenario.
  • Use a larger backbone.
  • Train with YOLOv4_tiny first.

By my visual comparison of the annotated images in the inference-testing folder for the 2 models on the same dataset:

  • detectnet_v2
  • yolo_v4_tiny

I can clearly see that yolo_v4_tiny has better accuracy, and I rarely see the confusing detections. Does this mean anything?

Both detectnet_v2 and yolov4_tiny can run at a high FPS, and both have a good mAP on the public KITTI dataset. That's why I suggest you try yolov4_tiny as well.
Actually, some NGC models (TrafficCamNet, DashCamNet, etc.) are based on the detectnet_v2 network, and they all perform well.
For your case, if you use detectnet_v2, you may need to fine-tune parameters or use a deeper backbone, etc.
Also, deep metric learning is on the TAO roadmap; it may be helpful for this case.
