How to train TAO Toolkit models on COCO Dataset?

benjamin.hogm · May 19, 2023, 3:33am

Description
While performing transfer learning with the PeopleNet model on the COCO dataset using TAO Toolkit, we encountered several training performance issues. They are summarised below.

Issue Summary
Unable to improve person and bag detection through transfer learning. Baseline performance was below 10% mAP before training, and training brought no improvement. In some cases performance degraded to 0% mAP.

Approaches used:

using a balanced dataset with the same number of labels for each class (~12,000)
using a range of initial learning rates from 5e-6 to 0.1
removing all other classes from the COCO dataset and retaining only the person and bag classes
training at various resolutions: (960, 544), (1280, 720)
training with and without freezing layers
using ground truth bounding box labels that are tightly cropped to the objects
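
For readers reproducing the class-filtering step, here is a minimal sketch of reducing a COCO annotation dictionary to two classes and writing KITTI-style label lines, which is the label layout DetectNet_v2 reads. The thread does not say which COCO categories were merged into "bag", so the mapping below (backpack, handbag, suitcase) and the function name are assumptions for illustration only.

```python
from collections import defaultdict

# Hypothetical mapping from COCO category names to the two target classes.
# Treating backpack/handbag/suitcase as "bag" is an assumption, not from the thread.
CLASS_MAP = {
    "person": "person",
    "backpack": "bag",
    "handbag": "bag",
    "suitcase": "bag",
}


def coco_to_kitti_lines(coco, class_map=CLASS_MAP):
    """Return {image_id: [KITTI label lines]} for the classes in class_map."""
    id_to_name = {c["id"]: c["name"] for c in coco["categories"]}
    labels = defaultdict(list)
    for ann in coco["annotations"]:
        target = class_map.get(id_to_name.get(ann["category_id"]))
        if target is None or ann.get("iscrowd", 0):
            continue  # drop every other class and crowd regions
        x, y, w, h = ann["bbox"]  # COCO boxes are [x_min, y_min, width, height]
        # KITTI line: class, truncation, occlusion, alpha, 4 bbox corners,
        # then 7 3D fields that are left at zero for 2D detection.
        labels[ann["image_id"]].append(
            f"{target} 0.00 0 0.00 {x:.2f} {y:.2f} {x + w:.2f} {y + h:.2f} "
            "0.00 0.00 0.00 0.00 0.00 0.00 0.00"
        )
    return dict(labels)
```

Each list can then be written to one .txt file per image. Counting the lines per class before training is a quick way to confirm the dataset is as balanced as intended.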

Configuration

Hardware (T4/V100/Xavier/Nano/etc): NVIDIA GeForce RTX 3070 Ti, T4
Network Type - Detectnet_v2
TLT Version - 4.0.0
Training spec file - refer to attachment
How to reproduce the issue - refer to command below and attached logs
training_spec.txt (4.6 KB)
logs.txt (950.7 KB)

tao detectnet_v2 train -e /tlt_exp/experiments/training_spec.txt -r /datasets/experiments/ -n "peoplenet_model" -k "tlt_encode" --gpus 1

Morganh · May 19, 2023, 5:59am

Since the original images in the COCO dataset have many different resolutions, you need to set enable_auto_resize: true in the spec file. Please set it and train again.
Also, since the COCO images have varied backgrounds and contexts, they may be quite different from the dataset described in the PeopleNet model card (PeopleNet | NVIDIA NGC). Thus, performance may be degraded compared to the model card.

benjamin.hogm · May 19, 2023, 7:17am

@Morganh Hi, we have already resized the images to the same resolution and adjusted the labels prior to loading them in TAO Toolkit.

This was done by setting a fixed target resolution, for example 1280 x 720, scaling the image to the closest size while maintaining the aspect ratio, and then padding the image to reach the target resolution.
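
The resize-and-pad step described above can be sketched as follows. This is only an illustration of the arithmetic, not the poster's actual script: the function names are hypothetical, and centring the padding is an assumption (the thread does not say where the padding was placed). The key point is that every bounding box must go through the same scale and offset as its image.

```python
def letterbox_params(src_w, src_h, dst_w, dst_h):
    """Scale factor and padding offsets that fit src inside dst, keeping aspect ratio."""
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    # Assumption: padding is split evenly, so the image sits in the centre.
    pad_x = (dst_w - new_w) // 2
    pad_y = (dst_h - new_h) // 2
    return scale, pad_x, pad_y


def transform_box(box, scale, pad_x, pad_y):
    """Map an (xmin, ymin, xmax, ymax) box from source to padded-target pixels."""
    xmin, ymin, xmax, ymax = box
    return (
        xmin * scale + pad_x,
        ymin * scale + pad_y,
        xmax * scale + pad_x,
        ymax * scale + pad_y,
    )


# Example: a 640x480 COCO image letterboxed to 1280x720.
scale, pad_x, pad_y = letterbox_params(640, 480, 1280, 720)
new_box = transform_box((100, 100, 200, 300), scale, pad_x, pad_y)
```

Drawing a few transformed boxes on the padded images is a cheap check that the labels still line up after preprocessing.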

Morganh · May 19, 2023, 7:57am

OK, could you share several images and their labels?

benjamin.hogm · May 19, 2023, 9:06am

@Morganh Hi, here are two sample images from the dataset, with their bounding boxes
image_and_labels.zip (373.6 KB)

Morganh · May 19, 2023, 4:15pm

Could you please use the original images/labels and set enable_auto_resize: true to train the two classes again?

benjamin.hogm · May 23, 2023, 1:47am

@Morganh Hi, we tried the proposed solution, but training performance did not change.

We understand that COCO images are different from the dataset used to train PeopleNet and hence degradation of performance is possible.

If there is any other training configuration we can try, please let us know. Otherwise, we will consider this query closed.

Thank you.

Morganh · May 23, 2023, 2:04am

There has been no update from you for a period, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks

For your existing result, is the mAP still near 0 from tao evaluation? Could you run tao inference to double check?

Also, you can train from scratch on your training dataset, that is, without the PeopleNet pretrained model. An mAP of 0 is usually not expected. For the input size, consider using the average resolution of the training dataset, for example 512x512.
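
One way to pick an input size from the average resolution is sketched below. The helper name is hypothetical, and snapping each dimension to a multiple of 16 is an assumption about what the network accepts; check the DetectNet_v2 spec documentation for your TAO version before relying on it.

```python
def suggested_input_size(sizes, multiple=16):
    """Average (width, height) over the dataset, snapped to the nearest multiple."""
    mean_w = sum(w for w, _ in sizes) / len(sizes)
    mean_h = sum(h for _, h in sizes) / len(sizes)

    def snap(value):
        # Round to the nearest multiple, but never below one multiple.
        return max(multiple, int(round(value / multiple)) * multiple)

    return snap(mean_w), snap(mean_h)


# Example with three typical COCO image sizes (width, height).
size = suggested_input_size([(640, 480), (500, 375), (427, 640)])
```

The sizes list can be gathered from the "width" and "height" fields of the "images" entries in the COCO annotation file, so no image decoding is needed.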

system · June 6, 2023, 2:15am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
