I am getting a 0.0 average precision during DetectNet_v2 training.
!tao model detectnet_v2 train -e $SPECS_DIR/detectnet_v2_train_resnet18_kitti-1Class.txt \
-r $USER_EXPERIMENT_DIR/experiment_dir_unpruned \
-n resnet18_detector \
A sample annotation file:
rumex 0.0 0 0 1830 1195 1996 1348 0 0 0 0 0 0 0
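For reference, this follows the standard 15-column KITTI detection label layout (column names added here for clarity; the trailing zeros are the unused 3D fields):

```
class  trunc  occl  alpha  xmin  ymin  xmax  ymax  h w l  x y z  rot_y
rumex  0.0    0     0      1830  1195  1996  1348  0 0 0  0 0 0  0
```

So this ground-truth box spans 1996-1830 = 166 px wide by 1348-1195 = 153 px high.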
• Hardware: T4
• Network Type: Detectnet_v2
• TAO Version: 5.0.0
• Training spec file: attached. Fragments visible in the excerpt:

  validation_fold: 0
  project: "TAO DetectNet 1 Class"
  project: "TAO Toolkit Wandb Demo"
Please try to set enable_auto_resize to true. More info can be found in DetectNet_v2 - NVIDIA Docs.
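As a sketch, this flag sits in the preprocessing block of augmentation_config (field placement per the DetectNet_v2 docs; the width/height values below are illustrative and must match your intended training resolution, divisible by 16):

```
augmentation_config {
  preprocessing {
    output_image_width: 1248   # illustrative; set to your target width
    output_image_height: 384   # illustrative; set to your target height
    output_image_channel: 3
    enable_auto_resize: true
    min_bbox_width: 1.0
    min_bbox_height: 1.0
  }
}
```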
I’ve modified the preprocessing config as follows:
Then I reran the same training cell. The training halts with exit code 1.
I thought this might be because enable_auto_resize was False in the previous checkpoint, so I deleted all the previous checkpoints so that training starts from scratch with the new configuration. Now the command runs, but the average precision is still 0%.
I also have two other questions:
- Why would the enable_auto_resize parameter affect the average precision?
- My images are all of the same size. Why would enable_auto_resize have any effect at all?
The enable_auto_resize parameter is for training with images of multiple resolutions. Since your training images are all the same size, it is not needed.
It seems that the objects are small. Please refer to Frequently Asked Questions - NVIDIA Docs:
In DetectNet_V2, are there any parameters that can help improve AP (average precision) on training small objects?
The following parameters can help you improve AP on smaller objects:
- num_layers of resnet
- class_weight for small objects
- Increase the coverage_radius_y parameter of the bbox_rasterizer_config section for the small-objects class
- minimum_height, to cover more small objects during evaluation
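As a sketch, the rasterizer knobs live in bbox_rasterizer_config in the spec file; the class key and values below are illustrative for a single "rumex" class, not recommendations:

```
bbox_rasterizer_config {
  target_class_config {
    key: "rumex"
    value {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 0.4    # shrink for small objects
      cov_radius_y: 0.4    # shrink for small objects
      bbox_min_radius: 1.0
    }
  }
  deadzone_radius: 0.67
}
```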
Hi @Morganh. Thanks for your input.
I did actually tweak these values. Things improved a bit, but not as much as expected. Is there an official paper about DetectNet_v2 explaining the mathematical/algorithmic meaning of these hyperparameters? Working with them without knowing what they mean is a bit like working in the dark.
You can refer to the user guide DetectNet_v2 - NVIDIA Docs and the source code.
minimum_detection_ground_truth_overlap: Minimum IOU between ground truth and predicted box after clustering to call a valid detection. This parameter is a repeatable dictionary and a separate one must be defined for every class.
minimum_height: Minimum height in pixels for a valid ground truth and prediction bbox.
cov_radius_x (float): x-radius of the coverage ellipse
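In the spec file, the first two of these map to evaluation_config entries (cov_radius_x belongs to bbox_rasterizer_config); a sketch with illustrative values for a single "rumex" class:

```
evaluation_config {
  minimum_detection_ground_truth_overlap {
    key: "rumex"
    value: 0.5
  }
  evaluation_box_config {
    key: "rumex"
    value {
      minimum_height: 4     # lower this to count smaller ground-truth boxes
      maximum_height: 9999
      minimum_width: 4
      maximum_width: 9999
    }
  }
}
```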
Also, please share your latest spec file and training log.
For your original training images, are they of the same resolution? What is the resolution?
specs.txt (3.7 KB)
This is the last specs file.
The original (330) images are of resolution 2048x1376. I do not change this resolution during training.
The mAP with the above run looks like this:
I think my current direction is to further tune these two parameters:
- class_weight: would this be related somehow to class frequency across the whole dataset? Does it have a particular effect if I have only one class plus background? What does it mean in practice to make it bigger or smaller?
- coverage_foreground_weight: this parameter probably matters in my case because my bounding boxes contain weeds, which means the box itself also contains a lot of background. Roughly, the weed leaves cover 50% of the bounding box. Is it wise to use 0.5 instead of 0.05 (the default)?
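Both of these parameters sit in cost_function_config in the spec file; a sketch for a single-class setup (values are illustrative, not recommendations):

```
cost_function_config {
  target_classes {
    name: "rumex"
    class_weight: 1.0               # relative weight of this class in the loss
    coverage_foreground_weight: 0.05  # weight of foreground pixels inside the box
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 10.0
      weight_target: 10.0
    }
  }
  enable_autoweighting: true
  max_objective_weight: 0.9999
  min_objective_weight: 0.0001
}
```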
Please share the training log as well.
Also, is it possible to share several training images and their labels? You can share them with me via private message.
There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.
Received the dataset. Some images are missing the objects. I suggest labeling more data and improving the label quality.
Also, this detection task seems a bit difficult: in some images it is hard even for the human eye to find the rumex object, since the rumex looks very similar to the green background.
I suggest training with YOLOv4 and a deeper backbone. D-DETR and DINO can be considered as well.
This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.