Please provide the following information when requesting support.
• Hardware NVIDIA A5000x8
• Network Type (Detectnet_v2)
• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here) format_version: 2.0 / toolkit_version: 3.22.05 / v3.21.11-py3
• Training spec file(If have, please share here) detectnet_v2_train_resnet18_kitti.txt (7.0 KB)
• How to reproduce the issue ? (This is for errors. Please share the command line and the detailed log here.)
I’m trying to train a custom detection model using resnet18 for 4 classes: car, pickup, truck, and tir (another truck type). My issue is that the model does not seem to be working properly: I get multiple detections for the same object.
I tried training with a batch size of 1, but the mAP was only 15-20%.
I used a script to resize the images. They are all 1280x720 and the labels are correct. Dataset size:
car: 26098
truck: 4378
tir: 2842
pickup: 765
truck2: 2662
van: 713
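To make the imbalance in these counts easier to see, here is a minimal Python sketch that turns them into per-class shares and a largest-to-smallest ratio. The counts are copied from the list above; the function names are hypothetical, and the sketch makes no assumption about how the six labels map onto the four trained classes.

```python
# Label counts as listed in the post (six labels in the dataset).
label_counts = {
    "car": 26098,
    "truck": 4378,
    "tir": 2842,
    "pickup": 765,
    "truck2": 2662,
    "van": 713,
}


def class_shares(counts):
    """Return each label's fraction of all labeled objects."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def imbalance_ratio(counts):
    """Return the ratio between the most and least frequent labels."""
    return max(counts.values()) / min(counts.values())


if __name__ == "__main__":
    shares = class_shares(label_counts)
    # Print labels from most to least frequent.
    for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name:8s} {share:6.1%}")
    print(f"max/min ratio: {imbalance_ratio(label_counts):.1f}")
```

A large ratio like this one is only a hint that the rarer classes may be harder to learn; it does not by itself explain the duplicate detections.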
I had some warnings, but I removed all the objects that had issues. Training works fine with a batch size of 4 per GPU, min_learning_rate: 5e-06, and max_learning_rate: 5e-04, but I don’t like the results, even though the mAP is good.
Below are the results for the first training run. After retraining, the results improved, but I still get multiple detections for the same object. For example, when I run !tao detectnet_v2 inference, I get two detections on one truck, one with a car label and one with a truck label. This happens on many of the images I used for inference. I checked all the images and they are correctly labeled. Any idea why, or is there anything else I can try?
Validation cost: 0.000286
Mean average_precision (in %): 74.7897
class name average precision (in %)
car 78.9247
pickup 57.8026
tir 89.1348
truck 73.2968
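One thing that might help with the overlapping car/truck boxes described above is a class-agnostic non-maximum suppression pass over the inference output, so that only the highest-confidence box survives when boxes of different classes overlap heavily. The sketch below is a standalone post-processing idea, not part of the TAO API: the detection tuple layout (x1, y1, x2, y2, score, label), the function names, and the 0.5 IoU threshold are all assumptions for illustration.

```python
from typing import List, Tuple

# Hypothetical detection record: (x1, y1, x2, y2, score, label)
Detection = Tuple[float, float, float, float, float, str]


def iou(a: Detection, b: Detection) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def class_agnostic_nms(dets: List[Detection],
                       iou_threshold: float = 0.5) -> List[Detection]:
    """Keep the highest-scoring box among overlapping boxes, ignoring labels."""
    kept: List[Detection] = []
    # Visit boxes from most to least confident.
    for det in sorted(dets, key=lambda d: d[4], reverse=True):
        # Keep a box only if it does not overlap any already-kept box too much.
        if all(iou(det, k) < iou_threshold for k in kept):
            kept.append(det)
    return kept


if __name__ == "__main__":
    detections = [
        (100, 100, 300, 250, 0.90, "truck"),  # the real object
        (105, 102, 298, 255, 0.60, "car"),    # duplicate box, different label
        (500, 400, 600, 480, 0.80, "car"),    # separate object elsewhere
    ]
    for det in class_agnostic_nms(detections):
        print(det)
```

This only hides the duplicates after the fact. If the lower-confidence label is frequently the wrong one, the confusion between the classes themselves would still need to be looked at in training.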
There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks
Could you try to run inference against some of the training images?