Hi,
I’ve been trying to train a custom object detection model using DetectNet_v2 in TAO Toolkit 5.5.0, but after a full 80-epoch training run, evaluation always reports 0.0% mAP. I’ve double-checked everything (dataset, preprocessing, config files, input sizes, etc.) and even tried exporting the trained model and running inference with it, but the outputs are still invalid. I have tried many different configurations, and none of them made a difference.
I’m reaching out for support because I believe I’ve followed the documentation correctly and tried all the common fixes without success. Below is my system setup and attached are the full config files, logs, dataset examples, and TFRecords for full reproducibility.
System Setup
- OS: Ubuntu 22.04.5 LTS
- GPU: NVIDIA GeForce RTX 4070 SUPER (12GB)
- CUDA: 12.9
- Driver: 576.28
- TAO Toolkit version: 5.5.0
- Running environment: WSL2 with virtualenv and Docker
- TAO invoked with: tao CLI inside the TAO virtualenv
Problem Summary
- Dataset based on KITTI format, with 3 classes: car, motorcycle, and van.
- All images resized to 480x272 (multiples of 16) and RGB 3-channel format.
- TFRecords successfully created and verified.
- Spec files follow official documentation, using resnet backbone.
- Training finishes normally with no errors.
- But evaluation returns 0.0 mAP for all classes, even after 80 epochs.
- Exported ONNX model produces empty detections on inference.
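One thing worth ruling out with a setup like this is a mismatch between the resized images and the label files, for example boxes still expressed in the original image coordinates or class names that differ from the three listed above. The snippet below is a minimal sketch of such a check, not the exact procedure used for this dataset. It assumes standard 15-field KITTI label lines with pixel-coordinate boxes; the function names check_kitti_label_line and check_label_dir are hypothetical, and the label directory is whatever path holds the .txt label files.

```python
from pathlib import Path

# Values taken from the problem summary above.
EXPECTED_CLASSES = {"car", "motorcycle", "van"}
IMAGE_WIDTH, IMAGE_HEIGHT = 480, 272


def check_kitti_label_line(line, width=IMAGE_WIDTH, height=IMAGE_HEIGHT):
    """Return a list of problems found in one KITTI label line (empty if none)."""
    fields = line.split()
    # Assumption: standard KITTI layout, i.e. class name first and the
    # box as left, top, right, bottom in fields 4..7 (pixel coordinates).
    if len(fields) < 15:
        return [f"expected at least 15 fields, got {len(fields)}"]
    problems = []
    class_name = fields[0]
    if class_name not in EXPECTED_CLASSES:
        problems.append(f"unexpected class name: {class_name!r}")
    try:
        x1, y1, x2, y2 = (float(v) for v in fields[4:8])
    except ValueError:
        problems.append("bounding box fields are not numeric")
        return problems
    if x2 <= x1 or y2 <= y1:
        problems.append("box has non-positive width or height")
    if x1 < 0 or y1 < 0 or x2 > width or y2 > height:
        problems.append("box extends outside the resized image")
    return problems


def check_label_dir(label_dir):
    """Map each label file name to its problems; files without problems are omitted."""
    report = {}
    for label_file in sorted(Path(label_dir).glob("*.txt")):
        file_problems = []
        for number, line in enumerate(label_file.read_text().splitlines(), 1):
            if not line.strip():
                continue  # skip blank lines
            for problem in check_kitti_label_line(line):
                file_problems.append(f"line {number}: {problem}")
        if file_problems:
            report[label_file.name] = file_problems
    return report
```

Calling check_label_dir on the label directory returns a dictionary of the files that have at least one suspicious line. An empty result would suggest that the labels at least agree with the 480x272 image size and the three class names; it would not prove the spec file is correct.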
Help Requested
Could anyone from the community or NVIDIA team help me understand why I’m getting 0.0% mAP even though the training runs correctly and the dataset seems to be in order?
Am I missing something subtle in the specs or preprocessing? I’m happy to provide any additional detail or test any suggestions.
I can provide the dataset if you want.
Thanks in advance!
kitticonf.txt (532 Bytes)
modelconf.txt (5.6 KB)
convert.log (9.2 KB)
train.log (560.7 KB)
As you can see, only the car class reaches 17%. I don’t know what else to try. Would more epochs help?
Commands Used
These are the exact commands I used to prepare the dataset and start training:
Convert KITTI to TFRecords
tao model detectnet_v2 dataset_convert \
-d /workspace/tao-experiments/vehdet_3/kitticonf.txt \
-o /workspace/tao-experiments/vehdet_3/tfrecords/tfrecords \
--gpus 1 \
--num_processes 1 \
--log_file /workspace/tao-experiments/vehdet_3/convert.log \
--results_dir /workspace/tao-experiments/vehdet_3/results \
-v
Train the model
tao model detectnet_v2 train \
--gpus 1 \
--num_processes 1 \
-e /workspace/tao-experiments/vehdet_3/modelconf.txt \
-r /workspace/tao-experiments/vehdet_3/results_train8 \
-n vehdet3 \
-v