I trained the model for about 200 epochs. During training, the model was validated on a test split. After training finished, I picked several checkpoints with the best validation metrics and ran inference on the same test split, just to see visually how the model works. Some of the predictions confused me, so I computed my own metrics from the inference output using the TIDE library. It turned out that these metrics are completely different from the ones reported during validation.
The evaluation command I ran with TAO:
!tao detectnet_v2 evaluate -e $SPECS_DIR/trafficcamnet_finetune.txt \
                           -m $USER_EXPERIMENT_DIR/1f_compile_4/model.step-207788.tlt \
                           -k $KEY
I am also attaching the spec files for inference and training (the training spec also contains the evaluation settings).
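For context, here is a minimal sketch of how inference output can be fed to TIDE. It is an illustration under assumptions, not the exact script used: it assumes the inference step writes one KITTI-format label file per image with the confidence in the 16th field, and that the ground truth is already available as a COCO-style JSON file. The names `CATEGORY_IDS`, `kitti_line_to_coco`, `kitti_dir_to_coco`, and `run_tide` are hypothetical, and the class mapping is a placeholder.

```python
import json
from pathlib import Path

# Hypothetical class-name -> COCO category id mapping; adjust to the dataset.
CATEGORY_IDS = {"car": 1, "person": 2}


def kitti_line_to_coco(line, image_id, category_ids=CATEGORY_IDS):
    """Convert one KITTI-format detection line to a COCO-style result dict.

    Returns None for classes that are not listed in category_ids.
    """
    fields = line.split()
    name = fields[0].lower()
    if name not in category_ids:
        return None
    # KITTI stores the box as x1 y1 x2 y2 in fields 4..7.
    x1, y1, x2, y2 = (float(v) for v in fields[4:8])
    # Assumption: a 16th field, when present, holds the confidence score.
    score = float(fields[15]) if len(fields) > 15 else 1.0
    return {
        "image_id": image_id,
        "category_id": category_ids[name],
        "bbox": [x1, y1, x2 - x1, y2 - y1],  # COCO uses x, y, width, height
        "score": score,
    }


def kitti_dir_to_coco(label_dir, image_ids):
    """Collect detections from every *.txt file in label_dir.

    image_ids maps a file stem (image name without extension) to the
    integer image id used in the ground-truth COCO JSON.
    """
    results = []
    for path in sorted(Path(label_dir).glob("*.txt")):
        image_id = image_ids[path.stem]
        for line in path.read_text().splitlines():
            if not line.strip():
                continue
            det = kitti_line_to_coco(line, image_id)
            if det is not None:
                results.append(det)
    return results


def run_tide(gt_json, det_json):
    """Run TIDE in box mode on COCO-style files (requires `pip install tidecv`)."""
    from tidecv import TIDE, datasets

    tide = TIDE()
    tide.evaluate(datasets.COCO(gt_json), datasets.COCOResult(det_json), mode=TIDE.BOX)
    tide.summarize()
```

To use it, the list returned by `kitti_dir_to_coco` would be written out with `json.dump` and the resulting file passed to `run_tide` together with the ground-truth JSON.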