Very low classification confidence (usually about 24%) when using my own images and training setup with jetson-inference

I’m trying to train my Jetson Nano to recognise 5 of my own classes as per this:

I have about 90 images per class in the training set, 15 per class in the validation set, and none in the test set. I train for 100 epochs (I should have checked whether the accuracy converges, but haven't yet). Both the training and validation sets use 3 different backgrounds (basically different coloured T-shirts).
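For reference, the layout I'm following matches the tutorial's classification dataset structure (with "myset" as a stand-in for my dataset's actual name):

    # Dataset layout expected by jetson-inference's classification train.py
    # ("myset" is a placeholder name):
    #
    #   data/myset/labels.txt        # one class name per line, alphabetical
    #   data/myset/train/<class>/    # ~90 images per class
    #   data/myset/val/<class>/      # ~15 images per class
    #   data/myset/test/<class>/     # currently empty
    ls data/myset/train
    # apple  gameboy  honey  suncream  toothpaste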

I am using these as my classes:

  • apple
  • gameboy
  • honey
  • suncream
  • toothpaste

When I export to ONNX and run my model, I am sad to say it is nearly always 24% confident it is an apple, no matter what I present to the camera. It occasionally flickers to honey, and if I present a gameboy it does briefly flicker to gameboy, but only once in a while.
Either way, the confidence is always 24.XY%, whichever class is reported.
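The export and run commands I'm using look roughly like this (model and path names follow the tutorial's defaults, so treat them as placeholders):

    # Export the trained checkpoint to ONNX
    python3 onnx_export.py --model-dir=models/myset

    # Run live classification against the camera
    imagenet.py --model=models/myset/resnet18.onnx \
                --labels=data/myset/labels.txt \
                --input_blob=input_0 --output_blob=output_0 \
                csi://0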

What am I doing wrong? Do I need more training images? More validation images? Should I have used test images?

I’d like to check I am on the right path before I go and take another 5x100 = 500 images to see if that improves the training as this will take me the best part of an hour.

Many thanks.

Hi,

Please collect a larger dataset that covers more environmental variation, and wait for the model to converge before evaluating.

Thanks.


Thanks for the reply.

When you say "wait for the model to converge", do you mean look at the accuracy vs epoch graph?

Hi @jetsonnvidia, basically it means to wait for the training accuracy to level off. In most cases, in my experience, 100 epochs should be fine for a smaller classification dataset like this.

One thing you can try is to add classes gradually to the dataset, so that you can determine which class is throwing it off. You can do this by temporarily moving those class folders out of your dataset and removing them from your labels.txt. For example, you could try removing apple and keeping just a couple of the other classes. If that works better, start adding classes back in. Typically you can test a model after training it for 30 epochs or so.
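A rough sketch of excluding the apple class, assuming the dataset lives under data/myset/:

    # Park the 'apple' folders outside the dataset for now
    mkdir -p ~/excluded_classes
    mv data/myset/train/apple ~/excluded_classes/train_apple
    mv data/myset/val/apple   ~/excluded_classes/val_apple

    # Drop the 'apple' line from labels.txt
    sed -i '/^apple$/d' data/myset/labels.txt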

If you want to avoid having one of the classes detected when in fact nothing is in front of the camera, you can add a “background” class with images of just the background.
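Sketching that out with the same assumed paths (note that labels.txt should stay in alphabetical order, since the training script reads the class folders alphabetically):

    # Create the new class folders and fill them with background-only shots
    mkdir data/myset/train/background data/myset/val/background

    # Register the class and keep labels.txt sorted
    echo background >> data/myset/labels.txt
    sort -o data/myset/labels.txt data/myset/labels.txt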


Are we talking about this information, as displayed after each epoch?

 * Acc@1 72.000

If so, here are those lines from my 35 epochs:

 * Acc@1 20.000 Acc@5 100.000
 * Acc@1 37.333 Acc@5 100.000
 * Acc@1 36.000 Acc@5 100.000
 * Acc@1 42.667 Acc@5 100.000
 * Acc@1 33.333 Acc@5 100.000
 * Acc@1 34.667 Acc@5 100.000
 * Acc@1 46.667 Acc@5 100.000
 * Acc@1 37.333 Acc@5 100.000
 * Acc@1 45.333 Acc@5 100.000
 * Acc@1 46.667 Acc@5 100.000
 * Acc@1 44.000 Acc@5 100.000
 * Acc@1 56.000 Acc@5 100.000
 * Acc@1 40.000 Acc@5 100.000
 * Acc@1 57.333 Acc@5 100.000
 * Acc@1 49.333 Acc@5 100.000
 * Acc@1 62.667 Acc@5 100.000
 * Acc@1 58.667 Acc@5 100.000
 * Acc@1 58.667 Acc@5 100.000
 * Acc@1 61.333 Acc@5 100.000
 * Acc@1 64.000 Acc@5 100.000
 * Acc@1 60.000 Acc@5 100.000
 * Acc@1 53.333 Acc@5 100.000
 * Acc@1 66.667 Acc@5 100.000
 * Acc@1 68.000 Acc@5 100.000
 * Acc@1 61.333 Acc@5 100.000
 * Acc@1 62.667 Acc@5 100.000
 * Acc@1 68.000 Acc@5 100.000
 * Acc@1 60.000 Acc@5 100.000
 * Acc@1 72.000 Acc@5 100.000
 * Acc@1 70.667 Acc@5 100.000
 * Acc@1 68.000 Acc@5 100.000
 * Acc@1 69.333 Acc@5 100.000
 * Acc@1 76.000 Acc@5 100.000
 * Acc@1 74.667 Acc@5 100.000
 * Acc@1 72.000 Acc@5 100.000

72% doesn’t seem too bad, but my results in practice are terrible (24% confidence in all classes, as mentioned). If anything, 24% is suspiciously close to the 20% a 5-class model would report with near-uniform outputs, as if the trained weights weren't really being used at run time.

Yes, that is what I meant, so the training seems ok.

Can you try deleting the .engine file from your model’s directory so that TensorRT will re-create the engine?
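For example (the cached engine's filename encodes the TensorRT version and precision, so a wildcard is simplest; adjust the model directory to match yours):

    # Remove the cached TensorRT engine; it gets rebuilt on the next run
    rm models/myset/*.engine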

Also, can you try running the imagenet program on your validation images, as opposed to starting with the camera, to see if it classifies those ok? You can use this syntax for processing a whole folder of images at once:
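Something along these lines, substituting your own model directory and class folder:

    # Classify every image in a folder and write annotated copies to output/
    mkdir -p output
    imagenet.py --model=models/myset/resnet18.onnx \
                --labels=data/myset/labels.txt \
                --input_blob=input_0 --output_blob=output_0 \
                "data/myset/val/apple/*.jpg" "output/apple_%i.jpg"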

Were you able to train the cat/dog model ok? It would be good to know if that worked for you and if the issue is related to this custom dataset.