I’ve been doing the Hello AI World tutorial and am on the "Re-training on the Cat/Dog Dataset" lesson: https://github.com/dusty-nv/jetson-inference/blob/master/docs/pytorch-cat-dog.md
Everything seems to work without error. But when I run the video feed, everything (walls, person, etc) is classified as 50% dog. When I point the video camera at an actual dog, the percentage jumps up to 65% or so.
That level of accuracy seems disappointing. Are there any pointers for improving this? Have others seen these kinds of results, or are they an indicator of incorrect training? Perhaps a video feed gives less accurate results than a single still image?
Any advice is appreciated.
Please note that this is a classification model rather than a detection model.
The pipeline feeds the whole image without ROI cropping into the DNN model directly.
Accuracy may drop if the object in the testing image is relatively small compared to those in the training dataset.
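This also explains the constant "50% dog" readings: with only two classes, the softmax output layer must split 100% between cat and dog, so the winning class always scores at least 50%, even on an image of a wall. A minimal sketch (plain Python, not the jetson-inference API) illustrating this:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw network outputs."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Ambiguous logits (the network "sees" neither class strongly):
# with 2 classes the probabilities still sum to 1, so the top
# class can never fall below 50%.
probs = softmax([0.1, 0.1])
print(probs)  # [0.5, 0.5]

# A weak preference for "dog" already pushes it to ~55-65%,
# matching the behavior described in the question.
print(max(softmax([0.0, 0.3])))
```

So a 50% reading on a two-class model is effectively the network saying "I don't know", not a confident detection.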
Hi @jim_nvidiadev, in retrospect what I may have needed to do when creating the cat/dog dataset is to make a third ‘background’ class of images that have neither cat nor dogs present. As it stands today, the model is pretty good at classifying cat vs. dog, but as you have found if you present it an image with neither it is forced to still say cat or dog since there isn’t that background class.
Alas, the cat/dog example was just meant to be a very simple exposure to training your first model. If you check out this video where I collect/train another classification model on hand tools, I add a ‘background’ class there and it works well:
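For anyone wanting to try this with the cat/dog dataset: the retraining script consumes a standard ImageFolder-style layout (one subdirectory per class under each split), so adding a 'background' class is just a third subdirectory filled with images containing neither animal. A hypothetical layout (directory names are illustrative, not from the tutorial):

```shell
# Add a 'background' class alongside cat and dog in each split.
# The class name is taken from the directory name.
mkdir -p cat_dog/train/{cat,dog,background}
mkdir -p cat_dog/val/{cat,dog,background}
mkdir -p cat_dog/test/{cat,dog,background}

ls cat_dog/train
```

After populating the background folders and retraining, the model can report "background" instead of being forced to pick cat or dog.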
Very cool. I’ll check out the video and also press ahead with the “Re-training SSD-Mobilenet” page. Thanks so much!