Training Image classification - S3E3 - no dog issue

Hi guys,

I am coding along with Dusty’s tutorial on image classification (pytorch-cat-dog.md in the dusty-nv/jetson-inference GitHub repository), but I get really strange behaviour when testing it.

I trained resnet18 on the dataset as he describes in the video. Testing on the 100-image cat test set gives the expected results, but on the 100-image dog test set every image is classified as a cat.

At first I thought this happened because I had trained the network for only 1 epoch, so I retrained it for another 10 epochs, but the result is still the same: all dogs are classified as cats.

Any idea why, or where to look?

This is, for instance, the last dog in the test set:
[image] loaded ‘data/cat_dog/test/dog/100.jpg’ (500x391, 3 channels)
class 0000 - 0.704574 (cat)
class 0001 - 0.295426 (dog)
imagenet: 70.45742% class #0 (cat)
[image] saved ‘data/cat_dog/test_output_dog/99.jpg’ (500x391, 3 channels)

cheers,

Hi @vektor29, what was the accuracy reported (Acc@1) when you trained your model?

Also, delete the .engine file in your model’s directory so that TensorRT re-generates it. It is probably still running off the old model.
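For anyone who retrains often, clearing the cached engine can be scripted. This is only a minimal sketch: the helper name `remove_stale_engines` and the `models/cat_dog` path are assumptions for illustration, not part of jetson-inference, and it simply deletes every file ending in `.engine` directly inside the directory you give it.

```python
from pathlib import Path


def remove_stale_engines(model_dir):
    """Delete cached *.engine files directly inside model_dir.

    Returns the names of the files that were removed, so you can see
    what TensorRT will have to rebuild on the next run.
    """
    removed = []
    for engine in sorted(Path(model_dir).glob("*.engine")):
        engine.unlink()
        removed.append(engine.name)
    return removed


if __name__ == "__main__":
    # Hypothetical model directory; point this at wherever your
    # retrained ONNX model was exported.
    print(remove_stale_engines("models/cat_dog"))
```

Run it after each retraining and before the next inference run. The first run afterwards will be slower, because the engine has to be rebuilt from the new model.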

To see if it is some other issue on the inferencing side, you can try my model that was trained for 100 epochs: https://nvidia.box.com/s/zlvb4y43djygotpjn6azjhwu0r3j0yxc

Hi @dusty_nv,

The accuracy was ~51% after 1 epoch and ~68% after 10 epochs.

It looks like deleting the .engine file fixed it and the model now runs correctly: I got over 90% on the dog test set and 40-45% on the cat test set. Did you use a much better-trained network when recording the tutorial?
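In case it helps someone reproduce per-class numbers like these, here is a rough sketch of how they could be tallied from saved console output. It assumes the output of a test run was saved as text and that each prediction appears on a line shaped like `imagenet: 70.45742% class #0 (cat)`, as in the log earlier in this thread. The function names are made up for illustration.

```python
import re
from collections import Counter

# Matches summary lines such as: imagenet: 70.45742% class #0 (cat)
PREDICTION = re.compile(r"^imagenet:\s+[\d.]+% class #\d+ \((.+)\)\s*$")


def tally_predictions(log_text):
    """Count how often each class label was predicted in the log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = PREDICTION.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


def accuracy(counts, expected_label):
    """Fraction of predictions equal to expected_label (0.0 if no predictions)."""
    total = sum(counts.values())
    return counts[expected_label] / total if total else 0.0
```

Feeding it the log of the dog test folder and calling `accuracy(counts, "dog")` gives the share of dog images that were classified as dogs.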

Thanks for the tip!

OK cool. I used my model that was trained for 100 epochs and reached ~80% accuracy.