What are the width and height arguments for in imagenet-console? The documentation describes them only as “the width” and “the height,” which is obvious as far as it goes, but it never says what they are the width and height of: the network input or the image.
I have a 28x28 MNIST network analyzing 280x280 pictures (I just drew the digits by hand). I don’t know whether I’m supposed to specify --width=28 --height=28, or --width=280 --height=280, or just leave them off.
I made three scripts (one per width/height combination) and tested each of them. They all produce identical output, so it appears the width and height arguments to imagenet-console.py are simply ignored.
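That would make sense if the input size is baked into the ONNX graph: whatever the CLI is told, preprocessing would still have to resize every image to the network’s fixed 28x28 input. Here is a minimal sketch of that idea (the random array is a hypothetical stand-in for my 280x280 drawing, and the 10x block average is just one simple resize):

```python
import numpy as np

# Hypothetical stand-in for a 280x280 hand-drawn digit (the real input is five.jpg).
img = np.random.rand(280, 280).astype(np.float32)

# If the ONNX graph fixes the input at 28x28, preprocessing must shrink the
# image to that size regardless of any CLI width/height flags.
# A 10x10 block average is one simple way to do the 280 -> 28 resize:
small = img.reshape(28, 10, 28, 10).mean(axis=(1, 3))

print(small.shape)  # (28, 28)
```

If that is what imagenet-console does internally, the flags being ignored for a fixed-shape model would be expected behavior.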
So I’m using this command:

./jetson-inference/build/aarch64/bin/imagenet-console.py --model=./mnist.onnx --labels=./mnist_labels.txt --input_blob=input_0 --output_blob=output_0 ./five.jpg ./five-classified.jpg
It classifies all digits correctly except for 0, 2, and 9, which are all classified as “three”.
It also reports 100% confidence on 0, 1, 2, 3, 4, 5, 6, 7, and 8. For 9, it is only 99.999988% certain that the 9 is “three”, which looks like floating-point rounding away from 100%. How can it be 100% certain of any input? It shouldn’t be reporting 100% certainty for anything.
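One thing I noticed while poking at this: a softmax over float32 values can legitimately print as exactly 100%, because anything within about 6e-8 of 1.0 rounds to 1.0f. And 99.999988% is about one part in 10^7 below 100%, which is right at float32 resolution near 1. A quick sketch (the logit value 20 is just a hypothetical large activation, not anything from my model):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical logits: one class far above the rest.
logits = np.zeros(10, dtype=np.float32)
logits[3] = 20.0

p = softmax(logits).astype(np.float32)

# p[3] is 1 - ~1.9e-8 in exact arithmetic, but float32 cannot
# represent that, so it rounds to exactly 1.0 -> displayed as 100%.
print(p[3] == np.float32(1.0))  # True
```

So the 100% readout could be a display/precision artifact rather than the model literally outputting 1.000, though that still leaves the misclassifications unexplained.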
When I trained the model with PyTorch, it only reported 98.82% accuracy on the test set (see gist above). The model’s output tensor is always a ten-element array of values between 0 and 1, and there are no 1.000 values.
I suspected it might be a softmax problem at the output, but I checked, and the model does include a final softmax layer. So I’m puzzled about what’s going on here.
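To double-check that what the exported graph emits really is a softmax (and not, say, a log_softmax, which some MNIST training scripts use), I compared what the two activations look like on raw logits. This is just a numpy sketch with made-up logits, not output from my model:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def log_softmax(z):
    # Numerically stable: subtract the max before exponentiating.
    return z - z.max() - np.log(np.exp(z - z.max()).sum())

# Hypothetical raw logits standing in for the layer before the activation.
logits = np.array([1.0, -2.0, 0.5, 3.0, -1.0, 0.0, 2.0, -0.5, 1.5, -3.0])

p = softmax(logits)
lp = log_softmax(logits)

# Softmax outputs lie in (0, 1) and sum to 1; log_softmax outputs are all <= 0.
print(abs(p.sum() - 1.0) < 1e-9)     # True
print(bool((lp <= 0).all()))         # True
print(np.allclose(np.exp(lp), p))    # True
```

Since my output tensor is always in (0, 1) and never hits 1.000, it looks like genuine softmax probabilities on the PyTorch side, which makes the 100% readings from imagenet-console even stranger.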