Different prediction results on DIGITS and TX2 (TensorRT)

We trained a new dataset and model by following "Two Days to a Demo."
Then we deployed the model and ran inference on the same images on DIGITS and on a TX2 with JetPack 3.1.
However, we got different prediction results on DIGITS and on the TX2.
For example:
  1. Class A on DIGITS gets the correct prediction with high confidence, around 98%.
  2. Class A on the TX2 gets a class-B prediction with low confidence, around 45%.

TX2: TX2 EVB board with JP3.1 (OS & AP)

The strange thing is that we got the same prediction results when we tested the GoogleNet-ILSVRC12-subset model:
we trained and deployed the GoogleNet-ILSVRC12-subset model on DIGITS and the TX2
and got the same prediction results on both (see attached).

Thank you for any suggestions.


Hi,

Please check this comment:
[url]https://devtalk.nvidia.com/default/topic/993552/jetson-tx1/detection-result-difference-between-jetson-inference2-3-and-digits5-1/post/5097211/#5097211[/url]

Thanks.

Hi,
Thank you for your prompt support.
I got a different classification result after I turned off the CUDA mean subtraction.
However, the result was still not correct.
Also, I did not see any difference when I disabled FP16 or added the mean binary.
Is there any information on how to adjust the CUDA mean value?
How can I match what DIGITS uses?

Thank you,

Hi,

Could you share the detection results, since the image in comment #1 looks incorrect?

I'm not sure if you hit the same issue as topic 993552.
They hit a dual mean-subtraction issue:

  1. jetson-inference performs mean subtraction.
  2. The transform layer in the network performs mean subtraction again.

But from this prototxt, there is no transform layer.
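If it is helpful, a quick pycaffe sketch like the one below (just an illustration; it assumes pycaffe and protobuf are installed, and 'deploy.prototxt' stands in for the file exported by your DIGITS job) can scan the deploy description for any layer that declares its own mean subtraction via transform_param:

from caffe.proto import caffe_pb2
from google.protobuf import text_format

# Look for layers that perform their own mean subtraction (a possible source of
# double subtraction on top of jetson-inference's preprocessing).
net = caffe_pb2.NetParameter()
with open('deploy.prototxt') as f:      # example path; use your own deploy file
    text_format.Merge(f.read(), net)

for layer in net.layer:
    tp = layer.transform_param
    if layer.HasField('transform_param') and (tp.mean_file or len(tp.mean_value) > 0):
        print('layer "%s" subtracts a mean itself' % layer.name)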
I will check this case later.

Hi,

I just tried GoogleNet-ILSVRC12-subset with this command (no modifications):

$ ./imagenet-console bird_0.jpg output_0.jpg \
--prototxt=$NET/deploy.prototxt \
--model=$NET/snapshot_iter_184080.caffemodel \
--labels=$NET/labels.txt \
--input_blob=data \
--output_blob=softmax

The accuracy is the same as on DIGITS, e.g. bike is 99.67%.
Could you check it again?

Hi,

I followed the GoogleNet-ILSVRC12-subset command you posted to test the model we built on the TX2,
and got the result I posted (I will send the screenshot later).
I also tested bird_0.jpg on DIGITS and on the TX2 (TensorRT) and got the same accuracy on both.

I'm confused by the different accuracies on DIGITS and TensorRT.
I also found that other models I built can show different accuracies on DIGITS and TensorRT, even when following the GoogleNet-ILSVRC12-subset command.

Thank you,

Hi,

For classification, there are two main differences.

TensorRT applies some optimizations for acceleration:

  1. Mean
    DIGITS uses mean.binaryproto, but jetson-inference uses a single mean pixel applied across all pixels (see the sketch below):
    https://github.com/dusty-nv/jetson-inference/blob/master/imageNet.cpp#L278
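To see which single mean pixel best matches your DIGITS job, one option (a minimal pycaffe sketch; 'mean.binaryproto' is just an example path, use the one your DIGITS dataset exported) is to average the DIGITS mean image down to one value per channel and use those values as the mean pixel on the TX2:

import caffe
from caffe.proto import caffe_pb2

# Parse the mean image that DIGITS saved with the dataset.
blob = caffe_pb2.BlobProto()
with open('mean.binaryproto', 'rb') as f:
    blob.ParseFromString(f.read())

mean_image = caffe.io.blobproto_to_array(blob)[0]   # shape (channels, H, W), Caffe channel order
mean_pixel = mean_image.mean(axis=(1, 2))           # one mean value per channel
print(mean_pixel)                                   # candidate mean-pixel values for the TX2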

Thanks.

Hi,

Thank you for your support.
We are still working on our new dataset.
When I adjusted the mean_value, I got different inference results on the TX2,
but it's hard to know which mean_value I should use,
because the DIGITS mean.binaryproto is not human-readable.
Is there a way to choose the mean_value so it matches DIGITS's prediction results?
I will send you one of our models later.

Thanks

Hi HuiW, please make sure Subtract Mean Pixel is selected when performing this step:

GitHub - dusty-nv/jetson-inference: Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.

Otherwise it is expected to subtract the mean binary image, where each pixel has a different mean. Currently jetson-inference uses a mean pixel for maximum runtime performance. The default mean pixel value is the one commonly used with GoogleNet in the literature; however, you can set your own in the code here: [url]https://github.com/dusty-nv/jetson-inference/blob/1a1f8697ea04fbc12e078aebccda5d2a2204cd1c/imageNet.cpp[/url]
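If it helps, you can also reproduce the DIGITS-side number on the host with pycaffe and use it as a reference when comparing against the TX2. This is only a rough sketch: the file names follow the imagenet-console example above, and the output blob is assumed to be named softmax.

import caffe
from caffe.proto import caffe_pb2

# File names follow the earlier example; adjust them to your own DIGITS job.
net = caffe.Net('deploy.prototxt', 'snapshot_iter_184080.caffemodel', caffe.TEST)

blob = caffe_pb2.BlobProto()
with open('mean.binaryproto', 'rb') as f:
    blob.ParseFromString(f.read())
mean_pixel = caffe.io.blobproto_to_array(blob)[0].mean(axis=(1, 2))   # per-channel mean

# Reproduce DIGITS-style "Subtract Mean Pixel" preprocessing.
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2, 0, 1))      # HWC -> CHW
transformer.set_raw_scale('data', 255)            # caffe.io.load_image returns [0, 1]
transformer.set_channel_swap('data', (2, 1, 0))   # RGB -> BGR (Caffe convention)
transformer.set_mean('data', mean_pixel)

img = caffe.io.load_image('bird_0.jpg')
net.blobs['data'].data[...] = transformer.preprocess('data', img)
out = net.forward()['softmax'][0]                 # output blob name assumed, as in the command above
print('top-1 class %d, confidence %.2f%%' % (out.argmax(), 100.0 * out.max()))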

Hi Dusty,

Thank you for your support.
Yes, Subtract Mean Pixel was selected when we built the model.
I have also already tried several mean pixel values in the code and got different inference results,
but I have not found the correct one yet.

The other thing is that the mean pixel value can change when the dataset changes.
We are still working on our dataset and struggling with which mean pixel value we should use on the TX2.

That's why I asked, "Is there a way to choose the mean_value so it matches DIGITS's prediction results?"

Thank you,

Hi,
As discussed above, my prediction results are returning lower confidence values. If I run inference on the same .caffemodel with Caffe Python scripts, it returns a confidence of over 90%. I have trained the AlexNet model in DIGITS with the mean subtraction option set to Pixel.

I have replaced cudaPreImageNetMean with cudaPreImageNet so that duplicate mean subtraction doesn't happen.
My doubts are:

  1. Does the cudaNormalizeRGBA API need to be used?
  2. In the jetson-inference repo, is the mean file that we load at inference-model creation time actually being used, or is it just using the hard-coded values make_float3(104.0069879317889f, 116.66876761696767f, 122.6789143406786f)? If not, can we find the values ourselves?
  3. Does the image channel order (RGB/BGR) have any influence on the confidence values?

Hi,

1. Yes.
Besides mean subtraction, it also reorders the data format from interleaved RGB RGB RGB to planar RRR GGG BBB.

2. We currently subtract the identical value for each plane.
You can implement per-pixel mean subtraction here (see the sketch below):
https://github.com/dusty-nv/jetson-inference/blob/master/imageNet.cu#L90

3. Please align with your training image format to get better prediction results.
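For reference, a rough NumPy sketch of what the preprocessing should produce is below (per-pixel mean subtraction versus a single mean pixel, plus the interleaved-to-planar reorder). It is only an illustration that assumes the planes are kept in the same channel order as your training data; it can serve as a host-side reference when modifying the CUDA kernel.

import numpy as np

def preprocess_reference(image_hwc, mean_image_chw=None, mean_pixel=None):
    # image_hwc: float32 array of shape (H, W, C), channel order matching the training data.
    planar = np.ascontiguousarray(image_hwc.astype(np.float32).transpose(2, 0, 1))
    if mean_image_chw is not None:
        planar -= mean_image_chw                    # per-pixel mean (DIGITS mean.binaryproto)
    elif mean_pixel is not None:
        planar -= np.asarray(mean_pixel, dtype=np.float32)[:, None, None]   # single mean pixel
    return planar                                   # planar CHW: RRR... GGG... BBB...

# Example: compare the two styles on random data to see how much the network inputs differ.
img = np.random.randint(0, 256, (224, 224, 3)).astype(np.float32)
mean_img = (np.random.rand(3, 224, 224) * 255).astype(np.float32)
a = preprocess_reference(img, mean_image_chw=mean_img)
b = preprocess_reference(img, mean_pixel=mean_img.mean(axis=(1, 2)))
print('max difference between the two inputs:', np.abs(a - b).max())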

Thanks.