TensorRT inference gives higher scores than the Python model, is this normal?

Hi all,

I finished writing my C++ TensorRT engine, and was doing some tests on it. My steps were:

  1. Downloaded a trained TensorFlow model, froze it to a .pb file, and converted it to .uff format. The model is the VGG16 from this page: https://www.cs.toronto.edu/~frossard/post/vgg16/
  2. In my code, I parsed the model and ran inference on the same picture the website provides. Because OpenCV loads images in BGR order by default, I explicitly converted BGR to RGB, and I made sure the data layout was CHW, as TensorRT expects (see the preprocessing sketch after this list).
  3. I didn’t enable FP16 or INT8, so the engine runs in plain FP32.
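For reference, here is a minimal sketch of the preprocessing described in step 2 (BGR-to-RGB plus HWC-to-CHW repacking). The function name, the 224x224 input size, and the absence of mean subtraction or scaling are my assumptions; they have to match whatever the reference Python script actually does:

```cpp
#include <opencv2/opencv.hpp>
#include <stdexcept>
#include <string>
#include <vector>

// Minimal preprocessing sketch: load an image, convert BGR -> RGB,
// resize to the network input size, and repack HWC -> CHW as float.
// The 224x224 input size and the lack of mean subtraction/scaling are
// assumptions; they must mirror the reference Python code exactly.
std::vector<float> preprocess(const std::string& path, int inputH = 224, int inputW = 224)
{
    cv::Mat bgr = cv::imread(path);                 // OpenCV loads images as BGR by default
    if (bgr.empty())
        throw std::runtime_error("failed to load " + path);

    cv::Mat rgb;
    cv::cvtColor(bgr, rgb, cv::COLOR_BGR2RGB);      // match the RGB order the model was trained on
    cv::resize(rgb, rgb, cv::Size(inputW, inputH));

    // Repack from interleaved HWC (OpenCV) to planar CHW (TensorRT).
    std::vector<float> chw(3 * inputH * inputW);
    for (int c = 0; c < 3; ++c)
        for (int y = 0; y < inputH; ++y)
            for (int x = 0; x < inputW; ++x)
                chw[c * inputH * inputW + y * inputW + x] =
                    static_cast<float>(rgb.at<cv::Vec3b>(y, x)[c]);
    return chw;
}
```

If the reference Python script applies any extra normalization (mean subtraction, scaling to [0, 1], a different resize or crop), the same step would need to be applied here as well, otherwise the probabilities will not line up.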

So here’s the weird thing: the Python code from the model provider (which I treat as the reference) reports a probability of 0.693 that the image is a weasel, while my C++ TensorRT code reports 0.762. I expected the two to be nearly identical, and this gap seems far too large.

Is it due to BGR2RGB? I disabled the bgr2rgb step in my TensorRT code and the probability dropped to 0.3811, so the conversion is clearly needed.

I also tried another image pulled from the Internet: the Python model gives 0.9982, while TensorRT gives 0.9999.

Overall, TensorRT gives me very good results, but should I be worried about this gap between the values?

Hi jq2250,

For FP32, the results should match the reference model on a per-image basis, assuming the model was converted correctly. Would you mind sharing a small package of code that reproduces the issue so we can investigate further?
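In the meantime, one way to narrow this down is to compare the full output vectors rather than only the top-1 probability. A rough sketch of such a check, assuming the Python reference probabilities have been dumped to a plain text file (one value per line) and that `trtOutput` holds the probabilities copied back from the TensorRT engine (both names are just illustrative):

```cpp
#include <algorithm>
#include <cmath>
#include <fstream>
#include <string>
#include <vector>

// Compare the TensorRT output against reference values dumped from the
// Python model (one float per line). Returns the largest absolute
// difference across all classes.
double maxAbsDiff(const std::vector<float>& trtOutput, const std::string& refPath)
{
    std::ifstream ref(refPath);
    double worst = 0.0;
    for (size_t i = 0; i < trtOutput.size(); ++i) {
        float expected = 0.f;
        if (!(ref >> expected)) break;              // stop if the file is shorter than expected
        worst = std::max(worst, std::fabs(static_cast<double>(trtOutput[i]) - expected));
    }
    return worst;
}
```

For an FP32 engine the maximum difference should be very small; a consistently large difference across all classes usually points to a preprocessing or conversion mismatch rather than numerical precision.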

Thanks,
NVIDIA Enterprise Support