TensorRT Keras Incorrect Output

I’m trying to run inference on my AGX Xavier using TensorRT. However, importing the model from Keras via UFF gives incorrect output.

I have a simple bit of code which does this, available here.

My example is MobileNetV2, and the predictions are incorrect, even though Keras classifies the same images correctly.

The inference code is just adapted from the official example /usr/src/tensorrt/samples/python/introductory_parser_samples/uff_resnet50.py, and I’ve verified that the label lookup table is correct.

The README in the example should have everything you need to understand how to run it.

Note that I’m using JetPack 4.4.1, and this error occurs with both the 1.15 and 2.x versions of NVIDIA TensorFlow.


Could you share the sample for Keras inference as well?

In most cases, the difference comes from the pre-processing of the input image.
Could you double-check that all the preprocessing steps are aligned first?
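
As a rough illustration of why misaligned preprocessing matters, the sketch below compares an un-normalised input with one scaled to [-1, 1], which is the range the Keras MobileNetV2 preprocess_input produces. The helper name scale_tf_mode and the random stand-in image are assumptions made for this example; they are not taken from the code in this thread.

```python
import numpy as np

def scale_tf_mode(pixels):
    """Scale pixel values from [0, 255] to [-1, 1]."""
    return pixels.astype(np.float32) / 127.5 - 1.0

# Stand-in for a decoded 224x224 RGB image (random data, illustration only).
rng = np.random.default_rng(0)
image_hwc = rng.integers(0, 256, size=(224, 224, 3), dtype=np.uint8)

raw = image_hwc.astype(np.float32)   # what the network sees without normalisation
scaled = scale_tf_mode(image_hwc)    # what the network was trained to expect

print("raw range:   ", raw.min(), raw.max())
print("scaled range:", scaled.min(), scaled.max())
```

If one pipeline feeds the raw values and the other feeds the scaled values, the same weights see very different inputs, so differing predictions would not be surprising.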


Hi there, thanks for getting back to me. Keras inference with this model looks like this:

from tensorflow.keras.applications.mobilenet_v2 import preprocess_input, decode_predictions
from tensorflow.keras.preprocessing import image
import os
import numpy as np
from tensorflow.keras.applications.mobilenet_v2 import MobileNetV2 as Net

model = Net(weights='imagenet')

img_path = '/home/user/keras_tensorrt_inference/mobilenetv2/tabby_tiger_cat.jpg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

preds = model.predict(x)
# decode the results into a list of tuples (class, description, probability)
# (one such list for each sample in the batch)
print('Predicted:', decode_predictions(preds, top=3)[0])

I hadn’t actually considered that the preprocessing steps could affect things like this. That fixes it, thanks!

I’ve amended the function you referenced, and the output is now correct!

import numpy as np
import tensorrt as trt
from PIL import Image
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input, decode_predictions

# ModelData (INPUT_SHAPE, DTYPE) is assumed to be defined elsewhere in the
# script, as in the sample this function was adapted from.
def load_normalized_test_case(test_image, pagelocked_buffer):
    # Converts the input image to a normalized, flattened CHW Numpy array
    def normalize_image(image):
        # Resize, antialias and transpose the image to CHW.
        c, h, w = ModelData.INPUT_SHAPE
        x = np.asarray(image.resize((w, h), Image.ANTIALIAS)).transpose([2, 0, 1])
        # Scale pixel values to [-1, 1], matching the Keras MobileNetV2 preprocessing.
        x = preprocess_input(x)

        return x.astype(trt.nptype(ModelData.DTYPE)).ravel()

    # Normalize the image and copy to pagelocked memory.
    np.copyto(pagelocked_buffer, normalize_image(Image.open(test_image)))
    return test_image
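
One simple way to confirm that two pipelines agree is to compare their top-k class indices for the same image. The sketch below is only an illustration: the helper names top_k_indices and same_top_k are hypothetical, and the score vectors are made-up numbers, not real outputs from Keras or TensorRT.

```python
import numpy as np

def top_k_indices(scores, k=3):
    """Return the indices of the k largest scores, highest first."""
    scores = np.asarray(scores).ravel()
    return np.argsort(scores)[::-1][:k]

def same_top_k(scores_a, scores_b, k=3):
    """True if both score vectors rank the same k classes in the same order."""
    return np.array_equal(top_k_indices(scores_a, k), top_k_indices(scores_b, k))

# Made-up score vectors standing in for the two frameworks' outputs.
keras_scores = np.array([0.04, 0.70, 0.20, 0.06])
trt_scores = np.array([0.05, 0.68, 0.21, 0.06])

print("top-3 agree:", same_top_k(keras_scores, trt_scores))
```

Small numerical differences between the two frameworks are expected, so comparing rankings tends to be more forgiving than comparing raw probabilities exactly.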


Thanks for sharing.

We are going to reproduce this in our internal environment first,
and will share more information with you later.


Sorry for the omission in my previous comment.
The root cause is that TensorFlow tends to use the NHWC format, but TensorRT automatically converts the model to NCHW format for performance.

It’s good to know it works now.
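
For reference, the layout difference mentioned above can be sketched in NumPy. This minimal example, which uses a random stand-in image and a hypothetical scale helper, shows an HWC array being transposed to CHW. It also shows that a purely element-wise scaling gives the same result whether it is applied before or after the transpose.

```python
import numpy as np

# Stand-in for a decoded RGB image in HWC layout (random data, illustration only).
rng = np.random.default_rng(0)
image_hwc = rng.integers(0, 256, size=(224, 224, 3), dtype=np.uint8)

# HWC -> CHW: move the channel axis to the front.
image_chw = image_hwc.transpose(2, 0, 1)
print(image_hwc.shape, "->", image_chw.shape)

def scale(x):
    """Element-wise scaling from [0, 255] to [-1, 1]."""
    return x.astype(np.float32) / 127.5 - 1.0

# Element-wise scaling does not depend on the layout.
scale_then_transpose = scale(image_hwc).transpose(2, 0, 1)
transpose_then_scale = scale(image_chw)
same = np.array_equal(scale_then_transpose, transpose_then_scale)
print("same result in either order:", same)
```

Per-channel operations, such as subtracting a different mean for each channel, would need to know which axis holds the channels, so the layout matters more there.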


That’s useful to know, thanks.

Is there a whitepaper or something about the kinds of optimisations that TensorRT performs, and how it applies them?

I haven’t been able to find one. I’m interested in how it works.


Sorry for the late reply.

You can find some information in the blog below:

This blog post was written in the early days of TensorRT, and it covers some of the core ideas behind TensorRT acceleration.