Inference of a model with TensorFlow/ONNX Runtime and TensorRT gives different results

Hi. I have a simple model that I trained using TensorFlow. After that I converted it to ONNX and tried to run inference on my Jetson TX2 with JetPack 4.4.0 using TensorRT, but the results are different.

This is how I run inference with ONNX Runtime (the model has input [-1, 128, 64, 3] and output [-1, 128]):

import onnxruntime as rt
import cv2 as cv
import numpy as np


sess = rt.InferenceSession("model_tf_float_opset10.onnx")
input_name = sess.get_inputs()[0].name
output_name = sess.get_outputs()[0].name

im = cv.imread('cut10735.png')
im = cv.resize(im, (64, 128))

# Add a batch dimension: shape becomes (1, 128, 64, 3), float32
val_x = np.expand_dims(np.asarray(im).astype(np.float32), axis=0)

print(output_name, input_name)
try:
    pred = sess.run([output_name], {input_name: val_x})[0]
    print(pred)
except Exception as e:
    print("Unexpected type")
    print("{0}: {1}".format(type(e), e))

This is how I run inference with TensorRT:

IBuilder* builder = nvinfer1::createInferBuilder(gLogger);
const auto explicitBatch = 1U << static_cast<uint32_t>(NetworkDefinitionCreationFlag::kEXPLICIT_BATCH);
INetworkDefinition* network = builder->createNetworkV2(explicitBatch);
  
nvonnxparser::IParser* parser =
        nvonnxparser::createParser(*network, gLogger);
parser->parseFromFile("model_tf_float_opset10.onnx", (int)ILogger::Severity::kWARNING);
auto config = builder->createBuilderConfig();
config->setMaxWorkspaceSize(1 << 20);

auto profile = builder->createOptimizationProfile();
profile->setDimensions("images:0", OptProfileSelector::kMIN, Dims4{1, 128, 64, 3});
profile->setDimensions("images:0", OptProfileSelector::kOPT, Dims4{8, 128, 64, 3});
profile->setDimensions("images:0", OptProfileSelector::kMAX, Dims4{16, 128, 64, 3});

config->addOptimizationProfile(profile);

ICudaEngine* engine = builder->buildEngineWithConfig(*network, *config);
int inputIndex = engine->getBindingIndex("images:0");
int outputIndex = engine->getBindingIndex("features:0");


cv::Mat img = cv::imread("cut10735.png");
cv::resize(img, img, cv::Size(64, 128));
img.convertTo(img, CV_32FC3);
cv::cuda::GpuMat output(128, 1, CV_32F);
cv::cuda::GpuMat gpu_dst(128, 64, CV_32FC3);
gpu_dst.upload(img);
Mat result(128, 1, CV_32F);

void* buffers[2];
buffers[inputIndex] = gpu_dst.data;
buffers[outputIndex] = output.data;

IExecutionContext *context = engine->createExecutionContext();
context->setBindingDimensions(inputIndex, Dims4{1, 128, 64, 3});
context->executeV2(buffers);
  
output.download(result);
std::cout << result << std::endl;

What could be wrong here?
Here you can find my model and test image:
https://drive.google.com/drive/folders/1xEYcoQwOew-a74c6jpxDJtKu49NNNTgW?usp=sharing

Hi,

May I know which JetPack 4.4 version you are using? The DP or GA release?

There are some TensorRT issues fixed in our TensorRT GA release.
If you are using the DP version, it's recommended to give the latest JetPack a try first.

Thanks.

Hi. Thanks for your answer. I use the latest (GA) release of JetPack.

Thanks for your feedback.

We are going to reproduce this in our environment and will share more information once we have made some progress.

Thanks.

Hi,

Sorry for the late update.

It looks like you are feeding NHWC-formatted input into TensorRT.
However, TensorRT uses the NCHW format unless a specific data format is specified.

Would you mind converting the data into NCHW format and trying it again?
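
For example, a rough, untested sketch of the repacking using OpenCV's cv::dnn::blobFromImage (which produces a contiguous float NCHW blob) and cudaMemcpy might look like the following; note that the binding and optimization-profile dimensions would also have to match the layout the engine actually expects:

#include <opencv2/opencv.hpp>
#include <opencv2/dnn.hpp>
#include <cuda_runtime_api.h>

cv::Mat img = cv::imread("cut10735.png");
cv::resize(img, img, cv::Size(64, 128));

// HWC (BGR, uint8) -> contiguous float NCHW blob of shape 1x3x128x64
cv::Mat blob = cv::dnn::blobFromImage(img, 1.0, cv::Size(64, 128),
                                      cv::Scalar(), /*swapRB=*/false, /*crop=*/false);

// Copy the host blob to a contiguous device buffer and bind it as the input
void* d_input = nullptr;
cudaMalloc(&d_input, blob.total() * sizeof(float));
cudaMemcpy(d_input, blob.ptr<float>(), blob.total() * sizeof(float),
           cudaMemcpyHostToDevice);
buffers[inputIndex] = d_input;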

Thanks.