TensorRT Engine gives incorrect inference output for segmentation model

Hello.
I am trying to convert a TF-Keras segmentation model to a TensorRT engine and perform inference on it. The TensorRT engine gives incorrect/noisy output. I have tried porting a classification model and it works. The problem seems to be with TensorFlow models only.

These are the steps I followed:

  1. Convert the model to an ONNX model using tf2onnx with the following command:

    python -m tf2onnx.convert --saved-model "Path_To_TF_Model" --output "Path_To_Output_Model\Model.onnx" --verbose

I performed inference on this ONNX model using onnxruntime in Python. It gives the correct output.

  2. I converted this ONNX model to a TensorRT engine using the following command:

    trtexec.exe --onnx="Path_To_ONNX_Model\model.onnx" --saveEngine="Path_To_TRT_Engine\model.engine" --verbose --explicitBatch

  3. I performed inference on this TensorRT engine in C++ (see the sketch below). The inference outputs are all wrong and noisy.
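
For reference, this is roughly what step 3 looks like; a minimal sketch assuming the TensorRT 7.x C++ API, a 1x3x512x512 float input, a two-class 512x512 float output, and binding indices 0 (input) and 1 (output). It is not the attached code.

    // Minimal sketch of step 3 (not the attached code). Assumptions: TensorRT 7.x
    // C++ API, a float input of 1x3x512x512, a two-class 512x512 float output,
    // and binding index 0 = input, 1 = output.
    #include <NvInfer.h>
    #include <cuda_runtime_api.h>
    #include <fstream>
    #include <iostream>
    #include <iterator>
    #include <vector>

    class Logger : public nvinfer1::ILogger
    {
        void log(Severity severity, const char* msg) noexcept override
        {
            // Print only warnings and errors
            if (severity <= Severity::kWARNING)
                std::cout << msg << std::endl;
        }
    } gLogger;

    int main()
    {
        // Read the serialized engine produced by trtexec
        std::ifstream file("model.engine", std::ios::binary);
        std::vector<char> blob((std::istreambuf_iterator<char>(file)),
                                std::istreambuf_iterator<char>());

        nvinfer1::IRuntime* runtime = nvinfer1::createInferRuntime(gLogger);
        nvinfer1::ICudaEngine* engine =
            runtime->deserializeCudaEngine(blob.data(), blob.size(), nullptr);
        nvinfer1::IExecutionContext* context = engine->createExecutionContext();

        // Host buffers: fill `input` with the preprocessed image before running
        std::vector<float> input(3 * 512 * 512);
        std::vector<float> output(2 * 512 * 512);

        // Device buffers, one per binding
        void* bindings[2];
        cudaMalloc(&bindings[0], input.size() * sizeof(float));
        cudaMalloc(&bindings[1], output.size() * sizeof(float));

        cudaStream_t stream;
        cudaStreamCreate(&stream);
        cudaMemcpyAsync(bindings[0], input.data(), input.size() * sizeof(float),
                        cudaMemcpyHostToDevice, stream);
        context->enqueueV2(bindings, stream, nullptr);   // explicit-batch execution
        cudaMemcpyAsync(output.data(), bindings[1], output.size() * sizeof(float),
                        cudaMemcpyDeviceToHost, stream);
        cudaStreamSynchronize(stream);

        // ... postprocess `output` into a segmentation mask ...
        // (cleanup of CUDA buffers and TensorRT objects omitted for brevity)
        return 0;
    }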

There are no errors or warnings in the verbose output while generating the engine. In fact, the ONNX model inference gives the correct output, so I am confused now.

I can share my code, verbose outputs, and the TF and ONNX models as well.

Platform: Windows 10
CUDA : 11.1
TensorRT version : 7.2.2
CuDNN version : 8.0.4

Please let me know if you need any other input from my side.


Hi, we request you to share your model and script so that we can help you better.

Alternatively, you can try running your model with the trtexec command.
https://github.com/NVIDIA/TensorRT/tree/master/samples/opensource/trtexec

Thanks!

@NVES
Hello,
Sharing the paths to all the files I am using:
TF_Keras Model - https://drive.google.com/file/d/1SEMiNhG6XibA5dRAv3UvBfWyY-UoU0JA/view?usp=sharing

ONNX Model - https://drive.google.com/file/d/1DfNDFsyVBBbNrEsvkEGaiCYxGYzo-bCM/view?usp=sharing

Input Image - Input.png - Google Drive

Expected output - ExpectedOutput.png - Google Drive

Python code to run onnxruntime:

    import cv2
    import numpy as np
    import onnx
    import onnxruntime

    # Load the tile and scale it to [0, 1]
    tile = cv2.imread(tileName)
    tile = tile.astype("float32") / 255
    # Add the batch dimension expected by the model input: (1, H, W, C)
    data = np.expand_dims(tile, axis=0)

    onnx_model = onnx.load("PathToONNX_Model/tf2onnx.onnx")
    content = onnx_model.SerializeToString()

    session = onnxruntime.InferenceSession(content)
    io_binding = session.io_binding()
    io_binding.bind_cpu_input('zero_padding2d_20_input:0', data)
    io_binding.bind_output('Identity:0')
    session.run_with_iobinding(io_binding)

    # Output shape is (1, imheight*imwidth, classes); take the per-pixel argmax
    prob = io_binding.copy_outputs_to_cpu()[0]
    prediction = np.uint8(np.argmax(prob[0], axis=1))
    # Scale class indices to grayscale values and reshape to the image size
    prediction = np.uint8((255 * prediction.astype('float32')) / (classes - 1))
    prediction = np.reshape(prediction, (imheight, imwidth))

    cv2.imwrite(dataPaths + tileNum + '_onnxOp11Out' + labelExt, prediction)
    print('Done : ' + dataPaths + resultDir + tileNum + labelExt)

My CPP Code for Inference:
TRT_Inference.cpp (6.1 KB)

Hi @sandeep.ganage,

Could you please try changing the following code block in TRT_Inference.cpp:

auto input_width = dims.d[2];
auto input_height = dims.d[1];
auto channels = dims.d[0];
auto input_size = cv::Size(input_width, input_height);

to

auto input_width = dims.d[3];
auto input_height = dims.d[2];
auto channels = dims.d[1];
auto input_size = cv::Size(input_width, input_height);
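
For reference, here is a small sketch (not part of the attached code, assuming a deserialized TensorRT 7.x engine) that prints every binding's name and dimensions, which makes it easy to confirm how the engine lays out the input and output:

    // Hypothetical helper: dump all binding names and dimensions of an engine
    #include <NvInfer.h>
    #include <iostream>

    void printBindingDims(const nvinfer1::ICudaEngine& engine)
    {
        for (int i = 0; i < engine.getNbBindings(); ++i)
        {
            const nvinfer1::Dims dims = engine.getBindingDimensions(i);
            std::cout << engine.getBindingName(i)
                      << (engine.bindingIsInput(i) ? " (input):" : " (output):");
            for (int d = 0; d < dims.nbDims; ++d)
                std::cout << ' ' << dims.d[d];
            std::cout << std::endl;
        }
    }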

Thank you.

@spolisetty
Actually, for the Caffe model I used this same index sequence and it worked. But yes, that was a mistake in this case.

I updated the index sequence but the inference output is still incorrect.

Hello @spolisetty
I made those changes. I also found that there was an additional softmax operation on top of the model's own softmax. I removed that code and reran it. It is still giving me incorrect inference results.

Please find my updated code:
NVIDIA_Dev_Forum.cpp (6.9 KB)

Hi,

The issue is in the user's code.

Applying this diff, we can get the correct result:

121c121
<               if (cpu_output[ii*2] >= cpu_output[ii*2+1]) {
---
>               if (cpu_output[ii] >= cpu_output[ii+(512*512)]) {

The output is 1x262144x2 rather than 1x2x262144.
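
To make the layout question concrete, here is a small sketch (my own illustration, not the attached code; buildMask and channelLast are hypothetical names) of the per-pixel class decision for a 512x512, two-class output copied to a host buffer:

    // Sketch only (not the attached code): build a binary mask from the raw
    // class scores; `channelLast` would be determined from the output binding
    // dimensions (true for 1x262144x2, false for 1x2x262144).
    #include <cstdint>
    #include <vector>

    std::vector<uint8_t> buildMask(const float* cpu_output, bool channelLast)
    {
        const int numPixels = 512 * 512;
        std::vector<uint8_t> mask(numPixels);
        for (int ii = 0; ii < numPixels; ++ii)
        {
            float score0, score1;
            if (channelLast)                     // 1 x 262144 x 2: scores are adjacent
            {
                score0 = cpu_output[ii * 2];
                score1 = cpu_output[ii * 2 + 1];
            }
            else                                 // 1 x 2 x 262144: scores are one plane apart
            {
                score0 = cpu_output[ii];
                score1 = cpu_output[ii + numPixels];
            }
            mask[ii] = (score1 > score0) ? 255 : 0;   // class 1 -> white, class 0 -> black
        }
        return mask;
    }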

Thank you.