Softmax layer in TensorRT 7.0 produces wrong inference results

Hi, I found that something is wrong with the softmax layer in TRT 7.0.

My environment is listed below:
o Linux distro and version: Ubuntu 16.04 LTS
o GPU type: GeForce GTX 1660
o NVIDIA driver version: 440.36
o CUDA version: 10.2
o cuDNN version: libcudnn7-dev_7.6.5.32-1
o Python version [if using python]: Python 2.7
o TensorFlow version: 1.14
o TensorRT version: TensorRT 7.0

My TensorFlow model model.pb is generated as below:

from tensorflow.python.framework import graph_util
import tensorflow as tf

image = tf.placeholder(tf.float32, [1, 224, 224, 3], name="input_image")
logits = tf.layers.conv2d(image, 3, kernel_size=1, padding='SAME')
pred = tf.nn.softmax(logits)
pred = tf.identity(pred, name="pred")

init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

# We use a built-in TF helper to export variables to constants
output_graph_def = graph_util.convert_variables_to_constants(
    sess, sess.graph_def, ["pred"])

# Finally we serialize and dump the output graph to the filesystem
with tf.gfile.GFile("./model.pb", "wb") as f:
    f.write(output_graph_def.SerializeToString())

Then I employed convert-to-uff to transform model.pb into model.uff.

Next, I used the C++ API and the TRT 7.0 libraries to generate model.plan and run inference, but I get the wrong result. When I instead use the TRT 6.0 libraries to generate the plan and run inference, the result is correct.
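One way to sanity-check an engine's output independently of TensorRT is to compare it against a NumPy softmax reference over the channel axis (the last axis in this NHWC model). This is only a sketch: the logits here are random stand-ins, not the actual conv output, and `trt_output` is a placeholder for whatever buffer your engine produces.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the per-slice max before exponentiating for numerical stability
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Stand-in for the conv layer's output logits (NHWC, as in the TF model above)
logits = np.random.randn(1, 224, 224, 3).astype(np.float32)
ref = softmax(logits, axis=-1)  # softmax over the 3 channels, per pixel

# Each pixel's channel probabilities should sum to 1
assert np.allclose(ref.sum(axis=-1), 1.0, atol=1e-5)

# To validate an engine, load its output (same shape) and check e.g.
# np.allclose(trt_output, ref, atol=1e-4)
```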

Could somebody tell me what’s going wrong?



Can you share a small script (preferably python, but C++ is fine) to do inference on your sample TRT engine, and compare the outputs of your TRT6 and TRT7 engines?

I’m seeing the same outputs from TF, TRT6, and TRT7 given the same input with an internal tool, for both TF -> UFF -> TRT and TF -> ONNX -> TRT.
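For the side-by-side comparison, something along these lines would do; the `.npy` file names are placeholders for wherever you dump each engine's output, so zero arrays stand in for them here.

```python
import numpy as np

def compare_outputs(a, b, atol=1e-4):
    """Return the max absolute difference and whether two outputs agree."""
    a = np.asarray(a, dtype=np.float32)
    b = np.asarray(b, dtype=np.float32)
    max_diff = float(np.abs(a - b).max())
    return max_diff, bool(np.allclose(a, b, atol=atol))

# e.g. out6 = np.load("trt6_output.npy"); out7 = np.load("trt7_output.npy")
out6 = np.zeros((1, 224, 224, 3), dtype=np.float32)  # placeholder
out7 = np.zeros((1, 224, 224, 3), dtype=np.float32)  # placeholder
diff, ok = compare_outputs(out6, out7)
print("max abs diff: %g, match: %s" % (diff, ok))
```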

As a side note, the UFF parser will be deprecated in the future per the TRT7 release notes, so the ONNX parser gets more support / new features / etc. at the moment. You could convert your sample model above to ONNX with tf2onnx like so:

python3 -m pip install tf2onnx
python3 -m tf2onnx.convert --input model.pb --inputs "input_image:0" --outputs "pred:0" --output "model.onnx"

Hi, a C++ test case that runs inference on my sample TRT engine is in the attachment. We can’t get correct results from the TRT 7.0 engine.

Attachment trt_engine.tar - too big

Hi, could you reproduce the results and figure out where the problem is?

I am looking forward to your early reply.



Sorry for the delay. I was able to repro your issue in a couple different ways, so I just passed this to the engineering team to take a look at. Stay tuned.

Hi @maoxuehui1125,

It seems the results from the UFF parser on this simple model are actually incorrect for both TRT 6.0 and 7.0; it may have been a fluke that they appeared correct in TRT 6.0 in your sample script. However, since the UFF parser has been deprecated, there is currently no plan to fix this.

I converted this model to ONNX and confirmed the outputs are consistent across TensorFlow, ONNX, and TensorRT 7.0.

Please migrate your workflow to the ONNX parser; you can use tf2onnx to convert your TF models to ONNX.

Here’s the ONNX model for this simple example (559 Bytes)

Will this bug be fixed? Because it seems that the ONNX parser doesn’t support custom plugins at the moment.