Unexpected output with onnx-TensorRT

I’m having trouble converting a TensorFlow model (.h5 format) into a TensorRT engine.

In particular, we first convert the .h5 model to ONNX format with the following Python script:

    import tensorflow as tf
    import tf2onnx.convert
    import onnx

    # Load the Keras model and convert it to ONNX
    model = tf.keras.models.load_model(<h5 model>)
    onnx_model, _ = tf2onnx.convert.from_keras(model)

    # Pin the dynamic batch dimension of every graph input to a fixed size
    BATCH_SIZE = <batch_size>
    for graph_input in onnx_model.graph.input:
        batch_dim = graph_input.type.tensor_type.shape.dim[0]
        batch_dim.dim_value = BATCH_SIZE

    onnx.save_model(onnx_model, <model_name>)

The code above is excerpted from

Next, we simplify the ONNX model with the following command:

    python3 -m onnxsim <onnx_model> <onnx_model>_sim.onnx

The onnxsim pass works around the INT64 issue (TensorRT does not natively support INT64 weights); the onnxsim package is downloaded from


Finally, we run trtexec to convert the ONNX model into a TensorRT engine:

    trtexec --onnx=<onnx_simplified_model> --saveEngine=<trt_engine.trt> --explicitBatch

The steps above work well on our Jetson Xavier NX platform. However, when we run the TensorRT engine to infer on our data, a problem occurs.

In detail, our .h5 model is a custom YOLOv4 model whose outputs are shaped as

    [(batch_size, 64, 64, 3, 12),
     (batch_size, 32, 32, 3, 12),
     (batch_size, 16, 16, 3, 12)]

However, the TensorRT engine gives

    [(batch_size, 16, 16, 3, 12),
     (batch_size, 32, 32, 3, 12),
     (batch_size, 64, 64, 3, 12)]

which is the opposite order. Furthermore, the output values are also inconsistent.
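For the ordering part of the problem, one workaround (my own sketch, not from the thread) is to match the engine outputs back to the expected order by shape, since TensorRT binding order is not guaranteed to follow the Keras output order. This does not explain the value mismatch, only the ordering:

```python
# Sketch: reorder engine outputs by shape, since TensorRT binding
# order need not match the Keras model's output order.
import numpy as np

batch_size = 1
expected_shapes = [(batch_size, 64, 64, 3, 12),
                   (batch_size, 32, 32, 3, 12),
                   (batch_size, 16, 16, 3, 12)]

# Stand-in for what the engine returned (here: reversed order).
engine_outputs = [np.zeros(s, dtype=np.float32)
                  for s in reversed(expected_shapes)]

# Match each expected shape to the binding that produced it.
by_shape = {o.shape: o for o in engine_outputs}
ordered = [by_shape[s] for s in expected_shapes]
print([o.shape for o in ordered] == expected_shapes)  # True
```

This only works when the three output shapes are distinct, which is the case for the YOLOv4 heads above.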

I cannot figure out which step is incorrect, and I’d appreciate some help.

There has been no update from you for a while, so we assume this is no longer an issue.
Hence we are closing this topic. If you need further support, please open a new one.


Could you check whether you get the correct output without running the simplification step?