I’ve got some problems converting a TensorFlow model in .h5 format to a TensorRT engine.

In particular, we first convert the .h5 model to ONNX format with the following Python script:

```python
import tensorflow as tf
import tf2onnx.convert
import onnx

# Load the Keras model and convert it to ONNX.
model = tf.keras.models.load_model(<h5 model>)
onnx_model, _ = tf2onnx.convert.from_keras(model)

# Fix the (dynamic) batch dimension of every graph input to a static size.
BATCH_SIZE = <batch_size>
for input in onnx_model.graph.input:
    dim1 = input.type.tensor_type.shape.dim[0]
    dim1.dim_value = BATCH_SIZE

onnx.save_model(onnx_model, <model_name>)
```

The code above is excerpted from

https://docs.nvidia.com/deeplearning/tensorrt/quick-start-guide/index.html#export-from-tf

Next, we simplify the ONNX model by running:

```shell
python3 -m onnxsim <onnx_model> <onnx_model>_sim.onnx
```

We apply the onnxsim transformation because of the INT64 weights issue (TensorRT does not natively support INT64 and casts such tensors down to INT32, with a warning). The onnxsim package is downloaded from

.

Finally, we run trtexec to convert the simplified ONNX model to a TensorRT engine:

```shell
trtexec --onnx=<onnx_simplified_model> --saveEngine=<trt_engine.trt> --explicitBatch
```

All the steps above complete without errors on our Jetson Xavier NX platform. However, when we run the resulting TensorRT engine for inference on our data, a problem occurs.

In detail, our .h5 model is a custom YOLOv4 model whose three outputs have the shapes

```
[(batch_size, 64, 64, 3, 12),
(batch_size, 32, 32, 3, 12),
(batch_size, 16, 16, 3, 12)]
```

However, the TensorRT engine returns

```
[(batch_size*16*16*3*12,),
(batch_size*32*32*3*12,),
(batch_size*64*64*3*12,)]
```

that is, the outputs are flattened and returned in the reverse order. Furthermore, the output values are also inconsistent with the Keras model’s.
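On the shape side, the flat buffers can be matched back to the expected 5-D shapes by element count, since the three grid sizes give distinct sizes. A minimal NumPy sketch (the buffers here are simulated, not real engine output):

```python
import numpy as np

batch_size = 1
# Output shapes of the Keras model, in the order we expect them.
expected_shapes = [(batch_size, 64, 64, 3, 12),
                   (batch_size, 32, 32, 3, 12),
                   (batch_size, 16, 16, 3, 12)]

# Simulated flat TensorRT outputs, in the order the engine returned
# them (smallest grid first, i.e. reversed relative to the Keras model).
trt_outputs = [np.zeros(batch_size * g * g * 3 * 12, dtype=np.float32)
               for g in (16, 32, 64)]

# Match each flat buffer to the expected shape with the same element
# count, then reshape. This recovers the 5-D tensors regardless of the
# order in which the engine happens to return its bindings.
restored = []
for shape in expected_shapes:
    n = int(np.prod(shape))
    flat = next(o for o in trt_outputs if o.size == n)
    restored.append(flat.reshape(shape))

print([r.shape for r in restored])
# → [(1, 64, 64, 3, 12), (1, 32, 32, 3, 12), (1, 16, 16, 3, 12)]
```

Note that this only fixes the layout and ordering; it cannot explain the value mismatch, which would have to come from a difference in preprocessing or in one of the conversion steps.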

I cannot figure out which step is incorrect, and I’d appreciate any help.