Different num_layers when converting a BERT ONNX model to a TensorRT engine

Description

The code used for the conversion is:

import tensorrt as trt

def build_engine(model_file, max_ws=512 * 1024 * 1024, fp16=True):
    print("building engine")
    TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(TRT_LOGGER)
    builder.fp16_mode = fp16  # deprecated in TRT 7; the FP16 flag is also set on the config below
    config = builder.create_builder_config()
    config.max_workspace_size = max_ws
    if fp16:
        config.flags |= 1 << int(trt.BuilderFlag.FP16)
    # The ONNX parser requires an explicit-batch network definition.
    explicit_batch = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    network = builder.create_network(explicit_batch)
    with trt.OnnxParser(network, TRT_LOGGER) as parser:
        with open(model_file, 'rb') as model:
            parsed = parser.parse(model.read())
            if not parsed:
                # Surface parser errors instead of silently building a partial network.
                for i in range(parser.num_errors):
                    print(parser.get_error(i))
            print("network.num_layers", network.num_layers)
            last_layer = network.get_layer(network.num_layers - 1)
            #network.mark_output(last_layer.get_output(0))
            engine = builder.build_engine(network, config=config)
    return engine

The output of print("network.num_layers", network.num_layers) is:
[screenshot: printed network.num_layers]

But the engine shows only 579 num_layers:
[screenshot: engine layer count]
And when I tried to run inference with the engine, the result was also different from the PyTorch model's result (the PyTorch result matches the ONNX Runtime result, and both are correct).
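
For reference, the engine layer count mentioned above can be read back from a serialized engine with something like the minimal sketch below (bert.engine is the file written by the trtexec command further down). A lower layer count in the engine than in the network is expected in general, since TensorRT fuses layers during optimization.

# Minimal sketch: deserialize an engine file and print its layer count.
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
with open("bert.engine", "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
print("engine.num_layers", engine.num_layers)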

The “trtexec” command

trtexec --explicitBatch --onnx=bert_batch_1_sim.onnx --saveEngine=bert.engine

gave the same result as the “build_engine” function.

More information:

trtexec warning logs:
[screenshot: trtexec warning logs]

Some information about the engine (from trtexec); the information looks good, but the inference result is wrong:
[screenshot: engine information reported by trtexec]

Environment

TensorRT Version: 7.0
GPU Type: Tesla T4
Nvidia Driver Version: 410.104
CUDA Version: 10.2
CUDNN Version: 7.6.5.32
Operating System + Version: Linux(Docker container)
Python Version (if applicable): 3.6
PyTorch Version (if applicable): 1.7
Baremetal or Container (if container which image + tag): nvcr.io/nvidia/tensorrt:20.03-py3

Relevant Files

ONNX file

Steps To Reproduce

Please include:

  • Exact steps/commands to build your repro
  • Exact steps/commands to run your repro
  • Full traceback of errors encountered

Hi, could you please share the ONNX model and the script so that we can assist you better?

In the meantime, you can try validating your model with the snippet below.

check_model.py

import sys
import onnx

filename = sys.argv[1]  # path to your ONNX model
model = onnx.load(filename)
onnx.checker.check_model(model)
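
For example:

python check_model.py bert_batch_1_sim.onnx

If the model is valid, the checker produces no output; otherwise it raises a validation error.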

Alternatively, you can try running your model with the trtexec command.
https://github.com/NVIDIA/TensorRT/tree/master/samples/opensource/trtexec

Thanks!

Thanks, I’m uploading my ONNX conversion code along with my ONNX file and TensorRT engine file right now.

I have uploaded my onnx file here:
ONNX file

I have tried the trtexec command but still got 579 num_layers and a wrong inference result.

My inference code is from here:
[tensorrt-utils/infer.py at 493aa3827ff2c9886436ee4cbe60fed79d5bd263 · rmccorm4/tensorrt-utils · GitHub]
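
Roughly, that script runs the engine along these lines (a simplified sketch only, assuming an explicit-batch engine with fixed input shapes and a single optimization profile; the dummy input values below are placeholders, not the real test inputs):

# Simplified sketch of running a serialized TensorRT engine with pycuda.
import numpy as np
import pycuda.autoinit  # creates a CUDA context
import pycuda.driver as cuda
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
with open("bert.engine", "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

# Allocate one host/device buffer pair per binding, in binding order.
host_bufs, dev_bufs, bindings = [], [], []
for i in range(engine.num_bindings):
    shape = tuple(engine.get_binding_shape(i))
    dtype = trt.nptype(engine.get_binding_dtype(i))
    host = np.zeros(shape, dtype=dtype)
    dev = cuda.mem_alloc(host.nbytes)
    host_bufs.append(host)
    dev_bufs.append(dev)
    bindings.append(int(dev))

# Fill inputs with dummy values and copy them to the GPU.
for i in range(engine.num_bindings):
    if engine.binding_is_input(i):
        host_bufs[i][...] = 1
        cuda.memcpy_htod(dev_bufs[i], host_bufs[i])

context.execute_v2(bindings)

# Copy outputs back to the host and print them.
for i in range(engine.num_bindings):
    if not engine.binding_is_input(i):
        cuda.memcpy_dtoh(host_bufs[i], dev_bufs[i])
        print(engine.get_binding_name(i), host_bufs[i])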

Also, the PyTorch model and ONNX model outputs are the same, as shown below:
[screenshot: PyTorch / ONNX Runtime output]
However, the TensorRT engine output is:
[screenshot: TensorRT engine output]
Thanks!

Hi, I tried check_model.py but got no output.

Hi @lyzs1225,

We recommend you try the latest TensorRT 7.2.x.
We also have a tool, Polygraphy, to compare TRT and ONNX Runtime outputs. This might help you.

Thank you.

Hi @spolisetty ,
I tried TensorRT 7.2 in the container nvcr.io/nvidia/tensorrt:20.12-py3 but still got the wrong answer. Could you help me determine why this happens?
The onnx file is the same file.

And how do I get the comparison tool?

We also have a tool, Polygraphy, to compare TRT and ONNX Runtime outputs.

Thank you!

Hi @spolisetty ,
Any progress on how to fix this problem?
Thanks.

Hi @lyzs1225,

Please try the Polygraphy tool to compare TRT and ONNX Runtime outputs; it should help you with debugging. For your reference:
https://docs.nvidia.com/deeplearning/tensorrt/polygraphy/docs/index.html
https://github.com/NVIDIA/TensorRT/tree/master/tools/Polygraphy
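
For example, a basic comparison between TensorRT and ONNX Runtime for this model could look like the following (the filename is taken from the trtexec command above):

polygraphy run bert_batch_1_sim.onnx --trt --onnxrt

This builds a TensorRT engine from the ONNX model, runs the same input through both backends, and reports whether the outputs match.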

Thank you.

Hi @spolisetty ,

The polygraphy tool shows the same wrong results as I got before.

Hi @spolisetty ,
Could you help me find out why the TRT inference gives a different answer?

Hi @lyzs1225,

Please allow us some time. We are looking into this issue.

Thanks

Hi @spolisetty ,
Any progress on this?