TRT8 engine creation from onnx fails due to AssertionError


I’m trying to convert a HuggingFace Transformers model to a TRT engine but this fails with the error:

[07/21/2021-20:51:52] [V] [TRT] --------------- Timing Runner: {ForeignNode[945 + (Unnamed Layer* 74) [Shuffle]…Add_564]} (Myelin)
trtexec: /dvs/p4/build/sw/rel/gpgpu/MachineLearning/myelin_trt8/src/compiler/…/./compiler/kernel_gen/kernel_gen_utils.hpp:190: myelin::kgen::dag_vertex_t::operand_t myelin::kgen::{anonymous}::red_partial_operand(const myelin::kgen::dag_vertex_t*): Assertion `rvtx->is_lowered_op()' failed.

Transformers version: 4.8.2

I’m using the TRT OSS build container for converting the ONNX model to TRT. I built the 8.0.1 tag of this repo using the instructions provided in the file.

NOTE: TRT engine creation for the same model using TRT succeeds.


TensorRT Version: 8.0.1 (OSS build)
GPU Type: RTX 3080 Laptop
Nvidia Driver Version: 465.31
CUDA Version: 11.3
CUDNN Version: 8.2.0
Operating System + Version: Ubuntu 20.04
Python Version (if applicable): 3.8
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 1.9.0+cu111
Baremetal or Container (if container which image + tag):

Relevant Files

Model: sebastian-hofstaetter/distilbert-dot-tas_b-b256-msmarco · Hugging Face

Steps To Reproduce

One line in one transformers file needs to be modified, because TRT doesn’t support the bool data type in the Expand operator.
In <PYTHON_LIBS>/site-packages/transformers/models/distilbert/modeling_distilbert.py, change line 183 from:

mask = (mask == 0).view(mask_reshp).expand_as(scores) # (bs, n_heads, q_length, k_length)

to:

mask = (mask == 0).int().view(mask_reshp).expand_as(scores).bool() # (bs, n_heads, q_length, k_length)
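The effect of the patch can be sketched on toy tensors (the shapes below are made up for illustration; the real model uses (bs, n_heads, q_length, k_length) as in the comment):

```python
import torch

# Sketch of the patch on toy shapes (shapes are illustrative, not the model's).
# The original line is valid PyTorch, but it exports an ONNX Expand node with a
# bool input, which the TRT parser rejects; routing through int avoids that.
mask = torch.tensor([[1, 1, 0, 0]])        # (bs=1, k_length=4) attention mask
scores = torch.zeros(1, 2, 4, 4)           # (bs, n_heads, q_length, k_length)
mask_reshp = (1, 1, 1, 4)

# Patched version: expand an int tensor, then cast back to bool.
mask = (mask == 0).int().view(mask_reshp).expand_as(scores).bool()
print(mask.shape, mask.dtype)              # torch.Size([1, 2, 4, 4]) torch.bool
```

The result is numerically identical to the original line; only the intermediate dtype seen by the exported Expand node changes.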

Then run the following code to download the model and convert it to ONNX:

import torch.onnx
from transformers import AutoModel, AutoTokenizer
model = AutoModel.from_pretrained("sebastian-hofstaetter/distilbert-dot-tas_b-b256-msmarco").cuda().eval()
tokenizer = AutoTokenizer.from_pretrained("sebastian-hofstaetter/distilbert-dot-tas_b-b256-msmarco")
sample = {k: v.cuda() for k, v in tokenizer(["Example1", "Slightly longer example2!"] * 16, return_tensors='pt', truncation=True, padding='max_length', max_length=512).items()}
torch.onnx.export(model, (sample['input_ids'], sample['attention_mask']), "distilbert_dot_tas_b.onnx", verbose=True, opset_version=13)

Run trtexec as follows:

./trtexec --onnx=/workspace/models/distilbert_dot_tas_b.onnx --saveEngine=engine.trt --explicitBatch --verbose --workspace=4096

Could you share the ONNX model and the script, if not shared already, so that we can assist you better?
Meanwhile, you can try a few things:

  1. Validate your model with the below snippet:

import onnx
filename = "yourONNXmodel"  # path to your .onnx file
model = onnx.load(filename)
onnx.checker.check_model(model)  # raises an exception if the model is invalid

  2. Try running your model with the trtexec command.

In case you are still facing the issue, request you to share the trtexec "--verbose" log for further debugging.

Thanks for getting back!
Here’s the model:
The script is mentioned in the original post, in the ONNX export step under "Steps To Reproduce".

In the meantime, I’ll give the onnx checker a try and revert with what I find.

Hi @lttazz99,

We are unable to reproduce this issue on a Tesla V100 GPU. We will try to run it on an RTX GPU.
Meanwhile, could you please share the trtexec --verbose logs with us for better debugging?

Thank you.

Thanks for getting back @spolisetty. I apologize, I provided the wrong model. Here’s the correct link:

I’ll attach the verbose logs soon.

@spolisetty PFA the verbose logs. I also ran onnx checker as suggested in the first reply and it didn’t throw any errors.
TRT8_verbose_logs.txt (449.1 KB)

Hi @lttazz99,

Thank you for sharing the model, we could reproduce the error. Please allow us some time to work on this.


This is a known issue, which will be resolved in a future release.

Thank you.

Do you mind sharing more details about this known issue, please?
It’d be helpful for others who are debugging similar issues.
