TensorRT INT8 precision engine build fails for models with a custom layer (BatchedNMSDynamic_TRT)

Description

I have a set of object detection models (backbone: HarDNet; heads: SSD, YOLO, CenterNet) in ONNX. For all of these models I am able to build FP32/FP16 engines, but with INT8 only the CenterNet head builds; the SSD and YOLO heads fail with the error below:

 [TensorRT] VERBOSE: *************** Autotuning format combination: Int8(1,128,16384:4,589824) -> Int8(1,128,16384:32,65536) ***************
[TensorRT] INTERNAL ERROR: Assertion failed: std::all_of(std::begin(hostScales) + cStart, std::begin(hostScales) + cStart + cExtent, [scale](float x) { return x == scale; })
helpers.cpp:570
Aborting...
[TensorRT] VERBOSE: Builder timing cache: created 350 entries, 36 hit(s)
[TensorRT] ERROR: helpers.cpp (570) - Assertion Error in scaleInt8ByTensor: 0 (std::all_of(std::begin(hostScales) + cStart, std::begin(hostScales) + cStart + cExtent, [scale](float x) { return x == scale; }))
Traceback (most recent call last):

The only difference between the SSD/YOLO heads and the CenterNet head is that the latter does not have any custom layer such as NMS.

Environment

TensorRT Version: 7.1
GPU Type: 2080Ti
Nvidia Driver Version:
CUDA Version: 11.2
CUDNN Version:
Operating System + Version: Ubuntu 16.04
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 1.7
Baremetal or Container (if container which image + tag): nvcr.io/nvidia/tensorrt:21.03-py3

Relevant Files:

ONNX file: https://drive.google.com/file/d/1P-wouXBoE7pcDU62-7cNkp11OUZVTHqs/view?usp=sharing
Verbose log of the engine build: VERBOSE.log - Google Drive

Hi @sksenthilkumar93,

We recommend that you try the latest TensorRT version. We could not reproduce this issue on the TensorRT 8.0 EA version using trtexec.

&&&& PASSED TensorRT.trtexec [TensorRT v8000] # trtexec --onnx=./yolov2.onnx --workspace=3000 --verbose --saveEngine=yolov2.trt
[06/10/2021-11:26:53] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 1206, GPU 1412 (MiB)

Thank you.

Hi, thanks for the reply. After your suggestion I checked with trtexec and it works for me as well, but it does not work through the Python API, and I need it to work with the Python API.

I have attached a minimal script that can be used to reproduce this error:

import argparse
import pathlib
import tensorrt as trt

ap = argparse.ArgumentParser()
ap.add_argument("--onnx_file", required=True)
ap.add_argument("--verbose", action='store_true')
ap.add_argument("--calib_file",
                help="(INT8 ONLY) Already created calib file or path to save the calib file",
                default=None)


class PythonEntropyCalibrator(trt.IInt8EntropyCalibrator2):
    def __init__(self, cache_file):
        trt.IInt8EntropyCalibrator2.__init__(self)

        self.cache_file = cache_file
        self.batch_size = 1

    def get_batch_size(self):
        return self.batch_size

    def read_calibration_cache(self):
        # If there is a cache, use it instead of calibrating again. Otherwise, implicitly return None.
        if pathlib.Path(self.cache_file).exists():
            with open(self.cache_file, "rb") as f:
                return f.read()
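
    def get_batch(self, names):
        # TensorRT's calibrator interface also expects get_batch; actual
        # calibration data loading is omitted here because this repro relies
        # on the pre-generated cache file, so returning None signals that
        # there are no batches to calibrate with.
        return None

    def write_calibration_cache(self, cache):
        # Persist the calibration table so later builds can reuse it.
        with open(self.cache_file, "wb") as f:
            f.write(cache)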


def build_engine(onnx_file_path,
                 calibrator=None,
                 trt_logger=trt.Logger(trt.Logger.VERBOSE)):
    # initialize TensorRT engine and parse ONNX model
    print(onnx_file_path)
    builder = trt.Builder(trt_logger)
    ncf = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    network = builder.create_network(ncf)
    parser = trt.OnnxParser(network, trt_logger)

    # parse ONNX and surface parser errors instead of continuing silently
    with open(onnx_file_path, 'rb') as model:
        print('Beginning ONNX file parsing')
        if not parser.parse(model.read()):
            for i in range(parser.num_errors):
                print(parser.get_error(i))
            raise RuntimeError('Failed to parse the ONNX file')
    print('Completed parsing of ONNX file')
    config = builder.create_builder_config()
    config.max_workspace_size = 2 << 30

    assert calibrator is not None, "Please pass an object of trt.IInt8EntropyCalibrator2 as calibrator"
    builder.max_batch_size = calibrator.get_batch_size()
    config.int8_calibrator = calibrator
    config.set_flag(trt.BuilderFlag.INT8)
    config.set_flag(trt.BuilderFlag.STRICT_TYPES)
    # builder.int8_calibrator = calibrator

    print('Building an engine...')
    engine = builder.build_engine(network, config)
    context = engine.create_execution_context()
    print("Completed creating Engine")

    return engine, context


def get_onnx_file_name(onnx_file):
    return pathlib.Path(onnx_file).stem


def get_onnx_parent_directory(onnx_file):
    return pathlib.Path(onnx_file).parent


def main(onnx_file, verbose=False, int8_calibrator=None):

    if verbose:
        trt_logger = trt.Logger(trt.Logger.VERBOSE)
    else:
        trt_logger = trt.Logger(trt.Logger.WARNING)

    engine, context = build_engine(onnx_file, int8_calibrator, trt_logger=trt_logger)

    return engine


if __name__ == '__main__':
    args = ap.parse_args()
    calib = None

    if args.calib_file is None:
        cache_file_name = f"{get_onnx_file_name(args.onnx_file)}.cache"
    else:
        cache_file_name = args.calib_file

    cache_full_path = get_onnx_parent_directory(args.onnx_file).joinpath(cache_file_name)
    calib = PythonEntropyCalibrator(cache_full_path)

    main(args.onnx_file, args.verbose, int8_calibrator=calib)
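
For reference, assuming the script is saved as build_int8_engine.py (the filename is arbitrary) and yolov2.cache sits next to the ONNX file, it can be invoked as:

python build_int8_engine.py --onnx_file ./yolov2.onnx --verbose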

Needed files:
ONNX file: https://drive.google.com/file/d/1P-wouXBoE7pcDU62-7cNkp11OUZVTHqs/view?usp=sharing
Cache file: yolov2.cache - Google Drive

Output that I got:
Verbose log of the engine build: VERBOSE.log - Google Drive

Please let me know if you get the same error, and if so, how it can be solved.

NOTE: I’m working with TensorRT 7.2 because that is the latest version available as a Docker image.

Hi @sksenthilkumar93,

Thank you for sharing the files. We tried to reproduce the error and identified that the ONNX model is invalid.
You can check whether an ONNX model is valid using:

import sys
import onnx
filename = "./yolov2.onnx"
model = onnx.load(filename)
onnx.checker.check_model(model)

Running this check gives:

Traceback (most recent call last):
  File "check.py", line 4, in <module>
    model = onnx.load(filename)
  File "/usr/local/lib/python3.8/dist-packages/onnx/__init__.py", line 121, in load_model
    model = load_model_from_string(s, format=format)
  File "/usr/local/lib/python3.8/dist-packages/onnx/__init__.py", line 158, in load_model_from_string
    return _deserialize(s, ModelProto())
  File "/usr/local/lib/python3.8/dist-packages/onnx/__init__.py", line 99, in _deserialize
    decoded = cast(Optional[int], proto.ParseFromString(s))
google.protobuf.message.DecodeError: Error parsing message

Using trtexec,

[libprotobuf ERROR google/protobuf/text_format.cc:298] Error parsing text-format onnx2trt_onnx.ModelProto: 1:1: Invalid control characters encountered in text.
[libprotobuf ERROR google/protobuf/text_format.cc:298] Error parsing text-format onnx2trt_onnx.ModelProto: 1:3: Expected identifier, got: :
[06/15/2021-09:10:47] [E] [TRT] ModelImporter.cpp:700: Failed to parse ONNX model from file: ./yolov2.onnx
[06/15/2021-09:10:47] [E] Failed to parse onnx file
[06/15/2021-09:10:47] [E] Parsing model failed
[06/15/2021-09:10:47] [E] Engine creation failed
[06/15/2021-09:10:47] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8000] # trtexec --onnx=./yolov2.onnx --verbose --saveEngine=g1.trt

If you need further assistance with generating a correct ONNX model, we recommend that you post your concern here to get better help.

Thank you.

Thanks for the reply. If the ONNX file is corrupted, then how come it worked on your first try, and also without the FP16 flag?

@sksenthilkumar93,

It looks like you modified the ONNX model; I tried with both ONNX models.
The recent one seems to be an invalid ONNX file.
With the previous model I am facing different errors than the ones you mentioned when running the script you shared (without the cache file). We recommend that you install the latest TensorRT version, generate a valid ONNX model, and, if you still face any issues, share a new reproducible model/script.

Thank you.

I’ll definitely try it, but can’t we fix this for TensorRT 7.2? The TensorRT 8 Docker image is not yet available, and I need to run my applications in NVIDIA Docker containers and also on Jetson devices, which as far as I know take a lot longer to support TensorRT 8. I need it now for a research paper I’m working on. Thanks.

@sksenthilkumar93,
You can test on 7.2. We will help with this if there is a workaround.

Thanks. I get the same error with TensorRT 7.2 as well.

@sksenthilkumar93,

As mentioned in my previous reply, could you please share a valid ONNX model with an issue repro for better assistance? Please make sure your ONNX model works correctly with ONNX Runtime.
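
For a quick sanity check with ONNX Runtime, a minimal sketch could look like the following (the input shape is a placeholder, so substitute your model's actual dimensions; note that graphs containing TensorRT plugin ops such as BatchedNMSDynamic_TRT will not load in stock ONNX Runtime):

import numpy as np
import onnxruntime as ort

# Load the model on CPU and inspect its declared input.
sess = ort.InferenceSession("./yolov2.onnx", providers=["CPUExecutionProvider"])
inp = sess.get_inputs()[0]
print("input:", inp.name, inp.shape)

# Placeholder input shape -- replace with the model's real dimensions.
dummy = np.random.rand(1, 3, 512, 512).astype(np.float32)
outputs = sess.run(None, {inp.name: dummy})
print([o.shape for o in outputs])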

Thank you.

This one works for me. Please let me know if you run into a problem again.

https://drive.google.com/file/d/1ZPTDXcDSR1eiGyaZNSdwor4qQrRCl_si/view?usp=sharing

@sksenthilkumar93,

We are getting a different error, related to the plugin, when we try on TRT 7.2.
When we try on TRT 8.0 we are unable to reproduce the error and can run the script successfully. Currently we do not have a simple workaround for this issue on TRT 7.2.

We recommend that you wait a few days until we get the TRT 8.0 container, or you can set up 8.0 EA locally or on an NGC container by following the installation guide.

Thank you.