Got "Assertion `sI.count() == 1' failed" when creating an engine with INT8 calibration

Description

Hi,
I am trying to convert a Caffe model to an ICudaEngine and run TensorRT inference.
At first I created the network definition with the flag trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH; the engine was created and inference produced correct results (using context.execute_async_v2()). I then added a calibrator to create an INT8 model, and it still worked well.
Since a network created with trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH cannot do batch inference (our Caffe model has no dynamic shapes), I tried creating the network definition with trt.NetworkDefinitionCreationFlag.EXPLICIT_PRECISION. That also worked and allowed batch inference (using execute_async). But when I set a calibrator to create an INT8 model, I get this error:

…/builder/cudnnBuilderWeightConverters.cpp:163: std::vector nvinfer1::cudnn::makeConvDeconvInt8Weights(nvinfer1::ConvolutionParameters&, const nvinfer1::rt::EngineTensor&, const nvinfer1::rt::EngineTensor&, float, bool, bool): Assertion `sI.count() == 1' failed.

What does this mean, and what causes this error?
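For context, the calibrator I set follows the usual post-training-quantization batching pattern. Below is a minimal plain-Python sketch of that pattern (class and variable names are mine, not from my actual script); real code would subclass trt.IInt8EntropyCalibrator2 and return device buffer pointers from get_batch:

```python
import numpy as np

# Skeleton of an INT8 entropy calibrator. In real TensorRT code this
# would subclass trt.IInt8EntropyCalibrator2 and copy each batch to a
# CUDA device buffer; here only the batching logic is shown.
class EntropyCalibratorSketch:
    def __init__(self, data, batch_size=8):
        self.data = data            # (N, C, H, W) calibration samples
        self.batch_size = batch_size
        self.index = 0

    def get_batch_size(self):
        return self.batch_size

    def get_batch(self, names=None):
        # TensorRT keeps calling this until it returns None.
        if self.index + self.batch_size > len(self.data):
            return None
        batch = self.data[self.index:self.index + self.batch_size]
        self.index += self.batch_size
        # Real code: copy `batch` to device memory and return [int(dev_ptr)].
        return [batch]

    def read_calibration_cache(self):
        return None                 # no cache -> always calibrate fresh

    def write_calibration_cache(self, cache):
        pass                        # real code would persist the cache
```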

Environment

TensorRT Version: 7.0.0
GPU Type: gtx1070
Nvidia Driver Version: 455.45.01
CUDA Version: 11.1
CUDNN Version: 7.6
Operating System + Version: ubuntu 18.04
Python Version (if applicable): 3.6.9
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag): nvcr.io/nvidia/deepstream:5.0.1-20.09-triton

Relevant Files

Please attach or include links to any models, data, files, or scripts necessary to reproduce your issue. (Github repo, Google Drive, Dropbox, etc.)

Steps To Reproduce

Please include:

  • Exact steps/commands to build your repro
  • Exact steps/commands to run your repro
  • Full traceback of errors encountered

Hi, please refer to the links below to perform INT8 inference.

Thanks!

I have already read that repo. Since it uses a deprecated API, it cannot solve my problem:

I have also read the PDF, but it is too old (TRT version 2.1) to give me any useful information.

Hi @529683504,

EXPLICIT_PRECISION is for QAT; you cannot set an INT8 calibrator on a network created with EXPLICIT_PRECISION. For your reference:
https://docs.nvidia.com/deeplearning/tensorrt/api/python_api/infer/Core/Builder.html#networkdefinitioncreationflag
We request you to move to ONNX; the Caffe parser is deprecated and has no dynamic-shape support.
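To illustrate the flag mechanics: create_network takes a bitmask built as 1 << int(flag). Here is a sketch with a stand-in enum (the values are illustrative, not TensorRT's actual numbering) contrasting the supported PTQ combination, EXPLICIT_BATCH plus a builder-config calibrator, with EXPLICIT_PRECISION, which is reserved for QAT networks that carry their own quantization scales:

```python
from enum import IntEnum

# Stand-in for trt.NetworkDefinitionCreationFlag; the real values come
# from the TensorRT enum, these are illustrative only.
class NetworkDefinitionCreationFlag(IntEnum):
    EXPLICIT_BATCH = 0
    EXPLICIT_PRECISION = 1

def network_flags(*flags):
    # create_network expects an OR of (1 << int(flag)) bits.
    mask = 0
    for f in flags:
        mask |= 1 << int(f)
    return mask

# PTQ path: explicit-batch network; the INT8 calibrator goes on the
# builder config, not the network.
ptq_flags = network_flags(NetworkDefinitionCreationFlag.EXPLICIT_BATCH)

# QAT path: an explicit-precision network already carries its own
# quantization scales, so also setting a calibrator conflicts with it.
qat_flags = network_flags(NetworkDefinitionCreationFlag.EXPLICIT_PRECISION)

print(ptq_flags, qat_flags)  # 1 2
```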

Thank you.

Hi spolisetty,

Thanks for your response!
