ONNX to TensorRT with INT8 inference and batch size 8 fails with ERROR: genericReformat.cu (1262)


1. Get a ResNet R100 explicit-batch ONNX model and a batch-size-1 ONNX model.
2. Create a calibrator (calibration cache) using the batch-size-1 ONNX model.
3. Build a dynamic-batch-size TRT engine using the cache and the explicit-batch ONNX model.
4. Run inference with the dynamic-batch-size TRT engine. Command: python3 infer.py -e …/int8/calibration/model.engine

During step 4, TRT inference with batch size 1 works, but inference with batch size 8 fails with the error below:

[TensorRT] ERROR: ../rtSafe/cuda/genericReformat.cu (1262) - Cuda Error in executeMemcpy: 1 (invalid argument)
[TensorRT] ERROR: FAILED_EXECUTION: std::exception

The only change I made to infer.py was this line:

context.active_optimization_profile = 1

See the rmccorm4/tensorrt-utils repository on GitHub ("Useful scripts when using TensorRT").
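For context, switching to a non-default optimization profile usually involves more than setting the attribute: with several profiles in one engine, TensorRT 7 replicates the bindings per profile, so profile k uses its own block of binding indices, and the input shape has to be set on the input binding of that block. The following is a minimal sketch of that bookkeeping, not the actual infer.py code. The helper names and the example shape (8, 3, 112, 112) are assumptions made for illustration, and the sketch assumes every input of the profile takes the same shape. It is not confirmed that this is the cause of the error above.

```python
def binding_indices_for_profile(engine, profile_index):
    """Return (input_indices, output_indices) for one optimization profile.

    Assumes the engine lays out bindings as one contiguous block per
    profile, which is how TensorRT 7 handles multiple profiles.
    """
    per_profile = engine.num_bindings // engine.num_optimization_profiles
    start = profile_index * per_profile
    indices = range(start, start + per_profile)
    inputs = [i for i in indices if engine.binding_is_input(i)]
    outputs = [i for i in indices if not engine.binding_is_input(i)]
    return inputs, outputs


def select_profile_and_shape(engine, context, profile_index, input_shape):
    """Activate a profile and set the same shape on each of its inputs.

    `input_shape` is a hypothetical example such as (8, 3, 112, 112);
    it must lie within the min/max range the profile was built with.
    """
    context.active_optimization_profile = profile_index
    inputs, outputs = binding_indices_for_profile(engine, profile_index)
    for idx in inputs:
        context.set_binding_shape(idx, input_shape)
    if not context.all_binding_shapes_specified:
        raise RuntimeError("Some input binding shapes are still unset")
    return inputs, outputs
```

After a call such as `select_profile_and_shape(engine, context, 1, (8, 3, 112, 112))`, the device buffers would be sized from `context.get_binding_shape(idx)` for the returned indices, and the bindings list passed to execution would still need one entry per engine binding, with the entries for the inactive profile left unused.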


TensorRT Version: 7.0
NVIDIA Driver Version: 440.44
CUDA Version: 10.2
CUDNN Version: 7.6
Operating System + Version: Deepstream 5.0 container
Python Version (if applicable): 3.6
Baremetal or Container (if container which image + tag): nvcr.io/nvidia/deepstream:5.0.1-20.09-triton

Relevant Files

Hi @Amy_21,

Hope this will help you,

Thank you.

Please share the ONNX model and the script, if you have not already, so that we can assist you better.
In the meantime, you can try a few things:

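  1. Validate your model with the snippet below:

import onnx

filename = "yourONNXmodel"  # placeholder: path to your ONNX model
model = onnx.load(filename)
# Raises an exception if the model is not a valid ONNX graph
onnx.checker.check_model(model)
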
  2. Try running your model with the trtexec command.

If you are still facing the issue, please share the trtexec --verbose log for further debugging.
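As a rough illustration of the trtexec step, the sketch below assembles a command line with verbose logging enabled. The file name `model.onnx`, the input tensor name `data`, and the shape 8x3x112x112 are placeholders assumed for the example, not values taken from this thread. The `--shapes` argument is optional and only applies to models with dynamic input dimensions.

```python
import shlex


def build_trtexec_command(onnx_path, input_name=None, input_shape=None):
    """Assemble a trtexec argument list with verbose logging enabled."""
    cmd = ["trtexec", f"--onnx={onnx_path}", "--explicitBatch", "--verbose"]
    if input_name and input_shape:
        # trtexec expects dimensions joined by "x", e.g. data:8x3x112x112
        dims = "x".join(str(d) for d in input_shape)
        cmd.append(f"--shapes={input_name}:{dims}")
    return cmd


# Placeholder values for illustration only
cmd = build_trtexec_command("model.onnx", "data", (8, 3, 112, 112))
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The printed line can be pasted into a shell, or the list can be handed to `subprocess.run(cmd)` with the output redirected to a file to capture the log.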