ONNX to TensorRT with INT8 inference at batch size 8 fails with ERROR: genericReformat.cu (1262)


1. Get a ResNet R100 explicit-batch ONNX model and a batch-size-1 ONNX model.
2. Create a calibrator using the batch-size-1 ONNX model.
3. Build a dynamic-batch-size TensorRT engine using the calibration cache and the explicit-batch ONNX model.
4. Run inference with the dynamic-batch-size TensorRT engine, using the command: python3 infer.py -e …/int8/calibration/model.engine

During step 4, inference with batch size 1 works, but inference with batch size 8 fails with the error below:

[TensorRT] ERROR: ../rtSafe/cuda/genericReformat.cu (1262) - Cuda Error in executeMemcpy: 1 (invalid argument)
[TensorRT] ERROR: FAILED_EXECUTION: std::exception

The only change I made to infer.py is:

context.active_optimization_profile = 1

The script comes from the GitHub repository rmccorm4/tensorrt-utils ("Useful scripts when using TensorRT").
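One thing that may be worth checking after switching to profile 1: when an engine is built with several optimization profiles, TensorRT 7 repeats the bindings once per profile, so the binding indices for profile k start at k times the number of bindings per profile. The sketch below is plain Python with no TensorRT dependency; the helper name `profile_binding_indices` is hypothetical and only illustrates that offset arithmetic. It is not taken from infer.py.

```python
def profile_binding_indices(num_bindings, num_profiles, profile_index):
    """Return the binding indices that belong to one optimization profile.

    Assumes bindings are laid out profile by profile, i.e. profile k owns
    the indices [k * per_profile, (k + 1) * per_profile).
    """
    if num_profiles <= 0 or num_bindings % num_profiles != 0:
        raise ValueError("num_bindings must be a positive multiple of num_profiles")
    if not 0 <= profile_index < num_profiles:
        raise ValueError("profile_index is out of range")
    per_profile = num_bindings // num_profiles
    start = profile_index * per_profile
    return list(range(start, start + per_profile))


# Example: one input and one output with two profiles gives 4 bindings in total.
print(profile_binding_indices(4, 2, 0))  # profile 0 owns bindings 0 and 1
print(profile_binding_indices(4, 2, 1))  # profile 1 owns bindings 2 and 3
```

With a real engine, the inputs would come from `engine.num_bindings` and `engine.num_optimization_profiles`, and the input shape for the batch of 8 would be set on the profile's own input index with `context.set_binding_shape`. Whether this is the cause of the error here is an assumption, not something the log confirms.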


TensorRT Version: 7.0
Nvidia Driver Version : 440.44
CUDA Version : 10.2
CUDNN Version : 7.6
Operating System + Version : Deepstream 5.0 container
Python Version (if applicable) : 3.6
Baremetal or Container (if container which image + tag) : nvcr.io/nvidia/deepstream:5.0.1-20.09-triton

Please share the ONNX model and the script, if you have not already, so that we can assist you better.
In the meantime, you can try a few things:

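  1. Validate your model with the snippet below:

import onnx

# Replace with the path to your ONNX model.
filename = "yourONNXmodel"
model = onnx.load(filename)
# Raises an exception if the model is not a valid ONNX graph.
onnx.checker.check_model(model)

  2. Try running your model with the trtexec command.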

If you are still facing the issue, please share the trtexec --verbose log for further debugging.
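As a rough illustration of the trtexec suggestion, the sketch below assembles a command line for an explicit-batch ONNX model at batch size 8. The helper name `build_trtexec_command`, the file name `model.onnx`, the input tensor name `data`, and the 3x112x112 input shape are all placeholders, not values from this thread; substitute the real ones from your model. The flags used (`--onnx`, `--explicitBatch`, `--shapes`, `--verbose`) are standard trtexec options in TensorRT 7.

```python
import shlex


def build_trtexec_command(onnx_path, input_name, input_shape, verbose=True):
    """Build a trtexec argument list for an explicit-batch ONNX model."""
    # trtexec expects shapes as name:AxBxCxD
    shape = "x".join(str(dim) for dim in input_shape)
    cmd = [
        "trtexec",
        f"--onnx={onnx_path}",
        "--explicitBatch",
        f"--shapes={input_name}:{shape}",
    ]
    if verbose:
        cmd.append("--verbose")
    return cmd


# Placeholder model path, input name, and shape (batch size 8).
cmd = build_trtexec_command("model.onnx", "data", (8, 3, 112, 112))
print(" ".join(shlex.quote(part) for part in cmd))
```

The printed line can be pasted into a shell inside the container, and its output redirected to a file to produce the verbose log requested above.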