ONNX to TensorRT with INT8 inference at batch size 8 fails with ERROR: genericReformat.cu (1262)

Amy_21 · April 30, 2021, 9:44am

Description

1. Get a ResNet R100 explicit-batch ONNX model and a batch-size-1 ONNX model.
2. Create a calibrator (calibration cache) using the batch-size-1 ONNX model.
3. Build a dynamic-batch-size TensorRT engine from the calibration cache and the explicit-batch ONNX model.
4. Run inference with the dynamic-batch-size TensorRT engine. Command: python3 infer.py -e …/int8/calibration/model.engine

In step 4, inference with batch size 1 works, but inference with batch size 8 fails with the error below:

[TensorRT] ERROR: ../rtSafe/cuda/genericReformat.cu (1262) - Cuda Error in executeMemcpy: 1 (invalid argument)
[TensorRT] ERROR: FAILED_EXECUTION: std::exception

The only change I made to infer.py was:

context.active_optimization_profile = 1

See the rmccorm4/tensorrt-utils repository on GitHub (useful scripts when using TensorRT).
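For readers hitting the same error after switching profiles: in TensorRT 7, an engine built with several optimization profiles exposes one set of bindings per profile, so selecting profile 1 also shifts the binding indices that must be used for set_binding_shape and for the buffer list. The sketch below only illustrates that index arithmetic and a shape-range check in plain Python. The function names and the example shapes are hypothetical, and nothing in this thread confirms that this is the cause of the error above. In real code the inputs would come from engine.num_bindings, engine.num_optimization_profiles and engine.get_profile_shape.

```python
def profile_binding_index(num_bindings, num_profiles, profile_index, binding_index):
    """Map a per-profile binding index to the engine-wide binding index.

    Assumes the TensorRT 7 layout, where the bindings of profile k occupy
    the k-th block of num_bindings // num_profiles consecutive indices.
    """
    if num_profiles <= 0 or num_bindings % num_profiles != 0:
        raise ValueError("num_bindings must be a positive multiple of num_profiles")
    per_profile = num_bindings // num_profiles
    if not 0 <= profile_index < num_profiles:
        raise IndexError("profile_index out of range")
    if not 0 <= binding_index < per_profile:
        raise IndexError("binding_index out of range")
    return profile_index * per_profile + binding_index


def shape_fits_profile(shape, min_shape, max_shape):
    """Return True if every dimension lies inside the profile's [min, max] range."""
    if not (len(shape) == len(min_shape) == len(max_shape)):
        return False
    return all(lo <= d <= hi for d, lo, hi in zip(shape, min_shape, max_shape))


# Hypothetical engine: 1 input + 1 output, built with 2 profiles (4 bindings total).
input_index = profile_binding_index(4, 2, profile_index=1, binding_index=0)
print("input binding for profile 1:", input_index)

# Hypothetical profile range of batch 1..8 for a 3x112x112 input.
print(shape_fits_profile((8, 3, 112, 112), (1, 3, 112, 112), (8, 3, 112, 112)))
```

With the real API, the index returned for the input would be passed to context.set_binding_shape together with the batch-8 shape, and the device buffers would be placed at the shifted positions in the bindings list.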

Environment

TensorRT Version: 7.0
GPU Type: DGPU
Nvidia Driver Version : 440.44
CUDA Version : 10.2
CUDNN Version : 7.6
Operating System + Version : Deepstream 5.0 container
Python Version (if applicable) : 3.6
Baremetal or Container (if container which image + tag) : nvcr.io/nvidia/deepstream :5.0.1-20.09-triton

Relevant Files

spolisetty · May 5, 2021, 5:57am

Hi @Amy_21,

Hope this helps:

github.com/NVIDIA/TensorRT

../rtSafe/cuda/genericReformat.cu (1262) - Cuda Error in executeMemcpy: 11

opened 07:17AM - 09 Mar 20 UTC · closed 02:41AM - 10 Mar 20 UTC · oyrq
API: C++ · Release: 7.x

## Description

I use dynamic shape input for onnx model in tensorRT, but I have a problem when run executeV2()

bool status = mPreprocessorContext->executeV2(preprocessorBindings.data());

error:
[E] [TRT] ../rtSafe/cuda/genericReformat.cu (1262) - Cuda Error in executeMemcpy: 11 (invalid argument)
[E] [TRT] FAILED_EXECUTION: std::exception
[E] [TRT] engine.cpp (182) - Cuda Error in ~ExecutionContext: 77 (an illegal memory access was encountered)
[E] [TRT] INTERNAL_ERROR: std::exception
[E] [TRT] Parameter check failed at: ../rtSafe/safeContext.cpp::terminateCommonContext::165, condition: cudaEventDestroy(context.start) failure.
[E] [TRT] Parameter check failed at: ../rtSafe/safeContext.cpp::terminateCommonContext::170, condition: cudaEventDestroy(context.stop) failure.
[E] [TRT] ../rtSafe/safeRuntime.cpp (32) - Cuda Error in free: 77 (an illegal memory access was encountered)
terminate called after throwing an instance of 'nvinfer1::CudaError'

Thanks!

## Environment

**TensorRT Version**: 7.0.0.11 C++ API
**GPU Type**: 2080
**Nvidia Driver Version**: 418.56
**CUDA Version**: 10.0
**CUDNN Version**: 7.6.5.32
**Operating System + Version**: ubuntu 16.04

Thank you.

NVES · May 5, 2021, 6:36am

Hi,
Please share the ONNX model and the script, if not shared already, so that we can assist you better.
Meanwhile, you can try a few things:

1) Validate your model with the snippet below.

check_model.py

import sys
import onnx

# Path to the ONNX model, passed as the first command-line argument
filename = sys.argv[1]
model = onnx.load(filename)
# Raises an exception if the model is malformed
onnx.checker.check_model(model)

2) Try running your model with the trtexec command.
https://github.com/NVIDIA/TensorRT/tree/master/samples/opensource/trtexec

If you are still facing the issue, please share the trtexec "--verbose" log for further debugging.
Thanks!
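As a rough illustration of the trtexec suggestion above for a dynamic-batch model, the sketch below assembles a TensorRT 7.x style command line with a shape range and the verbose flag. The input tensor name "input", the 3x112x112 image shape and the file names are assumptions made for the example and are not taken from this thread; substitute the values from the actual model. The helper function itself is hypothetical, only the trtexec flags are real.

```python
import shlex


def build_trtexec_command(onnx_path, input_name, chw, min_batch, opt_batch,
                          max_batch, calib_cache=None):
    """Assemble a trtexec command line for an explicit-batch ONNX model."""

    def spec(batch):
        # trtexec expects shapes as name:NxCxHxW
        dims = "x".join(str(d) for d in (batch, *chw))
        return f"{input_name}:{dims}"

    args = [
        "trtexec",
        f"--onnx={onnx_path}",
        "--explicitBatch",
        f"--minShapes={spec(min_batch)}",
        f"--optShapes={spec(opt_batch)}",
        f"--maxShapes={spec(max_batch)}",
    ]
    if calib_cache is not None:
        # INT8 mode with an existing calibration cache file
        args += ["--int8", f"--calib={calib_cache}"]
    args.append("--verbose")
    return " ".join(shlex.quote(a) for a in args)


# Hypothetical model file, input name and shape range (batch 1 to 8).
print(build_trtexec_command("model.onnx", "input", (3, 112, 112), 1, 8, 8,
                            calib_cache="calibration.cache"))
```

Redirecting the output of the resulting command to a file gives the verbose log requested above.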