[executionContext.cpp::executeInternal::652] Error Code 1: Cuda Runtime (an illegal memory access was encountered) | Cuda failure: 700

Description

Inference core dumps when running multiple execution contexts in parallel.
The model is an ONNX model with dynamic shapes. I created the same number of optimization profiles as execution contexts, and for each execution context I called context->setOptimizationProfile(i) before inference. From the log output you can see that the binding index for each profile and context is correct, but I have never gotten inference to succeed.
The log shows that an illegal memory access was encountered. I then checked the device buffer and the host buffer; both were allocated with the correct sizes. I cannot find any clue so far. Please help!
Thanks a lot!

The attachment is full code and model for reproducing this issue.
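
In outline, the per-context setup looks roughly like this (a minimal sketch rather than the attached code; engine, bindings, streams, and batchSize are illustrative placeholders):

int nbProfiles = engine->getNbOptimizationProfiles();
int bindingsPerProfile = engine->getNbBindings() / nbProfiles;
for (int i = 0; i < nbProfiles; ++i)
{
    nvinfer1::IExecutionContext* context = engine->createExecutionContext();
    // Select profile i before setting binding dimensions.
    context->setOptimizationProfile(i);
    // With multiple profiles, binding slots are offset per profile:
    // profile i owns slots [i * bindingsPerProfile, (i + 1) * bindingsPerProfile).
    int inputIndex = engine->getBindingIndex("image") + i * bindingsPerProfile;
    context->setBindingDimensions(inputIndex, nvinfer1::Dims4{batchSize, 3, 224, 224});
    context->enqueueV2(bindings[i], streams[i], nullptr);
}

Each bindings[i] array must hold engine->getNbBindings() pointers, with the device buffers placed in profile i's slots.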

Environment

TensorRT Version: 8.0.1.6, C++ API
GPU Type: Tesla P4
Nvidia Driver Version: 440.33.01
CUDA Version: 10.2
CUDNN Version: 8.2
Operating System + Version: Ubuntu 16.04
Python Version (if applicable): NA
TensorFlow Version (if applicable): NA
PyTorch Version (if applicable): NA
Baremetal or Container (if container which image + tag): NA

Relevant Files

The attachment contains the full code, the model, and the CMakeLists. Just modify the TensorRT path in the CMakeLists and it should build.
trt-conc.zip (15.0 MB)

Steps To Reproduce

Compile to produce the binary concurrency_test:

mkdir build && cd build
cmake ..
make -j4

Then run:
./concurrency_test ../mobilenetv1/params image softmax_0.tmp_0 1 1 2

Hi,
Could you please share the ONNX model and the script, if not shared already, so that we can assist you better?
Meanwhile, you can try a few things:

  1. Validate your model with the snippet below:

check_model.py

import sys
import onnx
# usage: python check_model.py your_model.onnx
filename = sys.argv[1]
model = onnx.load(filename)
onnx.checker.check_model(model)
  2. Try running your model with the trtexec command.

In case you are still facing the issue, please share the trtexec --verbose log for further debugging.
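
For example (the model path here is a placeholder):

trtexec --onnx=model.onnx --verbose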
Thanks!

I tried trtexec and found some clues.
For a dynamic-shape model with multiple execution contexts, --minShapes, --optShapes, and --maxShapes must all use the same batch size (explicit batch); a builder-side C++ sketch of these flags follows the error message below.
For example:

  1. This works well:
./trtexec --onnx=./mobilenetv1/params --minShapes=image:2x3x224x224 --optShapes=image:2x3x224x224 --maxShapes=image:2x3x224x224 --streams=2 --explicitBatch --shapes=image:2x3x224x224
  2. This does not:
./trtexec --onnx=./mobilenetv1/params --minShapes=image:1x3x224x224 --optShapes=image:2x3x224x224 --maxShapes=image:4x3x224x224 --streams=2 --explicitBatch --shapes=image:2x3x224x224

with error msg:

[04/07/2022-12:39:36] [E] Error[3]: [executionContext.cpp::setBindingDimensions::949] Error Code 3: Internal Error (Parameter check failed at: runtime/api/executionContext.cpp::setBindingDimensions::949, condition: mOptimizationProfile >= 0 && mOptimizationProfile < mEngine.getNbOptimizationProfiles()
)
[04/07/2022-12:39:36] [E] Inference set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8001] # ./trtexec --onnx=./mobilenetv1/params --minShapes=image:1x3x224x224 --optShapes=image:2x3x224x224 --maxShapes=image:4x3x224x224 --streams=2 --explicitBatch --shapes=image:2x3x224x224
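
For reference, the builder-side C++ equivalent of the --minShapes/--optShapes/--maxShapes flags is roughly this (a sketch, assuming builder and config already exist; addOptimizationProfile would be called once per profile the engine should contain):

nvinfer1::IOptimizationProfile* profile = builder->createOptimizationProfile();
profile->setDimensions("image", nvinfer1::OptProfileSelector::kMIN, nvinfer1::Dims4{1, 3, 224, 224});
profile->setDimensions("image", nvinfer1::OptProfileSelector::kOPT, nvinfer1::Dims4{2, 3, 224, 224});
profile->setDimensions("image", nvinfer1::OptProfileSelector::kMAX, nvinfer1::Dims4{4, 3, 224, 224});
config->addOptimizationProfile(profile);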

From the trtexec source code, I found that multiple profiles are actually not supported as of now, even in the latest version of TensorRT (8.4):

if (nOptProfiles > 1)
{
    sample::gLogWarning << "Multiple profiles are currently not supported. Running with one profile." << std::endl;
}

Isn't it correct to say that an engine serialized from a dynamic-shape model has a fixed batch size rather than a range, so I cannot use this engine for inference with a smaller batch size, only the same batch size?

Yes. Please refer to the following similar issue. Currently, --streams with dynamic shapes is not supported in TRT.

This solved my puzzle. Thanks!

