Description
While running trtexec with the following command:
trtexec --loadEngine=whisper-tiny-decoder-sim.engine --exportProfile=whisper-tiny-decoder-sim.engine.profile.json --verbose --noDataTransfers --useCudaGraph --separateProfileRun
I got the following error:
[08/28/2023-06:09:53] [E] Error[1]: [executionContext.cpp::syncShapeBindingsToDevice::1990] Error Code 1: Cuda Runtime (context is destroyed)
trtexec: samples/common/sampleInference.cpp:844: void sample::{anonymous}::Iteration<ContextType>::createEnqueueFunction(const sample::InferenceOptions&, nvinfer1::IExecutionContext&, sample::Bindings&) [with ContextType = nvinfer1::IExecutionContext]: Assertion `ret' failed.
I found that without the --separateProfileRun option, the same command runs successfully.
However, I believe the latency measurement is more accurate with this option enabled, since it performs per-layer profiling in a separate pass so the profiler overhead does not distort the timing run.
Is this a bug, or is there a recommended workaround?
Environment
TensorRT Version: 8.5.2-1+cuda11.8
GPU Type: RTX A6000
Nvidia Driver Version: 510.108.03
CUDA Version: 12.0
CUDNN Version: 8
Operating System + Version: Ubuntu 20.04
Python Version (if applicable): 3.8.10
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag): nvcr.io/nvidia/tensorrt 23.01-py3
Relevant Files
Attaching the input ONNX file: [link]
Steps To Reproduce
- Build trtexec in the above Docker image
- Run:
trtexec --loadEngine=whisper-tiny-decoder-sim.engine --exportProfile=whisper-tiny-decoder-sim.engine.profile.json --verbose --noDataTransfers --useCudaGraph --separateProfileRun
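For context, the goal of the failing run is the --exportProfile JSON. Once the run succeeds, a small script can summarize it; this is only a sketch, assuming the file is a JSON array whose per-layer entries carry "name" and "averageMs" fields (a leading {"count": N} record may also be present, and exact field names can vary across TensorRT versions):

```python
import json

def summarize_profile(path):
    """Return (total_ms, layers), layers sorted by average time, descending.

    Assumes trtexec --exportProfile output: a JSON array where layer
    records have "name" and "averageMs" keys (field names are an
    assumption and may differ between TensorRT versions).
    """
    with open(path) as f:
        records = json.load(f)
    # Skip non-layer records such as a leading {"count": N} entry.
    layers = [r for r in records if "averageMs" in r]
    layers.sort(key=lambda r: r["averageMs"], reverse=True)
    total_ms = sum(r["averageMs"] for r in layers)
    return total_ms, layers
```

Usage would be, e.g., summarize_profile("whisper-tiny-decoder-sim.engine.profile.json") to list the slowest layers first.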