Is enqueueV2 thread safe?

Description

The documentation clearly states that safe::IExecutionContext::enqueueV2 is NOT thread safe: link

But there is no such statement for the regular version, nvinfer1::IExecutionContext::enqueueV2.

Currently, I’m using multithreading:
each thread creates its own nvinfer1::IExecutionContext and cudaStream_t.
But I get wrong results containing some random values.
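
Roughly, each worker thread does something like this (a simplified sketch; the function and variable names are illustrative, not my actual code):

#include <cuda_runtime_api.h>
#include "NvInfer.h"

// Simplified per-thread setup: every worker creates its own execution
// context and CUDA stream from a shared, already-deserialized engine.
void worker(nvinfer1::ICudaEngine* engine, void* inputDev, void* outputDev)
{
    nvinfer1::IExecutionContext* ctx = engine->createExecutionContext();
    cudaStream_t stream;
    cudaStreamCreate(&stream);

    void* bindings[] = { inputDev, outputDev };  // device pointers
    ctx->enqueueV2(bindings, stream, nullptr);
    cudaStreamSynchronize(stream);

    cudaStreamDestroy(stream);
    ctx->destroy();  // TensorRT 7-style cleanup
}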

If I serialize the calls to enqueueV2 with a mutex, the results are correct (though this obviously removes the concurrency):

{
   // Serializing enqueueV2 across all threads fixes the wrong results.
   std::lock_guard<std::mutex> lk(io_mutex);
   ctx->enqueueV2(bindings.data(), stream, nullptr);
}

Environment

TensorRT Version:
GPU Type:
Nvidia Driver Version:
CUDA Version:
CUDNN Version:
Operating System + Version:
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Relevant Files

Please attach or include links to any models, data, files, or scripts necessary to reproduce your issue. (GitHub repo, Google Drive, Dropbox, etc.)

Steps To Reproduce

Please include:

  • Exact steps/commands to build your repro
  • Exact steps/commands to run your repro
  • Full traceback of errors encountered

Hi,

The link below might be useful for you.

https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__STREAM.html
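
For example, a per-thread stream can be created with the CUDA runtime API along these lines (a minimal sketch, not TensorRT-specific):

#include <cuda_runtime_api.h>

// Minimal sketch: each thread owns its own stream, so work submitted by
// one thread does not implicitly synchronize with other threads.
cudaStream_t stream;
cudaStreamCreateWithFlags(&stream, cudaStreamNonBlocking);

// ... enqueue kernels/copies on `stream` ...

cudaStreamSynchronize(stream);  // wait for this thread's work only
cudaStreamDestroy(stream);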

For multi-threading/streaming, we suggest you use DeepStream or Triton.

For more details, we recommend raising the query on the DeepStream forum, or in the issues section of the Triton Inference Server GitHub repository.

Thanks!

Based on the source code of trtexec, it looks like IExecutionContext::enqueueV2 IS thread safe, since there is no mutex around the call. Is that correct?