TensorRT multi-GPU with multiple threads


Hello, all. I am using TensorRT to run AI model inference in C++.
But I have hit a problem with multi-GPU and multiple threads.

  • First, I build the TensorRT engines from multiple threads (one GPU per thread).
  • Second, as we know, using multiple GPUs with TensorRT requires calling cudaSetDevice both when creating the engine and at inference time, like:

But I found that when one thread enters cudaStreamCreate, cudaMemcpy, enqueueV2 (on its inference context), or another CUDA call, and a second thread enters one at the same time, the program blocks. If I take a mutex before every inference it works, but the performance is bad. Could anyone help me?
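For reference, here is a minimal sketch of the per-thread pattern described above, assuming one engine has already been built per GPU and that device buffers are prepared elsewhere; the `worker` function and `engines` vector are placeholder names, not the original poster's code. The key points are that cudaSetDevice is per-thread state, and that each thread needs its own IExecutionContext and cudaStream_t, since execution contexts are not thread-safe.

```cpp
#include <cuda_runtime.h>
#include <NvInfer.h>
#include <thread>
#include <vector>

// Sketch only: one worker thread per GPU. Each thread selects its device
// before creating ANY CUDA resource, and owns a private execution context
// and stream. Sharing a context or the default stream across threads is a
// common cause of the serialization the post describes.
void worker(int gpuId, nvinfer1::ICudaEngine* engine) {
    cudaSetDevice(gpuId);  // per-thread: must be called in this thread

    nvinfer1::IExecutionContext* ctx = engine->createExecutionContext();
    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // Placeholder bindings: device input/output pointers allocated elsewhere.
    void* bindings[2] = {nullptr, nullptr};

    // Enqueue asynchronously on this thread's own stream; prefer
    // cudaMemcpyAsync over cudaMemcpy for the host<->device copies, because
    // plain cudaMemcpy synchronizes and will serialize the threads.
    ctx->enqueueV2(bindings, stream, nullptr);
    cudaStreamSynchronize(stream);

    cudaStreamDestroy(stream);
    ctx->destroy();
}

int main() {
    // Assumed populated elsewhere: one engine deserialized per GPU.
    std::vector<nvinfer1::ICudaEngine*> engines;

    std::vector<std::thread> threads;
    for (int gpu = 0; gpu < static_cast<int>(engines.size()); ++gpu)
        threads.emplace_back(worker, gpu, engines[gpu]);
    for (auto& t : threads) t.join();
    return 0;
}
```

With this layout no mutex around inference should be needed, since no CUDA object is shared between threads.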


TensorRT Version:
GPU Type: rtx-3070 (notebook)
Nvidia Driver Version: 470.74
CUDA Version: 11.1
CUDNN Version: 11.1
Operating System + Version: ubuntu 18.04 with linux kernel 5.4.0-99
Python Version (if applicable): no
TensorFlow Version (if applicable): no
PyTorch Version (if applicable): no
Baremetal or Container (if container which image + tag):

Relevant Files

— later, if needed.

Steps To Reproduce

The link below might be useful for you.
For multi-threading/streaming, we suggest you use DeepStream or Triton.
For more details, we recommend raising the query on the DeepStream or Triton forum.