Inferences queue up on one GPU instead of running on two GPUs


This is a follow-up to my earlier issue, "How to do two different inference with TensorRT on two different GPU on same machine or PC".

I set things up as you suggested in that post, and with that I am able to load the model on both GPUs. During inference, however, both inferences run on one GPU and I get a queuing error, instead of each inference running separately on its own GPU.
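For anyone trying to reproduce this, one way to check whether the two inferences are sharing a single CUDA context is to run each one in its own process pinned to a single GPU. The following is only a minimal sketch under assumptions, not the code from my application: `build_worker_env` and `launch_workers` are hypothetical helper names, and the script passed in is a placeholder for a script that loads the engine and runs inference. It relies on the `CUDA_VISIBLE_DEVICES` environment variable, which restricts the GPUs a process can see.

```python
import os
import subprocess
import sys


def build_worker_env(gpu_index, base_env=None):
    """Return a copy of the environment that exposes only one physical GPU."""
    env = dict(os.environ if base_env is None else base_env)
    # CUDA reads this variable when it initializes in the child process, so the
    # child sees only this GPU, and sees it as device 0.
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env


def launch_workers(script, gpu_indices):
    """Start one Python process per GPU, wait for all, and return exit codes."""
    procs = [
        subprocess.Popen([sys.executable, script], env=build_worker_env(i))
        for i in gpu_indices
    ]
    return [p.wait() for p in procs]
```

Calling `launch_workers("infer.py", [0, 1])` would start two processes, where `infer.py` is a placeholder name for the inference script. Inside each process the selected GPU is device 0, so the script itself does not need to choose a device.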


TensorRT Version: 8.3.2
GPU Type: RTX A2000
Nvidia Driver Version:
CUDA Version: 11.2
CUDNN Version: 8.4
Operating System + Version: Windows
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):
