How to run inference with TensorRT on multiple GPUs in Python


Hi, I have two different TensorRT models. I want to run TRT model A on GPU 1 and TRT model B on GPU 2 with Python.
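One common pattern is to run each engine in its own process and pin each process to a single GPU via `CUDA_VISIBLE_DEVICES` before any CUDA library is initialized. The sketch below uses only the standard library; the engine paths are hypothetical, and the actual engine loading/execution (e.g. with `tensorrt` + `pycuda`) is left as a comment since it depends on your setup:

```python
import multiprocessing as mp
import os

def run_engine(gpu_id, engine_path):
    # Restrict this process to one GPU. This must happen before CUDA is
    # initialized; inside the process the chosen GPU then appears as device 0.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    # Hypothetical: deserialize and execute the TensorRT engine here,
    # e.g. trt.Runtime(logger).deserialize_cuda_engine(open(engine_path, "rb").read())
    return os.environ["CUDA_VISIBLE_DEVICES"]

if __name__ == "__main__":
    # "spawn" so child processes do not inherit any CUDA state from the parent.
    ctx = mp.get_context("spawn")
    procs = [
        ctx.Process(target=run_engine, args=(0, "model_a.trt")),  # model A -> GPU 0
        ctx.Process(target=run_engine, args=(1, "model_b.trt")),  # model B -> GPU 1
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```

A per-process design sidesteps sharing CUDA contexts between threads; if you need both engines in one process instead, each thread must manage its own CUDA context (e.g. `pycuda.driver.Device(n).make_context()`), which is easier to get wrong.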


TensorRT Version:
GPU Type: V100
Nvidia Driver Version: 418
CUDA Version: 10.2
CUDNN Version: 8.1
Operating System + Version: Ubuntu 18.04
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Relevant Files

Please attach or include links to any models, data, files, or scripts necessary to reproduce your issue. (Github repo, Google Drive, Dropbox, etc.)

Steps To Reproduce

Please include:

  • Exact steps/commands to build your repro
  • Exact steps/commands to run your repro
  • Full traceback of errors encountered

The link below might be useful for you.
For multi-threading/streaming, we suggest using DeepStream or Triton.
For more details, we recommend raising the query in the DeepStream or Triton forum.
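As an illustration of the Triton route: each model's `config.pbtxt` can pin its instances to a specific GPU through the `instance_group` setting, so model A and model B can be served from different GPUs by one server. A minimal sketch (model layout and names are assumptions):

```
# config.pbtxt for model A -- run one instance on GPU 0
instance_group [
  { count: 1, kind: KIND_GPU, gpus: [ 0 ] }
]

# config.pbtxt for model B -- run one instance on GPU 1
instance_group [
  { count: 1, kind: KIND_GPU, gpus: [ 1 ] }
]
```

Triton then handles scheduling and CUDA streams per instance, so no per-GPU context management is needed in your Python client.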


Hi @lizcomeon,

The following link may answer your query.

Thank you.