How to run two different inferences with TensorRT on two different GPUs on the same machine or PC

Description

I converted a PyTorch model to a TensorRT engine, and I am doing inference with TensorRT in C++ on a GPU (NVIDIA RTX A2000).

I want to run two parallel inferences on two different GPUs on the same machine.
How can that be done? Please explain.

Hi,

We hope the following steps may help you; a minimal C++ sketch of these steps is shown after the list.

  • Create two TensorRT engine/execution-context pairs, one per GPU; each GPU must deserialize and run its own copy of the engine.
  • Set the CUDA device (cudaSetDevice) to the corresponding GPU before creating and using each pair.
  • Create two CUDA streams, one for each execution context.
  • Enqueue the inference requests for each execution context on its corresponding CUDA stream.
  • Synchronize both CUDA streams to wait for all of the inference requests to complete.
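
Below is a minimal sketch of these steps, assuming TensorRT 8.x (the enqueueV2 binding API), a single-input/single-output model, and one serialized engine file loaded on each GPU. The file name "model.engine", the loadFile() helper, the binding layout, and the buffer sizes are hypothetical placeholders; adapt them to your model.

```cpp
#include <NvInfer.h>
#include <cuda_runtime_api.h>
#include <fstream>
#include <iostream>
#include <vector>

// Minimal logger required to create a TensorRT runtime.
class Logger : public nvinfer1::ILogger {
    void log(Severity severity, const char* msg) noexcept override {
        if (severity <= Severity::kWARNING) std::cout << msg << std::endl;
    }
} gLogger;

// Per-GPU inference state: one runtime/engine/execution context and one stream each.
struct GpuContext {
    nvinfer1::IRuntime* runtime{};
    nvinfer1::ICudaEngine* engine{};
    nvinfer1::IExecutionContext* context{};
    cudaStream_t stream{};
    void* bindings[2]{};  // [0] = input, [1] = output (assumed binding order)
};

// Placeholder helper: read a serialized engine plan from disk.
static std::vector<char> loadFile(const char* path) {
    std::ifstream f(path, std::ios::binary | std::ios::ate);
    std::vector<char> buf(static_cast<size_t>(f.tellg()));
    f.seekg(0);
    f.read(buf.data(), buf.size());
    return buf;
}

// Steps 1-3: bind to one GPU, deserialize the engine on it, create an
// execution context and a CUDA stream, and allocate device buffers there.
static GpuContext setup(int device, const char* enginePath,
                        size_t inBytes, size_t outBytes) {
    GpuContext g;
    cudaSetDevice(device);  // subsequent CUDA/TensorRT calls target this GPU
    std::vector<char> plan = loadFile(enginePath);
    g.runtime = nvinfer1::createInferRuntime(gLogger);
    g.engine  = g.runtime->deserializeCudaEngine(plan.data(), plan.size());
    g.context = g.engine->createExecutionContext();
    cudaStreamCreate(&g.stream);
    cudaMalloc(&g.bindings[0], inBytes);   // input buffer on this GPU
    cudaMalloc(&g.bindings[1], outBytes);  // output buffer on this GPU
    return g;
}

int main() {
    // Hypothetical sizes; real code should query the engine's tensor shapes.
    const size_t inBytes  = 3 * 224 * 224 * sizeof(float);
    const size_t outBytes = 1000 * sizeof(float);

    GpuContext g0 = setup(0, "model.engine", inBytes, outBytes);  // GPU 0
    GpuContext g1 = setup(1, "model.engine", inBytes, outBytes);  // GPU 1

    // (Copy the two inputs into g0.bindings[0] / g1.bindings[0] here,
    // e.g. with cudaMemcpyAsync on the matching device and stream.)

    // Step 4: enqueue both inferences; each runs asynchronously on its own
    // GPU and stream, so the two GPUs execute in parallel.
    cudaSetDevice(0);
    g0.context->enqueueV2(g0.bindings, g0.stream, nullptr);
    cudaSetDevice(1);
    g1.context->enqueueV2(g1.bindings, g1.stream, nullptr);

    // Step 5: synchronize both streams to wait for the results.
    cudaSetDevice(0);
    cudaStreamSynchronize(g0.stream);
    cudaSetDevice(1);
    cudaStreamSynchronize(g1.stream);
    return 0;
}
```

An alternative design is to drive each GPU from its own CPU thread, calling cudaSetDevice once at the start of each thread; that avoids interleaving device switches on a single thread and keeps each GPU's code path independent.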

Thank you.

