Real Time Inference with Multi GPU - Multiple Model

kaancolak95 · January 27, 2020, 3:51pm

Hi everyone,

I have multiple GPUs and several different models. I converted the models to TensorRT for high performance, and I need the highest FPS possible. Which is more suitable for this problem: using the NVIDIA Inference Server, or assigning each TensorRT model to a specific GPU myself? I will be sending real-time sensor data over ROS, and since the inference server is made for data centers, I am a little confused about whether it fits.

Thanks.

David_Goodwin · January 29, 2020, 7:20pm

You can either use TRTIS to serve your TensorRT models or write a custom application that uses the TensorRT APIs to serve your models. The benefit of TRTIS is that it is easy to use and gives you many options like multi-instance, multi-GPU, dynamic batching, etc. Of course, you could write those features into your own custom application, but that is probably time and effort you don’t want to spend.

Note that TRTIS allows you to map different models to different GPUs if that is what you want. Look at the protobuf and the documentation on the instance groups feature (the instance_group setting) in the model configuration.
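
As a rough illustration of that model-to-GPU mapping, here is a minimal Python sketch that generates the instance_group portion of a config.pbtxt for each model. The model names ("detector", "segmenter"), the GPU placement, and the max_batch_size value are all hypothetical, and the input/output tensor sections a real configuration may need are left out, so treat this as a starting point to check against the model configuration documentation rather than a complete setup.

```python
def instance_group_config(model_name, gpu_ids, count=1, max_batch_size=8):
    """Build a partial config.pbtxt that pins a model to specific GPUs.

    max_batch_size=8 is an assumed value; it has to match how the
    TensorRT engine was actually built. Input/output definitions are
    intentionally omitted from this sketch.
    """
    gpus = ", ".join(str(g) for g in gpu_ids)
    return (
        f'name: "{model_name}"\n'
        'platform: "tensorrt_plan"\n'
        f"max_batch_size: {max_batch_size}\n"
        "instance_group [\n"
        "  {\n"
        f"    count: {count}\n"      # execution instances per listed GPU
        "    kind: KIND_GPU\n"
        f"    gpus: [ {gpus} ]\n"    # GPU indices this model may run on
        "  }\n"
        "]\n"
    )


# Hypothetical placement: one model per GPU.
placement = {"detector": [0], "segmenter": [1]}

configs = {
    name: instance_group_config(name, gpus)
    for name, gpus in placement.items()
}

for name, text in configs.items():
    # In a model repository each text would be saved as <name>/config.pbtxt.
    print(f"# {name}/config.pbtxt")
    print(text)
```

Raising count above 1 asks for several execution instances of the same model on each listed GPU, which is the multi-instance option mentioned above. Whether that actually improves FPS depends on the models and the hardware, so it is worth measuring.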
