Triton Inference Server: inference on multiple GPUs and load balancing across GPUs

Inference with a TensorRT model on multiple GPUs works as expected as long as the GPUs belong to the same GPU family. Triton loads the same model on all of the GPUs when gpus: [0, 1] is specified in instance_group (config.pbtxt), and a single infer API call is enough; Triton handles the load balancing across the GPUs.
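For reference, a minimal config.pbtxt of that kind looks roughly like the sketch below (the model name and max_batch_size are placeholders for this example):

    name: "my_trt_model"
    platform: "tensorrt_plan"
    max_batch_size: 8
    instance_group [
      {
        # One model instance on each of GPU 0 and GPU 1;
        # Triton schedules incoming requests across the instances.
        count: 1
        kind: KIND_GPU
        gpus: [ 0, 1 ]
      }
    ]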

How should inference with a TensorRT model be handled on multiple GPUs when the GPUs belong to different families, and how can we achieve the same kind of optimized load balancing as above?

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU)

• DeepStream Version

• JetPack Version (valid for Jetson only)

• TensorRT Version

• NVIDIA GPU Driver Version (valid for GPU only)

• Issue Type (questions, new requirements, bugs)

• How to reproduce the issue? (This is for bugs. Include which sample app is used, the configuration file contents, the command line used, and other details for reproducing.)

• Requirement details (This is for a new requirement. Include the module name, i.e. which plugin or which sample application, and the function description.)

@fanzh, Thank you for your response. My query is more general and not specific to a particular setup or issue. I am exploring how to optimize inference load balancing using TensorRT across GPUs of different families (e.g., combining an Ampere GPU with a Pascal GPU) in a multi-GPU setup.

To clarify:

This is not related to a specific hardware platform or software version but is a conceptual question.
I would like to understand the best practices or general approach for:
    Deploying a TensorRT model across GPUs of different families.
    Ensuring optimized load balancing (similar to how it's done when GPUs belong to the same family).

If such configurations are not natively supported, I would also appreciate insights or suggestions for alternative strategies.
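For concreteness, one direction I am considering, though I have not verified it, is building one engine per GPU architecture and pointing Triton at them via the cc_model_filenames setting in config.pbtxt; the file names and compute capabilities below are illustrative, assuming GPU 0 is Ampere and GPU 1 is Pascal:

    instance_group [
      {
        count: 1
        kind: KIND_GPU
        gpus: [ 0, 1 ]
      }
    ]
    # Hypothetical mapping: compute capability 8.6 (Ampere) and 6.1 (Pascal)
    # each load an engine built for that architecture.
    default_model_filename: "model_ampere.plan"
    cc_model_filenames [
      {
        key: "8.6"
        value: "model_ampere.plan"
      },
      {
        key: "6.1"
        value: "model_pascal.plan"
      }
    ]

If that does not work for mixed families, I assume the fallback would be two separate models (one engine per family) with traffic split at the client or by an upstream load balancer, but I would appreciate confirmation or a better approach.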

Thank you for your assistance!

This is the DeepStream forum, and Triton questions are outside of DeepStream's scope. I suggest asking in the Triton forum. Thanks!