Does nvinferserver support load balancing between two GPUs?

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU): GPU
• DeepStream Version: 5.1
We are using the deepstream-triton image available on NGC and have built a pipeline with the DeepStream nvinferserver plugin. We have noticed that only GPU-0 reaches 100% utilization while GPU-1 sits idle. Doesn't nvinferserver support load balancing?

Please try setting gpu_ids.

https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_plugin_gst-nvinferserver.html#id10
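
For reference, here is a minimal sketch of where gpu_ids sits in the nvinferserver config file. It assumes the DeepStream 5.1 text-protobuf layout with the trt_is backend; the model name and repository path are placeholders and are not taken from this thread.

```
infer_config {
  unique_id: 1
  # Pin pre-processing and inference to a single GPU.
  # The docs describe this field as "single GPU support only".
  gpu_ids: [1]
  max_batch_size: 1
  backend {
    trt_is {
      model_name: "my_model"          # placeholder, replace with your model
      version: -1
      model_repo {
        root: "/path/to/model_repo"   # placeholder path
      }
    }
  }
}
```

With this setting the whole nvinferserver instance would run on GPU-1 instead of GPU-0; it selects a GPU rather than spreading work across both.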

Hello, thanks for the response. The documentation says “Device IDs of GPU to use for pre-processing/inference (single GPU support only)”. Does that mean nvinferserver can run inference on either GPU-0 or GPU-1, but cannot load-balance across them the way Triton does? Is that the correct understanding?

What do you mean by “do load balancing the way Triton is doing”?

NVIDIA Triton Inference Server can share the inference load across multiple GPUs. I have shared an image from a standalone Triton server on which I ran inference on multiple images: GPU-0 is at 28% utilization and GPU-1 is at 30%. [Image: load_balancing]
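
To illustrate what is meant here: in a standalone Triton server, placement across GPUs is controlled per model through the instance_group setting in the model's config.pbtxt. The fragment below is only a sketch of that setting, assuming a two-GPU machine; it is not the configuration used for the screenshot, and whether nvinferserver honors it across GPUs is not established in this thread.

```
# config.pbtxt fragment for a standalone Triton model (illustrative only)
instance_group [
  {
    count: 1          # one execution instance per listed GPU
    kind: KIND_GPU
    gpus: [ 0, 1 ]    # place instances on GPU-0 and GPU-1
  }
]
```

Triton then schedules incoming requests across the instances, which is why both GPUs show utilization in the standalone case.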

But when I run the DeepStream code with nvinferserver, only GPU-0 is utilized and GPU-1 sits idle. If nvinferserver creates its own Triton server instance, why doesn't it do load balancing?