DeepStream NvDCF tracker running on 2 GPUs with nvinferserver

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) GPU (2x Tesla T4)
• DeepStream Version 7.0 and 7.1
• TensorRT Version nvcr.io/nvidia/deepstream:7.1-gc-triton-devel
• NVIDIA GPU Driver Version (valid for GPU only) 535
• Issue Type( questions, new requirements, bugs) bugs
• How to reproduce the issue ?
We have a multi-GPU setup, and the goal is to choose the GPU on which to run the pipeline based on our internal logic. The pipeline uses nvinferserver with gRPC and nvtracker. It works fine in all cases when we set gpu-id=1 for all plugins, but when we use nvtracker with the NvDCF configuration, it occupies memory on both cards, as shown in the screenshot.


It shows two gst-launch-1.0 processes with memory allocated on both GPUs. Here are the steps to reproduce it with the sample apps:

docker run --gpus all -it --rm --net=host --privileged -v /tmp/.X11-unix:/tmp/.X11-unix -w /opt/nvidia/deepstream/deepstream-7.1 nvcr.io/nvidia/deepstream:7.1-gc-triton-devel
./prepare_ds_triton_model_repo.sh
tritonserver --model-repository=samples/triton_model_repo
# update samples/configs/deepstream-app-triton-grpc/config_infer_plan_engine_primary.txt to 
  gpu_ids: [1]
gst-launch-1.0 nvurisrcbin uri=rtsp://0.0.0.0:8554/test gpu-id=1 ! m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720 gpu-id=1 ! nvinferserver config-file-path=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app-triton-grpc/config_infer_plan_engine_primary.txt ! nvtracker tracker-width=640 tracker-height=384 gpu-id=1 ll-lib-file=/opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so ll-config-file=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_tracker_NvDCF_max_perf.yml ! fakesink
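
The duplicated allocation can be confirmed with plain nvidia-smi on the host once the pipeline is running (nothing beyond the default output is needed):

# In another terminal on the host, while the pipeline above is running:
nvidia-smi
# Buggy result: gst-launch-1.0 is listed with memory allocated on both GPU 0 and GPU 1,
# even though every element is configured with gpu-id=1 / gpu_ids: [1].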

• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)
Using the IOU tracker or no tracker eliminates this redundant process. The redundancy is currently affecting our scalability significantly: it increases VRAM requirements and limits our ability to load multiple models. If possible, please confirm the issue and let us know whether any workarounds are available.

Can you try setting the environment variable CUDA_VISIBLE_DEVICES=1?

It works after setting the environment variable CUDA_VISIBLE_DEVICES=1, and only one process is shown. However, this disrupts the full pipeline because the GPU indexing changes, which affects scalability. I need to set the GPU ID to 0 in each plugin and configuration file to make it work, even though it is actually running on GPU 1.

This requires managing the CUDA_VISIBLE_DEVICES environment variable to match the desired GPU before starting any new pipeline, which essentially makes the gpu-id parameter of each DeepStream plugin ineffective.
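
In practice this means launching each pipeline roughly like this (same repro command as above, with every gpu-id rewritten to 0 because only one device is visible inside the process):

# Only GPU 1 is exposed, so inside the process it is enumerated as device 0
# and every gpu-id (and gpu_ids in the nvinferserver config) has to be set to 0.
CUDA_VISIBLE_DEVICES=1 gst-launch-1.0 nvurisrcbin uri=rtsp://0.0.0.0:8554/test gpu-id=0 ! m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720 gpu-id=0 ! nvinferserver config-file-path=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app-triton-grpc/config_infer_plan_engine_primary.txt ! nvtracker tracker-width=640 tracker-height=384 gpu-id=0 ll-lib-file=/opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so ll-config-file=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_tracker_NvDCF_max_perf.yml ! fakesink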

Is there an alternative solution for this? The current approach complicates GPU management for other processes and reduces the usability of the gpu-id parameter.

Can you see the same issue with the simple command line below?

Modify the GPU ID to 1 in: /opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt

$ gst-launch-1.0 nvurisrcbin uri=file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_720p.h264 gpu-id=1 ! m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720 gpu-id=1 ! nvinfer config-file-path=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt ! nvtracker tracker-width=640 tracker-height=384 gpu-id=1 ll-lib-file=/opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so ll-config-file=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_tracker_NvDCF_max_perf.yml ! fakesink sync=1

This pipeline works well and doesn't create any extra process or take extra VRAM on the second GPU.
I also tried other pipeline combinations and gpu-id works fine. For example:
nvinferserver → nvtracker with IOU → nvdsosd → fakesink

It only happens when we use nvinferserver and nvtracker with the NvDCF configuration. It shows multiple processes and takes extra VRAM.
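
For comparison, the IOU case that behaves correctly is the same repro pipeline with only the tracker config swapped (assuming the config_tracker_IOU.yml sample that ships alongside the NvDCF configs; a sketch, not a verified command):

gst-launch-1.0 nvurisrcbin uri=rtsp://0.0.0.0:8554/test gpu-id=1 ! m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720 gpu-id=1 ! nvinferserver config-file-path=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app-triton-grpc/config_infer_plan_engine_primary.txt ! nvtracker tracker-width=640 tracker-height=384 gpu-id=1 ll-lib-file=/opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so ll-config-file=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_tracker_IOU.yml ! fakesink
# With the IOU config only one gst-launch-1.0 process appears and no VRAM is allocated on GPU 0.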

It seems the issue is in tritonserver. Did you modify the GPU ID setting in triton_model_repo/Primary_Detector/config.pbtxt?
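
That is, the instance_group section of the model's config.pbtxt, which would look something like this for GPU 1:

# samples/triton_model_repo/Primary_Detector/config.pbtxt
instance_group [
  {
    count: 1
    kind: KIND_GPU
    gpus: [ 1 ]
  }
]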

Yes, I tried modifying the GPU ID setting in the Triton model config as well, but it shows the same issue: GPU usage is 0%, yet it still takes extra VRAM.
Also, it doesn't seem to be a tritonserver issue, as it works fine with other components like nvdsosd that also use the GPU.