When using nvinferserver on a multi-GPU server, configuring gpu_ids to anything other than [0] still allocates memory on GPU ID 0, in addition to the specified GPU.
• Hardware Platform dGPU
• DeepStream Version 6.2-triton docker
• NVIDIA GPU Driver Version 525.105.17
• Issue Type bug
• How to reproduce the issue ?
Reproduce with the following pipeline on a server with e.g. 4 GPUs, inside the DeepStream 6.2-triton docker:
export USE_NEW_NVSTREAMMUX=yes
export VIDEO=/opt/nvidia/deepstream/deepstream/samples/streams/sample_720p.h264
gst-launch-1.0 filesrc location=$VIDEO ! h264parse ! nvv4l2decoder gpu-id=2 ! mux.sink_0 nvstreammux name=mux ! nvinferserver config-file-path="./config_triton_grpc_infer.txt" ! fakesink
Notice that nvv4l2decoder gpu-id=2 sets decoding to GPU ID 2. Set gpu_ids: [2] in config_triton_grpc_infer.txt to match that.
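For reference, here is a minimal sketch of the relevant part of config_triton_grpc_infer.txt. It assumes the file is derived from the sample Triton gRPC configs shipped with the 6.2-triton docker; model_name, url and max_batch_size are placeholders, and only the gpu_ids line matters for this report:

infer_config {
  unique_id: 1
  gpu_ids: [2]                  # must match the gpu-id of the upstream elements
  max_batch_size: 1
  backend {
    triton {
      model_name: "your_model"  # placeholder
      version: -1
      grpc {
        url: "localhost:8001"   # placeholder Triton gRPC endpoint
      }
    }
  }
}
input_control {
  process_mode: PROCESS_MODE_FULL_FRAME
}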
This results in 144 MB allocated on GPU 2, but an additional 102 MB on GPU 0.
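The per-GPU usage can be checked while the pipeline is running with a standard nvidia-smi query, e.g.:

nvidia-smi --query-gpu=index,memory.used --format=csv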
The same happens for nvv4l2decoder gpu-id=1 and nvinferserver gpu_ids: [1]: memory is allocated on GPU 1, plus the additional allocation on GPU 0.
For nvv4l2decoder gpu-id=0 and nvinferserver gpu_ids: [0], no additional memory is allocated, since everything already runs on GPU 0.
For nvv4l2decoder gpu-id=2 and nvinferserver gpu_ids: [1], the pipeline fails:
0:00:00.653846585 2030 0x55d37a57e980 WARN nvinferserver gstnvinferserver.cpp:628:gst_nvinfer_server_submit_input_buffer:<nvinferserver0> error: Memory Compatibility Error:Input surface gpu-id doesn't match with configured gpu-id for element, please allocate input using unified memory, or use same gpu-ids OR, if same gpu-ids are used ensure appropriate Cuda memories are used
0:00:00.653882336 2030 0x55d37a57e980 WARN nvinferserver gstnvinferserver.cpp:628:gst_nvinfer_server_submit_input_buffer:<nvinferserver0> error: surface-gpu-id=2,nvinferserver0-
[ERROR push 333] push failed [-5]
This is an expected outcome, because the buffers are on GPU 2 while nvinferserver runs on GPU 1.
Crosschecking that nvinferserver is the plugin that allocates the additional memory on GPU 0 by omitting it from the pipeline:
gst-launch-1.0 filesrc location=$VIDEO ! h264parse ! nvv4l2decoder gpu-id=3 ! mux.sink_0 nvstreammux name=mux ! fakesink
This only allocates memory on GPU ID 3.
• Requirement details
Specifying gpu_ids should not additionally allocate memory on GPU ID 0. Buffers should reside on the specified GPU, and processing should take place only on that GPU.
If possible, please confirm the issue and provide workarounds for DeepStream 6.2.
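In the meantime, one idea we have not verified (an assumption, not a confirmed workaround) is to hide all but the target GPU from the GStreamer process via CUDA_VISIBLE_DEVICES, so that the only visible device enumerates as ID 0 and any implicit allocation on device 0 lands on the intended card. This only helps if a single GPU per process is acceptable, and it requires gpu-id=0 in the pipeline and gpu_ids: [0] in the config:

# Hypothetical mitigation, not verified with nvinferserver in gRPC mode
export CUDA_VISIBLE_DEVICES=2   # physical GPU 2 becomes the only visible device (ID 0)
gst-launch-1.0 filesrc location=$VIDEO ! h264parse ! nvv4l2decoder gpu-id=0 ! mux.sink_0 nvstreammux name=mux ! nvinferserver config-file-path="./config_triton_grpc_infer.txt" ! fakesink
# with gpu_ids: [0] set in config_triton_grpc_infer.txt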