• Hardware Platform (GPU): RTX 3080 ×2
• DeepStream Version: 6.0.1
• TensorRT Version: 220.127.116.11
• NVIDIA GPU Driver Version: 470.82.01
• Issue Type: bug
We found extra memory usage on GPU 0 (the default device) when running a segmentation model in a multi-GPU application.
• How to reproduce the issue?
We set all the gpu-id settings to 1 and ran:
./deepstream-segmentation-app dstest_segmentation_config_industrial.txt /opt/nvidia/deepstream/deepstream-6.0/samples/streams/sample_industrial.jpg
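For clarity, the change is to point every gpu-id at GPU 1, both in the inference config passed on the command line and in the element properties set by the app. A sketch of the config change, assuming the stock sample layout (only the changed key is shown):

```
# dstest_segmentation_config_industrial.txt (sketch; only the gpu-id key shown)
[property]
gpu-id=1
```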
It is a reproducible bug. The nvinfer plugin is open source, so please use this workaround:
- Modify /opt/nvidia/deepstream/deepstream-6.1/sources/gst-plugins/gst-nvinfer/gstnvinfer.cpp, like this (a sketch of the complete change follows these steps):
static gpointer gst_nvinfer_input_queue_loop (gpointer data)
GstNvInfer *nvinfer = (GstNvInfer *) data;
- Compile, then copy libnvdsgst_infer.so to /opt/nvidia/deepstream/deepstream/lib/gst-plugins (back up the old libnvdsgst_infer.so first).
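For reference, a minimal sketch of the modified function entry in gstnvinfer.cpp; the member name nvinfer->gpu_id is an assumption based on the plugin's gpu-id property, and the rest of the loop body stays unchanged:

```cpp
static gpointer
gst_nvinfer_input_queue_loop (gpointer data)
{
  GstNvInfer *nvinfer = (GstNvInfer *) data;

  /* Added line: make the configured GPU this thread's current CUDA device, so
   * buffers allocated from this thread do not land on GPU 0 by default.
   * (nvinfer->gpu_id is assumed to hold the element's gpu-id property.) */
  cudaSetDevice (nvinfer->gpu_id);

  /* ... the original queue-processing loop continues unchanged ... */
}
```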
@wlzkobe Can you let us know if the above workaround works for your case? Thanks.
No, it still doesn’t work.
We’ve tested this workaround, but the result is the same.
2. compile, then copy libnvdsgst_infer.so to /opt/nvidia/deepstream/deepstream/lib/gst-plugins, backup old libnvdsgst_infer.so first.
We know the correct path, since it’s a GStreamer plugin lib.
The GPU 0 memory usage appears after the program has been running for a while, not right at the start.
- It seems there is no improvement.
- We also used this demo with the same command.
But our SDK version is 6.0.1, since the latest 6.1 needs Ubuntu 20, and this is the source code we modified:
deepstream_segmentation_app.c (13.8 KB)
We are sure this is the nvinfer plugin’s bug. It can easily be reproduced with a gst-launch-1.0 command like this:
```
gst-launch-1.0 rtspsrc location=rtsp://RTSP_RESOURCE latency=200 drop-on-latency=1 ! rtph264depay ! \
nvv4l2decoder gpu-id=1 ! m.sink_0 \
nvstreammux gpu-id=1 name=m batch-size=1 width=1280 height=720 batched-push-timeout=40000 ! \
nvinfer gpu-id=1 config-file-path=INFER_CONFIG_FILE !
```
This is annoying!
Yes, can you try the fix in comment 5?
I just tested the fix, and it works on a 2080 Ti / DeepStream 6.1-dev / NVIDIA driver 510. nvidia-smi reports:
| GPU | PID | Type | Process name | GPU Memory Usage |
| --- | --- | --- | --- | --- |
| 0 | 25172 | C | gst-launch-1.0 | 159 MiB |
| 1 | 25172 | C | gst-launch-1.0 | 817 MiB |
| 1 | 23992 | C | gst-launch-1.0 | 825 MiB |
Could you give a brief explanation? I’ve read through most of the nvinfer code, but I can’t figure out how your one-line fix works.
Thanks for your update. Please refer to the cudaSetDevice explanation in the CUDA Runtime API :: CUDA Toolkit Documentation: we need to set the device as the current device in the gst_nvinfer_input_queue_loop thread.
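To illustrate why that one line matters, here is a minimal standalone sketch (not DeepStream code, just the CUDA runtime behaviour it relies on): the current device is tracked per host thread, so a worker thread that never calls cudaSetDevice() allocates on device 0 no matter what the main thread selected.

```cpp
// Minimal sketch: requires the CUDA runtime and a machine with at least 2 GPUs.
// Each host thread has its own "current device"; until it calls
// cudaSetDevice(), it uses device 0, so its allocations land on GPU 0.
#include <cuda_runtime.h>
#include <cstdio>
#include <thread>

static void worker (int gpu_id)
{
  // Comment this call out and the allocation below lands on GPU 0 -- the same
  // effect the unpatched gst_nvinfer_input_queue_loop thread produced.
  cudaSetDevice (gpu_id);

  void *buf = nullptr;
  cudaMalloc (&buf, 16 << 20);   // 16 MiB on this thread's current device

  int dev = -1;
  cudaGetDevice (&dev);
  std::printf ("worker allocated on GPU %d\n", dev);
  cudaFree (buf);
}

int main ()
{
  cudaSetDevice (1);             // only affects the main thread
  std::thread t (worker, 1);     // the worker must select its own device
  t.join ();
  return 0;
}
```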
This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.