Issue with nvv4l2decoder while using rtsp-reconnect-interval

• Hardware Platform (Jetson / GPU) GPU
• DeepStream Version 7.0
• JetPack Version (valid for Jetson only)
• TensorRT Version 8.6.1
• NVIDIA GPU Driver Version (valid for GPU only) 550.127.05
• Issue Type( questions, new requirements, bugs) Bugs

I am using rtsp-reconnect-interval to reset the RTSP streams. Some of the streams I am using are unstable, so they go down and come back up quite often. I have been running the pipeline for almost a week and it was working fine until today, when I started getting errors like:

gst-core-error-quark: Your GStreamer installation is missing a plug-in. (12),../gst/playback/gstdecodebin2.c(4701): gst_decode_bin_expose (): /GstPipeline:pipeline0/GstBin:source-bin-61/GstDsNvUriSrcBin:uri-decode-bin/GstDecodeBin:decodebin:
no suitable plugins found:
Couldn't set nvv4l2decoder149 to READY:
Could not open resource for reading and writing.
Could not open device '/dev/nvidia0' for reading and writing.
v4l2_calls.c(671): gst_v4l2_open (): /GstPipeline:pipeline0/GstBin:source-bin-61/GstDsNvUriSrcBin:uri-decode-bin/GstDecodeBin:decodebin/nvv4l2decoder:nvv4l2decoder149:
system error: No such file or directory

After this, rtsp-reconnect-interval stops working:

gst-stream-error-quark: Internal data stream error. (1),../gst/rtsp/gstrtspsrc.c(6252): gst_rtspsrc_loop (): /GstPipeline:pipeline0/GstBin:source-bin-42/GstDsNvUriSrcBin:uri-decode-bin/GstRTSPSrc:src:
streaming stopped, reason not-linked (-1)

Is this happening because of too many resets over time, since each time a reset happens a new decoder instance is created?
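For context, rtsp-reconnect-interval is a per-source property on nvurisrcbin. A minimal sketch of how one source might be described from Python; the URI and the 10-second interval are placeholder values, and in a real app this string would be handed to Gst.parse_launch() (or the property set via element.set_property()):

```python
# Sketch: build a per-stream nvurisrcbin description with
# rtsp-reconnect-interval set, so a dropped RTSP stream is retried.
# URI and interval are placeholders, not values from this thread.

def source_bin_desc(uri: str, reconnect_interval_s: int = 10) -> str:
    # nvurisrcbin exposes rtsp-reconnect-interval (seconds); 0 disables it.
    return (
        f"nvurisrcbin uri={uri} "
        f"rtsp-reconnect-interval={reconnect_interval_s}"
    )

desc = source_bin_desc("rtsp://camera.example/stream", 10)
print(desc)
# In the real pipeline this description would go to Gst.parse_launch(),
# or the properties would be set with element.set_property(...).
```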

  1. Are you testing in docker or on the host?
  2. Are you testing deepstream-app? Could you share the configuration file?
  3. Could you share the DeepStream app log? I am wondering how many reconnections happen before the app runs into the issue.
  1. It is running in docker (nvcr.io/nvidia/deepstream:7.0-triton-multiarch).
  2. No, I’m running custom Python code with around 40 RTSP streams. The pipeline was working before implementing rtsp-reconnect-interval.
  3. Almost 148 reconnections happened. If you look at the error, it names nvv4l2decoder149 — each time the connection resets, a new decoder instance is created: nvv4l2decoder1, nvv4l2decoder2, nvv4l2decoder3, ...

@fanzh hey, could you provide any guidance on addressing this issue? What are the possible causes?

Which sample are you referring to? What is the media pipeline? Can you use a DeepStream sample to reproduce this issue? For example, if you restart the RTSP camera every minute, can the “Could not open device ‘/dev/nvidia0’” issue be reproduced?

@fanzh
nvurisrcbin --> streammux --> nvinfer --> fakesink

I’ll try to replicate this with the sample app, but I only faced this issue once while running for almost a week with 40 RTSP streams, some of which are unstable.
I tried resetting the RTSP cameras every few minutes for almost 2 days, but the issue was not reproduced.

Based on the error, do you have any idea what it could be related to? Something with the GPU or the system? Also, what would be the best way to recover from this?
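The topology above (nvurisrcbin → streammux → nvinfer → fakesink) can be sketched as a gst-launch-style description for reference. This is only an illustration: batch size, resolution, timeout, and the inference config path are placeholder values, not the poster's actual settings:

```python
# Sketch of the reported topology as a gst-launch-1.0 style string:
# nvurisrcbin -> nvstreammux -> nvinfer -> fakesink.
# All property values below are placeholders.

def pipeline_desc(uris, infer_config="config_infer.txt"):
    muxer = (
        "nvstreammux name=mux batch-size={n} width=1280 height=720 "
        "batched-push-timeout=40000"
    ).format(n=len(uris))
    # One nvurisrcbin per camera, each linked to a muxer sink pad.
    sources = " ".join(
        f"nvurisrcbin uri={u} rtsp-reconnect-interval=10 ! mux.sink_{i}"
        for i, u in enumerate(uris)
    )
    return (
        f"{sources} {muxer} ! "
        f"nvinfer config-file-path={infer_config} ! fakesink"
    )

print(pipeline_desc(["rtsp://cam1/stream", "rtsp://cam2/stream"]))
```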

Please refer to this table. First, the component versions need to meet the requirements.

Is this still a DeepStream issue to support? Thanks! What is the dGPU device model? Can you share the log generated with “sudo nvidia-bug-report.sh” when the issue happens?

@fanzh My NVIDIA driver was using the 550 family, but I saw that the supported version was the 535 family. I have changed it, and testing is ongoing.

Where can I find the file nvidia-bug-report-tegra.sh?

Noticing you are using a dGPU: nvidia-bug-report-tegra.sh is the Jetson script, so please use nvidia-bug-report.sh instead.

nvidia-bug-report.log (27.3 KB)
@fanzh It is still happening with NVIDIA driver 535.x as well. Please find the attached report log.

@fanzh It feels like an issue with the NVIDIA GPU/driver, because nvidia-smi was not working inside the docker container (Failed to initialize NVML: Unknown Error) but works fine on the host.

One thing I found is that NVIDIA persistence mode was disabled by default. I have enabled it now and restarted the pipeline. It is possible that the GPU went idle, and during a reset nvv4l2decoder was unable to access it.

Thanks for the update. “Failed to initialize NVML: Unknown Error” should be related to a driver/library version mismatch. To narrow down this issue, can you please install the R535.161 driver according to the compatibility table?

Sorry for the late reply. Is this still a DeepStream issue to support? Thanks!
Could you share the docker start command line? After starting the container, does “Failed to initialize NVML” appear at the beginning, or only after running the app for a long time?

@fanzh “Failed to initialize NVML” is only appearing after running the app for a long time. At the beginning it is working fine.

  1. What is the GPU model? Could you get more logs with "dmesg | grep NVRM" (taking this link as an example)? To narrow down this issue, did you try the R535.161 driver? Thanks!
  2. If you suspect that “resetting nvv4l2decoder” causes the error, you can restart the camera frequently, or try other methods to simulate frequent RTSP reconnections.
  3. To narrow down this issue, could you use the DeepStream sample deepstream-app to reproduce it?

The issue occurred because NVIDIA persistence mode was disabled by default. After enabling it, everything is working fine.
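For anyone hitting the same symptom: persistence mode can be enabled with `sudo nvidia-smi -pm 1`. A small sketch that checks the current state by parsing `nvidia-smi -q` output; the "Persistence Mode : Enabled" field format matches what nvidia-smi prints, but treat the exact layout as an assumption and verify against your driver version:

```python
# Sketch: check whether NVIDIA persistence mode is enabled by parsing
# `nvidia-smi -q` output, and print the command to enable it if not.
# The exact field layout of nvidia-smi output is an assumption here.
import subprocess

def persistence_enabled(query_output: str) -> bool:
    # nvidia-smi -q prints a line like "Persistence Mode : Enabled"
    for line in query_output.splitlines():
        if "Persistence Mode" in line:
            return line.split(":", 1)[1].strip().lower() == "enabled"
    return False

def check_gpu():
    out = subprocess.run(
        ["nvidia-smi", "-q"], capture_output=True, text=True
    ).stdout
    if not persistence_enabled(out):
        print("Persistence mode is off; enable it with: sudo nvidia-smi -pm 1")

# Parsing demo on a captured sample line (no GPU needed):
sample = "    Persistence Mode                       : Enabled"
print(persistence_enabled(sample))  # True
```

Enabling persistence mode keeps the driver loaded even when no clients hold the GPU, which matches the failure pattern here: an idle GPU whose device node briefly cannot be opened during a decoder reset.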

Thanks for sharing!

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.