Jetson AGX Orin (DeepStream 7.1) – Smart Record / custom recording triggers NvBufSurface CUDA faults and NVENC crashes with 13 RTSP sources

  • Hardware Platform: Jetson AGX Orin 64 GB

  • JetPack Version: JetPack 6.2.1 (L4T 36.4.7)

  • DeepStream Version: DeepStream 7.1

  • TensorRT Version: 10.7.0

  • Issue Type: Bug report / help request for stability when using either a custom recording plugin or NVIDIA Smart Record across multiple sources.

App:

Custom C++ DeepStream app based on gst_parse_launch. Pipeline structure (13 RTSP cameras):

nvurisrcbin (per camera)
→ queue → tee
tee → nvstreammux → nvinfer → nvtracker → nvdsosd → display
tee → per-source smart-record branch (see below)

Recording paths tried:

  1. Home-grown recordplugin (GstBaseTransform) that deep-copies the incoming GstBuffer, feeds an appsrc-based H.264 MP4 pipeline, and generates first/last snapshots from the same recorded video.

  2. NVIDIA Smart Record via NvDsSRContext. Each source taps off the post-OSD tee and runs its own encoder → parser → NvDsSR recordbin.

Issue A: Custom Record Plugin - CUDA Illegal Memory Access & System Reboot

Repro:

  1. Each detection triggers recordplugin (signals record-start / record-stop), launching its internal appsrc→nvvideoconvert→nvv4l2h264enc→mp4mux→filesink pipeline and a separate snapshot pipeline for JPEGs.

  2. After a few start/stop cycles across 13 sources, logs show:

    /dvs/git/…/nvbufsurftransform_copy.cpp:438: => Failed in mem copy
    libnvosd … Unable to map EGL Image
    cudaErrorIllegalAddress … nvll_osd/memory.hpp:59
    Segmentation fault (core dumped) build/app

  3. Kernel log (kernel.log) contains GPU faults (e.g., nvgpu … source enable violation, cudaErrorIllegalAddress) followed by a watchdog-triggered reboot.

Hypothesis: deep-copying NVMM buffers in the plugin, then pushing them through additional GPU elements (nvvideoconvert, nvjpegenc) races with DeepStream’s buffer lifecycle, leading to stale pointers and illegal device access.
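
For reference, the alternative we are considering is to stage each recorded frame into plain CPU memory before it ever reaches the detached appsrc pipeline, roughly like the sketch below (simplified; copy_nvmm_frame_to_cpu is an illustrative name, and it assumes a single-plane RGBA surface at batch index 0). Whether this is the recommended pattern is essentially question 1 under "What We Need".

    #include <cstring>
    #include <gst/gst.h>
    #include "nvbufsurface.h"

    // Sketch: copy one NVMM frame into host memory owned by the recording
    // pipeline, so the detached appsrc branch never touches the DeepStream
    // buffer pool. Assumes a single-plane (e.g. RGBA) surface at index 0.
    static GstBuffer *
    copy_nvmm_frame_to_cpu (GstBuffer *inbuf)
    {
      GstMapInfo in_map;
      if (!gst_buffer_map (inbuf, &in_map, GST_MAP_READ))
        return NULL;
      NvBufSurface *surf = reinterpret_cast<NvBufSurface *> (in_map.data);

      if (NvBufSurfaceMap (surf, 0, 0, NVBUF_MAP_READ) != 0) {
        gst_buffer_unmap (inbuf, &in_map);
        return NULL;
      }
      NvBufSurfaceSyncForCpu (surf, 0, 0);

      NvBufSurfaceParams *p = &surf->surfaceList[0];
      gsize size = (gsize) p->planeParams.pitch[0] * p->planeParams.height[0];

      // Host-memory copy owned by the recording pipeline, independent of the
      // DeepStream pool.
      GstBuffer *outbuf = gst_buffer_new_allocate (NULL, size, NULL);
      GstMapInfo out_map;
      gst_buffer_map (outbuf, &out_map, GST_MAP_WRITE);
      std::memcpy (out_map.data, p->mappedAddr.addr[0], size);
      gst_buffer_unmap (outbuf, &out_map);

      NvBufSurfaceUnMap (surf, 0, 0);
      gst_buffer_unmap (inbuf, &in_map);
      return outbuf;   /* pushed into the appsrc of the MP4 pipeline */
    }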

Issue B: Smart Record (NvDsSR) Exhausts NVENC and Crashes

After removing the custom plugin, Smart Record was integrated per NVIDIA docs:

demux.src_i → queue → nvvideoconvert → nvdsosd → tee_rec_i
tee_rec_i → queue → nvvideoconvert → nvv4l2h264enc → h264parse → NvDsSR recordbin

===== NvVideo: NVENC =====
NVMEDIA: Need to set EMC bandwidth …
NVMAP_IOC_GET_FD failed: Bad address
nvbufsurface: Failed to create EGLImage
libnvosd … Unable to map EGL Image
/dvs/git/…/nvbufsurftransform.cpp: => NvVicCompose Failed
gstnvtracker: NvBufSurfTransform failed with error -2 …
nvinfer error: Internal data stream error (streaming stopped, reason error (-5))

After the NVENC errors, the pipeline collapses and the system may reboot. Each Smart Record branch creates its own nvv4l2h264enc, so we simply exceed the Orin’s encoder capacity.

Workaround: switch the Smart Record branch to CPU encoding (nvvideoconvert → videoconvert → x264enc → h264parse → NvDsSR). That avoids NVENC, but running x264 on 13 HD streams pushes CPU usage very high, so it's not a viable long-term solution.
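
For completeness, the CPU-encoding branch is built roughly like this (simplified sketch; pipeline, tee_rec_0 and sr_ctx, the per-source NvDsSRContext returned by NvDsSRCreate, are placeholders for our actual variables):

    /* Sketch of the CPU-encoding Smart Record branch for one source.
     * tee_rec_0 is the post-OSD tee for that source, sr_ctx the per-source
     * NvDsSRContext from NvDsSRCreate(); names are illustrative. */
    GstElement *conv  = gst_element_factory_make ("nvvideoconvert", "sr_conv_0");
    GstElement *conv2 = gst_element_factory_make ("videoconvert",   "sr_cpuconv_0");
    GstElement *enc   = gst_element_factory_make ("x264enc",        "sr_enc_0");
    GstElement *parse = gst_element_factory_make ("h264parse",      "sr_parse_0");

    /* Keep the software encoder as light as possible; it is still heavy
     * when 13 HD streams record at once. */
    g_object_set (enc, "speed-preset", 1 /* ultrafast */,
                       "tune", 4 /* zerolatency */, NULL);

    gst_bin_add_many (GST_BIN (pipeline), conv, conv2, enc, parse,
                      sr_ctx->recordbin, NULL);
    gst_element_link_many (tee_rec_0, conv, conv2, enc, parse,
                           sr_ctx->recordbin, NULL);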

Full log excerpt attached (smart_record_error.log)

libnvosd (1386):(ERROR) : cuGraphicsEGLRegisterImage failed : 1
nvbufsurface: Failed to create EGLImage.
/dvs/git/…/nvbufsurftransform.cpp:4814: => NvVicCompose Failed
gstnvtracker: NvBufSurfTransform failed with error -2 …
nvinfer error: Internal data stream error.

What We Need

  1. Guidance for custom record plugin:

    • How can we safely clone or share surfaces for out-of-band pipelines without triggering cudaErrorIllegalAddress? Is there a recommended way to convert DeepStream buffers to CPU memory for appsrc pipelines without racing the pool?
  2. Smart Record across many sources:

    • Best practice for using Smart Record when ~13 cameras may record simultaneously. Can NvDsSR reuse already encoded frames (e.g., from nvstreammux) so we don’t need a separate encoder per source?

    • Is there a supported way to mux the demuxed streams into Smart Record without instantiating nvv4l2h264enc for every source?

  3. General robustness:

    • Any DeepStream or Jetson settings we can tweak (memory pool sizes, NvBufSurface cache settings, etc.) to keep the pipeline stable during frequent record start/stop sequences?

Recordings are triggered whenever nvinfer detects the target class; each trigger kicks off Smart Record (or previously, our custom recordplugin).
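
For context, the trigger itself is a buffer probe downstream of nvinfer/nvtracker, roughly like the sketch below (simplified; TARGET_CLASS_ID, SR_DURATION_SEC and get_sr_context() are placeholders for our app's own lookup of the per-source NvDsSRContext):

    #include <gst/gst.h>
    #include "gstnvdsmeta.h"
    #include "gst-nvdssr.h"

    /* Sketch of the recording trigger: a buffer probe after nvinfer/nvtracker
     * that starts a Smart Record session when the target class is detected.
     * TARGET_CLASS_ID, SR_DURATION_SEC and get_sr_context() are placeholders. */
    static GstPadProbeReturn
    detection_probe (GstPad *pad, GstPadProbeInfo *info, gpointer user_data)
    {
      GstBuffer *buf = GST_PAD_PROBE_INFO_BUFFER (info);
      NvDsBatchMeta *batch_meta = gst_buffer_get_nvds_batch_meta (buf);
      if (!batch_meta)
        return GST_PAD_PROBE_OK;

      for (NvDsMetaList *l_frame = batch_meta->frame_meta_list; l_frame;
           l_frame = l_frame->next) {
        NvDsFrameMeta *frame_meta = (NvDsFrameMeta *) l_frame->data;

        for (NvDsMetaList *l_obj = frame_meta->obj_meta_list; l_obj;
             l_obj = l_obj->next) {
          NvDsObjectMeta *obj_meta = (NvDsObjectMeta *) l_obj->data;
          if (obj_meta->class_id != TARGET_CLASS_ID)
            continue;

          NvDsSRContext *ctx = get_sr_context (frame_meta->source_id);
          if (ctx && !ctx->recordOn) {
            NvDsSRSessionId session_id = 0;
            /* no look-back, record the next SR_DURATION_SEC seconds */
            NvDsSRStart (ctx, &session_id, 0, SR_DURATION_SEC, NULL);
          }
          break;
        }
      }
      return GST_PAD_PROBE_OK;
    }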

Any insights or recommended pipeline adjustments from the DeepStream team would be greatly appreciated. Happy to share additional logs or code snippets if needed.

Attached are all the logs/errors I have encountered so far.

cudaErrorIllegalAddress.log (9.4 KB)

kernal.log (73.0 KB)

recording.log (1.2 KB)

smart_record_error.log (14.3 KB)

  1. What is the frequency of the recording? E.g., recording 3×10 s videos per hour per RTSP source.

  2. For Issue A, we don’t know the details of the implementation. From your description, the decoded frames from the camera are fed into the appsrc recording pipeline, right? Deep-copying NVMM buffers may not work.

  3. For Issue B, the recording happens after nvstreamdemux; that is a different pipeline and data path from the app you described at the beginning. Can you provide the complete pipeline and detailed configurations?

There is an IPC NvBufSurface sharing sample in
/opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/deepstream-ipc-test; it is for the Jetson platform only.

Does this mean you never encountered any problems with fewer than 13 cameras?

What do you mean by this? nvstreammux never encodes frames.

Is there any evidence showing the pipeline failure is related to memory, cache, etc.?

Thank you, Fiona, for your response.

Here are the answers to the questions you asked:

  1. Each RTSP source typically records 2-4 ten-second clips per minute during peak load (bursts of back-to-back detections), so a single camera might record ~120-130 clips/hour when busy. All 13 sources can spike at once because detections from the model trigger the recording events. Simply put, when there’s a detection, recording starts for the next 10 seconds and the clip is stored to a directory.

  2. GStreamer/DeepStream pipeline for Issue A (this pipeline is for just one URI; I have 12 more like it):

    nvurisrcbin name=src_0 uri=rtsp://… latency=200 drop-on-latency=false select-rtp-protocol=4 rtsp-reconnect-interval=5 rtsp-reconnect-attempts=-1 ! queue name=qsrc_0 max-size-time=200000000 max-size-bytes=0 max-size-buffers=0 leaky=downstream ! tee name=tee_0
    tee_0. ! queue name=qstream_0 max-size-time=200000000 max-size-bytes=0 max-size-buffers=0 leaky=downstream ! mux.sink_0
    nvstreammux name=mux batch-size=1 width=1920 height=1080 enable-padding=1 live-source=1 sync-inputs=0 batched-push-timeout=25000 buffer-pool-size=4 attach-sys-ts=1
    ! nvinfer name=infer config-file-path=configs/config_pv_d7.1.yml batch-size=1
    ! nvtracker name=tracker ll-lib-file=/opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so ll-config-file=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_tracker_NvDCF_max_perf.yml
    ! tee name=t
    t. ! queue name=qtiler max-size-time=200000000 max-size-bytes=0 max-size-buffers=0 leaky=downstream ! nvmultistreamtiler name=tiler width=1920 height=1080 ! nvvideoconvert name=conv_tile ! nvdsosd name=osd ! nveglglessink name=sink sync=false
    t. ! queue name=qdemux max-size-time=200000000 max-size-bytes=0 max-size-buffers=0 leaky=downstream ! nvstreamdemux name=demux
    demux.src_0 ! queue name=qrecord_0 max-size-time=200000000 max-size-bytes=0 max-size-buffers=0 leaky=downstream ! nvvideoconvert name=conv_0 ! nvdsosd name=osd_0 ! recordplugin name=rec_0 ! fakesink name=sink_0 sync=false async=false
    

    The recordplugin is a custom version of this plugin.

  3. GStreamer/DeepStream pipeline for Issue B:

    nvurisrcbin name=src_0 uri=rtsp://… latency=200 drop-on-latency=false drop-frame-interval=1 select-rtp-protocol=4 rtsp-reconnect-interval=5 rtsp-reconnect-attempts=-1 smart-record=2 smart-rec-dir-path="videos" smart-rec-file-prefix="recording" smart-rec-cache=6 smart-rec-container=0 smart-rec-mode=1 smart-rec-default-duration=15 ! queue name=qsrc_0 max-size-time=200000000 max-size-bytes=0 max-size-buffers=0 leaky=downstream ! tee name=tee_0
    tee_0. ! queue name=qstream_0 max-size-time=200000000 max-size-bytes=0 max-size-buffers=0 leaky=downstream ! mux.sink_0
    nvstreammux name=mux batch-size=1 width=1920 height=1080 enable-padding=1 live-source=1 sync-inputs=0 batched-push-timeout=25000 buffer-pool-size=4 attach-sys-ts=1
    ! nvinfer name=infer config-file-path=config.yml batch-size=1
    ! nvtracker name=tracker ll-lib-file=/opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so ll-config-file=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_tracker_NvDCF_max_perf.yml
    ! tee name=t
    t. ! queue name=qtiler max-size-time=200000000 max-size-bytes=0 max-size-buffers=0 leaky=downstream ! nvmultistreamtiler name=tiler width=1920 height=1080 ! nvvideoconvert name=conv_tile ! nvdsosd name=osd ! nveglglessink name=sink sync=false
    t. ! queue name=qdemux max-size-time=200000000 max-size-bytes=0 max-size-buffers=0 leaky=downstream ! nvstreamdemux name=demux
    demux.src_0 ! queue name=qrecord_0 max-size-time=200000000 max-size-bytes=0 max-size-buffers=0 leaky=downstream ! nvvideoconvert name=conv_0 ! nvdsosd name=osd_0 ! queue name=qdrain_0 max-size-time=200000000 max-size-bytes=0 max-size-buffers=0 leaky=downstream ! fakesink name=sink_0 sync=false async=false
    
  4. Thank you for letting me know about the deepstream-ipc-test app; I will check it out today.

  5. a. No, the problem exists for any number of cameras ranging from 2 to 16 (I’m not certain whether I faced issues with only 1 camera, but I will check and let you know).
    b. And apologies, I phrased that poorly. I meant tapping the encoded bitstream coming from each nvurisrcbin (before decode) so Smart Record can reuse it, instead of decoding to raw and then re-encoding. I now understand nvstreammux doesn’t encode; the idea is to intercept the bitstream upstream, similar to what deepstream-app does in deepstream_source_bin.c (see the sketch after this list).

  6. We don’t have hard proof that it’s a pool-size problem. The errors show EGL/NvBufSurface failures and CUDA illegal access, which could simply stem from a stale buffer. We mentioned memory/cache tweaks mainly because the crash logs are full of “nvbufsurface: Failed to create EGLImage” and “NVMAP_IOC_GET_FD failed”, but we don’t have metrics showing we exhausted memory.
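
To make 5b concrete, what I had in mind is something like the following source-bin sketch, where the still-encoded H.264 is tee’d once to the NvDsSR recordbin and once to the decoder (a rough outline only; source_bin and sr_ctx from NvDsSRCreate are placeholders, and this is modeled on how deepstream-app taps the pre-decode stream in deepstream_source_bin.c rather than tested code):

    /* Rough sketch: tap the still-encoded RTSP bitstream before decode, so
     * Smart Record needs no second encoder per source. source_bin and sr_ctx
     * are placeholders; rtspsrc is linked to the depayloader from its
     * "pad-added" callback, which is omitted here. */
    GstElement *depay   = gst_element_factory_make ("rtph264depay", "depay_0");
    GstElement *enc_tee = gst_element_factory_make ("tee",          "enc_tee_0");
    GstElement *parse_d = gst_element_factory_make ("h264parse",    "parse_dec_0");
    GstElement *dec     = gst_element_factory_make ("nvv4l2decoder","dec_0");
    GstElement *sr_q    = gst_element_factory_make ("queue",        "sr_q_0");
    GstElement *parse_r = gst_element_factory_make ("h264parse",    "parse_rec_0");

    gst_bin_add_many (GST_BIN (source_bin), depay, enc_tee, parse_d, dec,
                      sr_q, parse_r, sr_ctx->recordbin, NULL);

    gst_element_link (depay, enc_tee);
    /* Branch 1: decode for nvstreammux / inference / display. */
    gst_element_link_many (enc_tee, parse_d, dec, NULL);
    /* Branch 2: encoded stream straight into the Smart Record bin. */
    gst_element_link_many (enc_tee, sr_q, parse_r, sr_ctx->recordbin, NULL);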

The smart recording function is already integrated into nvurisrcbin, so you don’t need to add it elsewhere in the pipeline. There is also a sample for triggering “sr-start” and “sr-stop”: NVMultiurisrcbin with smart record - Intelligent Video Analytics / DeepStream SDK - NVIDIA Developer Forums.

You can also refer to the source code of nvurisrcbin for how the smart recording works inside it: /opt/nvidia/deepstream/deepstream/sources/gst-plugins/gst-nvurisrcbin

The nvurisrcbin documentation may also help: Gst-nvurisrcbin — DeepStream documentation
NVIDIA DeepStream SDK API Reference: _GstDsNvUriSrcBinClass Struct Reference | NVIDIA Docs

For DeepStream 7.1 GA on JetPack 6.2.1, please also refer to DeepStream SDK FAQ - Intelligent Video Analytics / DeepStream SDK - NVIDIA Developer Forums.


Thank you for the links @Fiona.Chen , I’ll definitely go through them.

We tried both smart record approaches (the custom implementation and the one included with nvurisrcbin), but both resulted in crashes. Additionally, we need to record videos with bounding boxes, so the built-in smart record option from nvurisrcbin isn’t very useful for our case, as it records only the raw video without overlays (based on what I found in the documentation).

I also tested the same code on a Jetson AGX Thor, and the device did not crash or restart there.

The nvvideoconvert issue has been resolved in DeepStream 8.0 GA, so it will not cause any problems on AGX Thor, which only works with DeepStream 8.0 GA.


There are nvvideoconvert instances inside nvurisrcbin and in your pipeline, so the nvvideoconvert issue should be resolved first.


Yes, you’re right.

The issue might be related to NvVideoConvert, but it could also stem from NvOSD or NvTiler.

I’m recording video with detection bounding boxes using my custom plugin. When I send the output to the display using nvosd + nvtiler, I encounter CUDA Illegal Memory Access errors. However, when I direct the output to a fakesink, the errors don’t occur. My theory is that the issue arises from NvOSD or NvTiler, although I don’t yet have solid evidence to confirm it.

I will gather additional logs and share them here for further analysis. Meanwhile, I’ll review all the links you provided and try the deepstream-ipc-test sample.

Do you mean the “nvinfer->nvmultistreamtiler->nvvideoconvert->nvdsosd->nv3dsink” pipeline failed while the “nvinfer->fakesink” pipeline succeeded?

Hi @aashish:

The nvvideoconvert-caused “nvbufsurftransform_copy.cpp: Failed in mem copy” error with JetPack 6.2.x is a known issue; we need to use the workaround in Troubleshooting — DeepStream documentation.

For nvurisrcbin, the nvvideoconvert inside it should also be modified to apply the workaround.
diff.txt (1.0 KB)
The attached patch can be applied to /opt/nvidia/deepstream/deepstream/sources/gst-plugins/gst-nvurisrcbin/gstdsnvurisrcbin.cpp. The plugin can then be rebuilt and the generated "libnvdsgst_nvurisrcbin.so" used to replace the library in /opt/nvidia/deepstream/deepstream/lib/gst-plugins/.


Thank you, Fiona, for providing the patch. I was able to apply it to DeepStream 7.1, and I’m currently stress testing to verify whether the error recurs. I’ll keep you updated on the results after testing the application on the Jetson device for at least 24 hours.

During my 24-hour stress test on Jetson Thor, the application crashed only once, with the same cudaErrorIllegalAddress, but this occurred with fakesink.

I’m still unsure of the root cause, but if you’re open to it, I can apply the patch you provided on DeepStream 8.0 as well to see if it makes a difference.

Summary:

  1. The Jetson AGX Orin crashes approximately every 30 minutes, whether or not fakesink is used.
  2. The Jetson Thor crashed only once, with fakesink, and instead of a device restart I encountered a segmentation fault.

Please do not use the patch with DeepStream 8.0 because the hardware and software are all different between DeepStream 7.1 and DeepStream 8.0.

Please add the "copy-hw=2" property to all nvvideoconvert plugins used in your app. Then you can apply the patch I gave for nvurisrcbin.
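
For example (an illustrative snippet only; the property can be set either in the launch string or on an existing element):

    /* Illustrative only: setting copy-hw=2 on nvvideoconvert. */
    GError *err = NULL;
    GstElement *pipeline = gst_parse_launch (
        "videotestsrc ! nvvideoconvert name=conv_0 copy-hw=2 ! fakesink", &err);

    /* Equivalent when the element is created or fetched programmatically: */
    GstElement *conv = gst_bin_get_by_name (GST_BIN (pipeline), "conv_0");
    g_object_set (conv, "copy-hw", 2, NULL);
    gst_object_unref (conv);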

Which pipeline do you use on DeepStream 8.0 + AGX Thor?

Ah okay. I won’t be using this patch anywhere on DeepStream 8.0 JetPack 7 (Thor) then. Thank you for the info!

We are using the same pipeline for both DeepStream 7.1 (AGX Orin) and DeepStream 8.0 (Jetson Thor).

Yes, we have added copy-hw=2 everywhere we use nvvideoconvert in our pipeline on DeepStream 7.1 (AGX Orin), as instructed.

Hi @Fiona.Chen

Even after applying the patch, the device (AGX Orin) still crashed. The only log I was able to collect is:

[2025-11-27 02:11:24] topic=infer payload={"debug":"/dvs/git/dirty/git-master_linux/deepstream/sdk/src/gst-plugins/gst-nvinfer/gstnvinfer.cpp(1420): gst_nvinfer_input_queue_loop (): /GstPipeline:pipeline0/GstNvInfer:infer","error":"Failed to queue input batch for inferencing"}

Earlier it used to crash every 30 minutes to 1 hour, but this time it crashed much later, after about 4 hours of stress testing.

I think there is another plugin that is poisoning the entire DeepStream pipeline.

I’ve run the following pipeline for more than 4 hours on AGX Orin with JetPack 6.2.1 + DeepStream 7.1, and it works well.

gst-launch-1.0 nvurisrcbin name=src_0 uri=rtsp://xxxxxx latency=200 drop-on-latency=false drop-frame-interval=1 rtsp-reconnect-interval=5 rtsp-reconnect-attempts=-1 smart-record=2 smart-rec-dir-path="videos" smart-rec-file-prefix="recording" smart-rec-cache=6 smart-rec-container=0 smart-rec-mode=1 smart-rec-default-duration=15 ! queue name=qsrc_0 max-size-time=200000000 max-size-bytes=0 max-size-buffers=0 leaky=downstream ! tee name=tee_0 \
tee_0. ! queue name=qstream_0 max-size-time=200000000 max-size-bytes=0 max-size-buffers=0 leaky=downstream ! mux.sink_0 \
nvstreammux name=mux batch-size=1 width=1920 height=1080 enable-padding=1 live-source=1 sync-inputs=0 batched-push-timeout=40000 buffer-pool-size=4 attach-sys-ts=1 \
! nvinfer name=infer config-file-path=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.yml batch-size=1 \
! nvtracker name=tracker ll-lib-file=/opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so ll-config-file=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_tracker_NvDCF_max_perf.yml \
! tee name=t \
t. ! queue name=qtiler max-size-time=200000000 max-size-bytes=0 max-size-buffers=0 leaky=downstream ! nvmultistreamtiler name=tiler width=1920 height=1080 ! nvvideoconvert copy-hw=2 name=conv_tile ! nvdsosd name=osd ! nveglglessink name=sink sync=false \
t. ! queue name=qdemux max-size-time=200000000 max-size-bytes=0 max-size-buffers=0 leaky=downstream ! nvstreamdemux name=demux \
demux.src_0 ! queue name=qrecord_0 max-size-time=200000000 max-size-bytes=0 max-size-buffers=0 leaky=downstream ! nvvideoconvert copy-hw=2 name=conv_0 ! nvdsosd name=osd_0 ! queue name=qdrain_0 max-size-time=200000000 max-size-bytes=0 max-size-buffers=0 leaky=downstream ! fakesink name=sink_0 sync=false async=false

Thank you @Fiona.Chen

I will stress test again and let you know the results here.

Hello @Fiona.Chen

I am confident that the issue originates from the nvtracker plugin. I disabled nvtracker and ran the code for four days without any crashes. However, once I re-enabled nvtracker in the pipeline, the device crashed again within a few hours of stress testing. The nvtracker configuration is the same in both Issue A and Issue B.

We can temporarily disable the tracker in the code, but we would like to address the issue at the plugin level if possible.

Please let me know if you need any additional logs or further information from my side.

Thanks!