pyds.get_nvds_buf_surface segmentation fault

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) AGX
• DeepStream Version 5.0
• JetPack Version (valid for Jetson only) 4.4
• TensorRT Version 7.0
• NVIDIA GPU Driver Version (valid for GPU only)

I’m building a short buffer of frames using the Python pyds.get_nvds_buf_surface function, so that the frames can be emitted when a specific sequence of events is seen.

When using the function on the first few frames of a stream, I received a segmentation fault.

This problem does not occur with the same code on the Intel platform.

A workaround: after skipping the copy of the first 4 frames of the stream, the process runs stably.

David

@dnewton

Hello, is it okay to provide us your python script, your configuration files and your logs?
You can use export GST_DEBUG=<num_of_log_level> to get more logs.
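For anyone reproducing this, here is a minimal sketch of launching the test script from Python with GStreamer logging switched on. GST_DEBUG, GST_DEBUG_FILE and GST_DEBUG_NO_COLOR are standard GStreamer environment variables; the level 3, the file name gst_log.txt and both helper names are only illustrative assumptions.

```python
import os
import subprocess
import sys


def gst_debug_env(level=3, log_file="gst_log.txt"):
    """Return a copy of the current environment with GStreamer logging enabled."""
    env = dict(os.environ)
    env["GST_DEBUG"] = str(level)      # 0 = none, higher numbers = more verbose
    env["GST_DEBUG_FILE"] = log_file   # write the log to a file instead of stderr
    env["GST_DEBUG_NO_COLOR"] = "1"    # keep ANSI colour codes out of the file
    return env


def run_with_gst_debug(script_args, level=3):
    """Run a Python script (hypothetical helper) with GStreamer logging enabled."""
    return subprocess.run([sys.executable, *script_args], env=gst_debug_env(level))
```

Something like run_with_gst_debug(["deepstream_imagedata-test.py", "test.mjpeg", "frames"], level=4) would then leave the log in gst_log.txt, ready to attach.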

Thank you very much.

Please find the logs attached: log.txt (2.3 KB), gts_log.txt (75.9 KB)

Where is the segmentation fault log? Maybe you are accessing invalid data in the first 4 frames?

I have run in gdb and attached the log.
If I skip reading the first 4 frames of the stream I have no problem copying the images.
gdb.log (5.5 KB)

Hi @dnewton,
Is it possible to provide the repo?

Thanks!

Hi @dnewton,
ping… thanks!

Sorry - was on holiday.
Please find an example attached, based on deepstream-imagedata-multistream.
It runs on Intel but fails on AGX.
To create the MJPEG input file, please run the convert script.
To test, run:
"python3 deepstream_imagedata-test.py test.mjpeg frames"
test.zip (5.0 KB)

The same problem happened to me.
The skip workaround worked as well.

• Hardware Platform (Jetson / GPU) Jetson Nano
• DeepStream Version 5.01
• JetPack Version (valid for Jetson only) 4.4

For some reason, it began to fail no matter how many frames are skipped.

I confirmed the segfault is caused at exactly the same location as the original post.

Thread 7 "python3" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f71fff1f0 (LWP 29041)]
0x0000007f8991d6d8 in ?? () from /usr/lib/aarch64-linux-gnu/libpixman-1.so.0
(gdb) bt
#0  0x0000007f8991d6d8 in  () at /usr/lib/aarch64-linux-gnu/libpixman-1.so.0
#1  0x0000007f71ffd640 in  ()

Yes, I can reproduce this issue with the sample @dnewton provided, but I haven’t had a chance to debug it yet.
I will find time to look into it.
Any information that can help with debugging is appreciated!

Thanks!

I suspect the downstream nvdsosd is the direct cause of segfault.
When it is replaced with fakesink, I haven’t observed the segfault.
Also, it is only triggered when nvinfer detects something, even though I don’t directly manipulate object metadata it generates.

@mchi I encountered the same problem when I tried to use pyds.get_nvds_buf_surface inside a chainfunc of a custom element (just a minor modification of deepstream-imagedata-multistream.py). I tried to replicate the pipeline used in deepstream-imagedata-multistream.py in my own example, with no success. I couldn’t find the cause after several hours of debugging.

@niboshi000 I didn’t use nvosd in my pipeline and used fakesink after the element in which I used pyds.get_nvds_buf_surface, but the error persisted.

More information:

On start, skipping the copy of only the first 4 frames is unreliable; skipping 16 frames works better.

Unfortunately we are also suffering crashes after the device has been running hours/days.

In production we are not running in the debugger, so I cannot say whether it’s the same place/module, but since the SEGV faults only started after we began buffering the frames, I doubt it’s a coincidence.
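For crashes outside the debugger, one option is Python’s standard faulthandler module, which writes the Python-level stack of each thread when the process receives SIGSEGV. A minimal sketch follows; the helper name and the log file name are hypothetical, and the dump only shows Python frames, not the native frames inside libpixman.

```python
import faulthandler


def enable_crash_traceback(path="segv_traceback.log"):
    """Dump the Python stack of every thread to `path` on SIGSEGV and similar."""
    log_file = open(path, "a")  # must stay open for the life of the process
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file
```

Calling this once at startup, before the pipeline is set to PLAYING, would at least show which probe or callback was active when the fault hit.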

Since reliability is critical for us, can you raise the priority here?

Thanks

David

Hi @niboshi000, @Aref, @dnewton,
Based on the repo @dnewton provided on Sep 26, if I replace the tiler_sink_pad_buffer_probe() with the tiler_sink_pad_buffer_probe() from deepstream-imagedata-multistream/deepstream_imagedata-multistream.py, the issue is gone. The attached deepstream_imagedata-test.zip (4.6 KB) is the modified file.
Could you tell me what the change is for, i.e. why do you need to change tiler_sink_pad_buffer_probe()?

Thanks!

In the modified tiler_sink_pad_buffer_probe I’m building a short buffer of frames using the Python pyds.get_nvds_buf_surface function, so that they can all be emitted when a specific sequence of events is seen.
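To make the idea concrete, here is a rough sketch of such a frame buffer, not the code from the attached zip. FrameRingBuffer and buffer_frames_from_probe are hypothetical names, and the skip_first parameter encodes the skip-the-first-frames workaround from this thread. The pyds calls follow the pattern used in deepstream-imagedata-multistream; the array returned by get_nvds_buf_surface is assumed to share memory with the GStreamer buffer, which is why each frame is copied before it is stored.

```python
import collections


class FrameRingBuffer:
    """Keep copies of the most recent frames, skipping the first few."""

    def __init__(self, maxlen=30, skip_first=16, copy_frame=None):
        self._frames = collections.deque(maxlen=maxlen)  # oldest entries drop off
        self._skip_first = skip_first
        # The default suits bytes-like test data; for NumPy frames pass e.g.
        # lambda a: numpy.array(a, copy=True, order="C").
        self._copy_frame = copy_frame or bytes

    def push(self, frame_number, frame):
        """Store a copy of `frame`; return False if it was skipped."""
        if frame_number < self._skip_first:
            return False
        self._frames.append((frame_number, self._copy_frame(frame)))
        return True

    def flush(self):
        """Return all buffered (frame_number, frame) pairs and empty the buffer."""
        frames = list(self._frames)
        self._frames.clear()
        return frames


def buffer_frames_from_probe(gst_buffer, ring):
    """Sketch of the probe body; needs DeepStream's pyds bindings installed."""
    import pyds

    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        surface = pyds.get_nvds_buf_surface(hash(gst_buffer), frame_meta.batch_id)
        ring.push(frame_meta.frame_num, surface)  # copy happens inside push()
        try:
            l_frame = l_frame.next
        except StopIteration:
            break
```

When the event sequence is detected, ring.flush() hands back the buffered frames in order. This sketch does not explain or fix the segfault; it only shows the buffering pattern being discussed.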

test_modified.zip (29.8 KB)
Please try the attached modified code; it modifies the pipeline before nvstreammux as shown below.
You can run “git diff” in the unzipped folder to check the modification.
Let me know if it works. Thanks!

Original:
filesrc ! multipartdemux ! jpegparse ! jpegdec ! nvvideoconvert ! "video/x-raw(memory:NVMM), format=RGBA" ! nvstreammux ! …
After modification:
filesrc ! multipartdemux ! jpegparse ! nvv4l2decoder mjpeg=1 ! nvstreammux ! …
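For switching between the two variants while testing, the part in front of nvstreammux can be generated from a flag. This is only a sketch: source_segment is a hypothetical helper, the location property on filesrc is added for illustration, jpegdec is assumed to be the software decoder meant in the original pipeline, and everything from nvstreammux onward is left out because it is elided above.

```python
def source_segment(location, hw_decode):
    """Return the elements in front of nvstreammux as a gst-launch style string."""
    elements = [f"filesrc location={location}", "multipartdemux", "jpegparse"]
    if hw_decode:
        # Modified pipeline: hardware MJPEG decode with nvv4l2decoder.
        elements.append("nvv4l2decoder mjpeg=1")
    else:
        # Original pipeline: software decode, then convert to NVMM RGBA.
        elements += [
            "jpegdec",
            "nvvideoconvert",
            '"video/x-raw(memory:NVMM), format=RGBA"',
        ]
    return " ! ".join(elements)
```

Printing source_segment("test.mjpeg", True) and source_segment("test.mjpeg", False) side by side makes it easy to confirm which front end a given run actually used.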

Sorry, it still fails with the same SEGV fault:

Thread 11 "python3" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f74f271f0 (LWP 8663)]
0x0000007fb08766d8 in ?? () from /usr/lib/aarch64-linux-gnu/libpixman-1.so.0
(gdb)

I was thinking it should be possible to do what I’m doing in an appsink (instead of a probe), but I don’t know how to get to the metadata (gst_buffer_get_nvds_batch_meta) from an appsink in Python.
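A possible shape for the appsink route, untested on the setup in this thread: the sketch assumes the hash(gst_buffer) idiom used in the sample probes also works on the buffer pulled from an appsink sample, and that the NvDs batch metadata is still attached when the buffer reaches the sink. Both function names are hypothetical; the pyds module is passed in as a parameter only so the metadata walk can be exercised without DeepStream installed.

```python
def frame_numbers_in_buffer(gst_buffer, pyds):
    """List frame numbers from the NvDs batch metadata attached to gst_buffer."""
    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    numbers = []
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        numbers.append(frame_meta.frame_num)
        try:
            l_frame = l_frame.next
        except StopIteration:
            break
    return numbers


def on_new_sample(appsink):
    """Handler for appsink's "new-sample" signal (emit-signals must be enabled)."""
    import gi
    gi.require_version("Gst", "1.0")
    from gi.repository import Gst
    import pyds

    sample = appsink.emit("pull-sample")
    if sample is None:
        return Gst.FlowReturn.ERROR
    print(frame_numbers_in_buffer(sample.get_buffer(), pyds))
    return Gst.FlowReturn.OK
```

The handler would be attached with appsink.set_property("emit-signals", True) and appsink.connect("new-sample", on_new_sample). Whether this avoids the segfault is an open question, since the frame copy would still go through pyds.get_nvds_buf_surface.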

David

But I can’t see the segmentation fault with my modified code.
Did you just run my code or make any change on it?

Your example code worked.
Initially I tried simply converting my app code (using nvv4l2decoder) and it failed, so I wondered why.
I added the nvvideoconvert step back into your modified example and then it also failed with the SEGV.
The nvvideoconvert is necessary to crop the video.
When I just skip buffering the first 16 frames, the program runs through. Unfortunately, this dirty fix does not appear to be stable.