Smart-record property killing the pipeline after a certain time

Please provide complete information as applicable to your setup.

• Hardware Platform -------------> GPU
• DeepStream Version -----------> 7.0
• TensorRT Version --------------> 8.5
• NVIDIA GPU Driver Version ----> 535.230.12

We are running 80 cameras (4 processes × 20 cameras each) on an L4 machine. After running for a certain time, a process gets killed with a segmentation fault.

From each camera we are getting 25 fps, and we extract metadata using pyds.

The moment we set the following smart-record properties:

uri_decode_bin = Gst.ElementFactory.make("nvurisrcbin", "nvurisrcbin")
base_path = f"/opt/nvidia/deepstream/deepstream-7.0/nvodin24/video/{index}_{uuid.uuid4().hex[:8]}"
os.makedirs(base_path, exist_ok=True)
uri_decode_bin.set_property("smart-record", 2)
uri_decode_bin.set_property("smart-rec-dir-path", base_path)
uri_decode_bin.set_property("smart-rec-cache", 20)
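For reference, the setup above can be wrapped in small helpers (our own sketch: the property names and values come from the snippet, while the helper names and the `root` parameter are hypothetical):

```python
import os
import uuid

# Root used in the snippet above (an assumption about the deployment layout).
SR_VIDEO_ROOT = "/opt/nvidia/deepstream/deepstream-7.0/nvodin24/video"


def make_sr_dir(index, root=SR_VIDEO_ROOT):
    """Build and create a unique smart-record directory for one camera."""
    base_path = os.path.join(root, f"{index}_{uuid.uuid4().hex[:8]}")
    os.makedirs(base_path, exist_ok=True)
    return base_path


def apply_smart_record_props(srcbin, base_path, cache_secs=20):
    """Set the smart-record properties on an nvurisrcbin instance.

    The mode value 2 and the property names are those used in the
    snippet above; srcbin is expected to be a Gst element.
    """
    srcbin.set_property("smart-record", 2)
    srcbin.set_property("smart-rec-dir-path", base_path)
    srcbin.set_property("smart-rec-cache", cache_secs)
```

Keeping the directory creation inside one helper also guarantees `smart-rec-dir-path` always points at an existing directory.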

we start getting this critical error:
(python3:35201): GStreamer-CRITICAL **: 07:41:30.684: gst_buffer_get_size: assertion ‘GST_IS_BUFFER (buffer)’ failed

and after some time the process crashes with a core dump:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
--Type <RET> for more, q to quit, c to continue without paging--c
Core was generated by `python3 test_process.py'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  __pthread_kill_implementation (no_tid=0, signo=11, threadid=127620748412480) at ./nptl/pthread_kill.c:44
44  ./nptl/pthread_kill.c: No such file or directory.
[Current thread is 1 (Thread 0x741205600640 (LWP 1426874))]
(gdb) bt full 
#0  __pthread_kill_implementation (no_tid=0, signo=11, threadid=127620748412480) at ./nptl/pthread_kill.c:44
        tid = <optimized out>
        ret = 0
        pd = 0x741205600640
        old_mask = {__val = {243, 1613130407189148448, 127637175821872, 289, 98534734174688, 127620748412352, 2, 11, 98534734174688, 98534731873558, 98534731752894, 98534729810424, 0, 47244640256, 123, 0}}
        ret = <optimized out>
        pd = <optimized out>
        old_mask = <optimized out>
        ret = <optimized out>
        tid = <optimized out>
        ret = <optimized out>
        resultvar = <optimized out>
        resultvar = <optimized out>
        __arg3 = <optimized out>
        __arg2 = <optimized out>
        __arg1 = <optimized out>
        _a3 = <optimized out>
        _a2 = <optimized out>
        _a1 = <optimized out>
        __futex = <optimized out>
        resultvar = <optimized out>
        __arg3 = <optimized out>
        __arg2 = <optimized out>
        __arg1 = <optimized out>
        _a3 = <optimized out>
        _a2 = <optimized out>
        _a1 = <optimized out>
        __futex = <optimized out>
        __private = <optimized out>
        __oldval = <optimized out>
        result = <optimized out>
#1  __pthread_kill_internal (signo=11, threadid=127620748412480) at ./nptl/pthread_kill.c:78
No locals.
#2  __GI___pthread_kill (threadid=127620748412480, signo=signo@entry=11) at ./nptl/pthread_kill.c:89
No locals.
#3  0x00007415d896f476 in __GI_raise (sig=11) at ../sysdeps/posix/raise.c:26
        ret = <optimized out>
#4  <signal handler called>
No locals.
#5  0x00007415d7cc7f17 in gst_buffer_get_size () from /lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
No symbol table info available.
#6  0x00007415d7ccda68 in gst_buffer_copy_into () from /lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
No symbol table info available.
#7  0x0000741557a651d6 in ?? () from /usr/lib/x86_64-linux-gnu/gstreamer-1.0/deepstream/libgstnvvideo4linux2.so
No symbol table info available.
#8  0x00007415d7d371d7 in ?? () from /lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
No symbol table info available.
#9  0x00007415d7bbf384 in g_thread_pool_thread_proxy (data=<optimized out>) at ../glib/gthreadpool.c:350
        task = 0x7410dc00b620
        pool = <optimized out>
#10 0x00007415d7bbeac1 in g_thread_proxy (data=0x7414e80291a0) at ../glib/gthread.c:831
        thread = 0x7414e80291a0
        __func__ = "g_thread_proxy"
#11 0x00007415d89c1ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
        ret = <optimized out>
        pd = <optimized out>
        out = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {127634990168848, -1021380715546781404, 127620748412480, 0, 127637177243600, 127634990169200, 1871062135877294372, 1871494816171579684}, 
              mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#12 0x00007415d8a53850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
No locals.



Please help us fix this issue.
@Fiona.Chen @fanzh @junshengy

Are you testing in Docker? Which sample are you testing or referring to? What is the complete media pipeline? Did all four processes crash at the same time, or only one? How long did it run before getting killed? If smart-record is disabled, does the app still crash? Thanks!

  1. We are using Docker.
  2. It's our own code, similar to deepstream-test3.
  3. nvurisrcbin --> streammux --> queue --> infer --> tracker --> analytics --> queue --> appsink
  4. No; one process gets killed after a certain time, and then another one gets killed.
  5. It takes about 45 minutes for one process to go down.
  6. Without smart-record, all processes keep running.

You can see in the two graphs above that when the process got killed there is a drop in memory and, at the same instant, a spike in I/O wait.

@snehashish.debnath @s.Jagannath

From the crash stack, it is related to the decoder. What are the resolution, fps, and video codec of the RTSP sources? Could you share some logs of "nvidia-smi dmon" while running the app? We are wondering about the decoder utilization.

Frame size is (1920, 1080) --> 25 FPS.
The video codec is H.265.
Attaching our nvidia-smi dmon stats:
nvidia_smart_record_dmon.zip (10.2 KB)

Please check the logs for GPU 0.

Observations (using a person model):
For low-fps cameras this issue doesn't seem to happen (not sure).
When the FPS is 25 and there are many objects, at least 20–30 objects per frame per camera, it is more evident.

What do you think could be the issue?
L4_output_gdb.zip (74.4 MB)

We are also giving you the gdb output; inside it you can see the element-wise latency.

As you know, smart-record is used to record the encoded stream; having many objects is not related to smart-record.
What do you do in the appsink? To narrow down this issue, please use fakesink instead.
Many objects will affect the GPU utilization of inference and the tracker. As of now, there is no GPU utilization log from when the app crashes. Could you run "nvidia-smi dmon -o T > 1.log" while running the app? Then please share 1.log and the app log from when the app crashes.
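If it helps, here is a small parser we sketched for the `nvidia-smi dmon -o T` output (the column order, i.e. time, gpu, pwr, gtemp, mtemp, sm, mem, enc, dec, jpg, ofa, mclk, pclk, is assumed from the excerpts posted later in this thread; `parse_dmon_line` and `busy_samples` are hypothetical names):

```python
def parse_dmon_line(line):
    """Parse one `nvidia-smi dmon -o T` sample line into a dict.

    Assumes the column order: time gpu pwr gtemp mtemp sm mem enc dec
    jpg ofa mclk pclk, where '-' means not available. Returns None for
    header or malformed lines.
    """
    fields = line.split()
    if len(fields) < 9 or ":" not in fields[0]:
        return None  # header line ("# ...") or malformed line

    def num(tok):
        return None if tok == "-" else int(tok)

    return {
        "time": fields[0],
        "gpu": num(fields[1]),
        "sm": num(fields[5]),
        "mem": num(fields[6]),
        "dec": num(fields[8]),
    }


def busy_samples(lines, threshold=90):
    """Yield samples whose SM or decoder utilization crosses the threshold."""
    for line in lines:
        sample = parse_dmon_line(line)
        if sample and ((sample["sm"] or 0) >= threshold or
                       (sample["dec"] or 0) >= threshold):
            yield sample
```

Running this over 1.log should make the near-saturation samples easy to correlate with the crash time.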

@fanzh
We will share 1.log with you. Could you clarify what information the app log should contain?

The application's running log from the terminal; we are wondering if there are some error hints in it.

Hi @fanzh

Currently we are running with fakesink instead of appsink, but the same problem occurs: the process gets killed after a certain time.
I am attaching a zip file containing 1.log, the docker stats graph, the Python code for the pipeline, and the related files needed to run it. Could you check the code and reproduce the issue on your end?

It would be really great if you could point out where we are making a mistake with respect to the pipeline!

nvidia_forums_code_24Mar.zip (5.0 MB)

Thanks for sharing! There is no application running log, so we don't know when the application was killed. From the nvidia-smi log, the GPU utilization is sometimes close to 100%. Please refer to this link for performance improvement.

01:01:57      1     66     76      -     99     94      0     64      0      0   6250   1425
01:05:14      0     64     69      -    100     75      0     50      0      0   6250   1545
01:29:05      0     63     69      -     97     93      0     69      0      0   6250   1365 
02:28:44      0     63     69      -     97     94      0     63      0      0   6250   1320 
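Once the application log is captured, a small scanner like this (our own sketch; the log format is assumed to match the GStreamer-CRITICAL line in the issue description) can pull out the timestamps of the critical assertions so they can be lined up against the dmon samples above:

```python
import re

# Matches lines such as:
# (python3:35201): GStreamer-CRITICAL **: 07:41:30.684: gst_buffer_get_size: ...
CRITICAL_RE = re.compile(
    r"GStreamer-CRITICAL \*\*: (?P<time>\d{2}:\d{2}:\d{2}\.\d+): (?P<msg>.*)")


def find_criticals(lines):
    """Return (timestamp, message) pairs for every CRITICAL in the log."""
    hits = []
    for line in lines:
        m = CRITICAL_RE.search(line)
        if m:
            hits.append((m.group("time"), m.group("msg")))
    return hits
```

The first CRITICAL timestamp usually marks when the buffer corruption starts, well before the SIGSEGV.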

As shown in the docker stats graph we provided, if you look closely, after about a day there is a distinct drop in the container's memory. That drop corresponds to the pipeline being killed, at which point we get the core dump. I am attaching a picture of the core dump timestamp below.

Again, please try to reproduce the issue on your end with the provided code. We will attach the app log soon.

Even though you say the GPU utilization is close to 100%, let's not forget that this issue only appears when smart-record is enabled, and in 1.log a value of 100 appeared only once, so we rule that out!

ERROR:

Program terminated with signal SIGSEGV, Segmentation fault.
#0  __pthread_kill_implementation (no_tid=0, signo=11, threadid=140696631641664) at ./nptl/pthread_kill.c:44
44  ./nptl/pthread_kill.c: No such file or directory.
[Current thread is 1 (Thread 0x7ff67cc00640 (LWP 1644217))]
(gdb) bt full 
#0  __pthread_kill_implementation (no_tid=0, signo=11, threadid=140696631641664) at ./nptl/pthread_kill.c:44
        tid = <optimized out>
        ret = 0
        pd = 0x7ff67cc00640
        old_mask = {__val = {2175, 9266090568899942233, 140737349464432, 349, 93824997943776, 140696631641536, 2, 11, 93824997943776, 93824995642646, 93824995521982, 93824993579512, 0, 47244640281, 
            18446744073709551615, 0}}
        ret = <optimized out>
        pd = <optimized out>
        old_mask = <optimized out>
        ret = <optimized out>
        tid = <optimized out>
        ret = <optimized out>
        resultvar = <optimized out>
        resultvar = <optimized out>
        __arg3 = <optimized out>
        __arg2 = <optimized out>
        __arg1 = <optimized out>
        _a3 = <optimized out>
        _a2 = <optimized out>
        _a1 = <optimized out>
        __futex = <optimized out>
        resultvar = <optimized out>
        __arg3 = <optimized out>
        __arg2 = <optimized out>
        __arg1 = <optimized out>
        _a3 = <optimized out>
        _a2 = <optimized out>
        _a1 = <optimized out>
        __futex = <optimized out>
        __private = <optimized out>
        __oldval = <optimized out>
        result = <optimized out>
#1  __pthread_kill_internal (signo=11, threadid=140696631641664) at ./nptl/pthread_kill.c:78
No locals.
#2  __GI___pthread_kill (threadid=140696631641664, signo=signo@entry=11) at ./nptl/pthread_kill.c:89
No locals.
#3  0x00007ffff7c92476 in __GI_raise (sig=11) at ../sysdeps/posix/raise.c:26
        ret = <optimized out>
#4  <signal handler called>
No locals.
#5  0x00007fff8baffebc in gst_buffer_copy_into () from /lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
No symbol table info available.
#6  0x00007fff90c461d6 in ?? () from /usr/lib/x86_64-linux-gnu/gstreamer-1.0/deepstream/libgstnvvideo4linux2.so
No symbol table info available.
#7  0x00007fff8bb691d7 in ?? () from /lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
No symbol table info available.
#8  0x00007fff90142384 in g_thread_pool_thread_proxy (data=<optimized out>) at ../glib/gthreadpool.c:350
        task = 0x7ff3884a3050
        pool = <optimized out>
#9  0x00007fff90141ac1 in g_thread_proxy (data=0x7ff9f0041650) at ../glib/gthread.c:831
        thread = 0x7ff9f0041650
        __func__ = "g_thread_proxy"
#10 0x00007ffff7ce4ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
        ret = <optimized out>
        pd = <optimized out>
        out = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140711601109776, -4543113795256716698, 140696631641664, 0, 140737350879184, 140711601110128, 4548467867175875174, 4543130980051448422}, 
              mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#11 0x00007ffff7d76850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

Now I am also attaching the app log:
219_gdb_ouput_24Mar.zip (84.3 MB)

Hi @Fiona.Chen

I hope you are doing well. We are facing the problem described above. Since you have an L4 machine and we are also using an L4, could you please check the code, run it on your machine, and reproduce the issue? It's very important for us to know.
I am sharing all the necessary files required to run the code:
nvidia_24march_all_logs.zip (3.5 MB)

Thank you.

Let's narrow down this issue first.

  1. From the crash stack, the app crashed in the decoder library libgstnvvideo4linux2.so. From nvidia-smi, the GPU utilization is sometimes close to 100%, which is abnormal. If you disable infer, tracker, and analytics, does the issue persist?
  2. If you suspect smart-record is the root cause, then to narrow it down, could you check whether the pipeline "nvurisrcbin --> fakesink" with smart-record crashes?

Following are the results of the experiment you suggested (nvurisrcbin --> streammux --> fakesink):

  1. After disabling infer, tracker, and analytics, the GPU utilization was never close to 100%.
  2. As you suggested, to confirm whether smart-record is the root cause, we ran the pipeline above, and it crashed.

Below are the app log, 1.log, and the docker stats visualisation graph.


L4_output_gdb_25Mar.zip (9.7 MB)

Do you mean some processes still crashed when using "nvurisrcbin --> streammux --> fakesink"? If so, could you share the test code? I will try to reproduce it. Is the crash stack the same as the stack in the issue description?

  1. Yes, some processes still crashed when using "nvurisrcbin --> streammux --> fakesink".
  2. Please find the code attached below.
  3. Yes, the crash stacks are the same as in the issue description. Please look at the core dump to see if it crashes for you too.

nvurisrcbin_to_fakesink.zip (5.3 KB)

  1. Are you testing in the DS 7.0 docker container? The TensorRT version for DS 7.0 should be 8.6.1. Please make sure the component versions meet the requirements of this table.

  2. In the shared code simple_pipeline.py, we noticed that the code does not call start-sr to start recording. Please make sure pyds v1.1.11 is installed for DS 7.0.

  3. To rule out smart record, could you remove the following code from simple_pipeline.py and check whether the app still crashes? The default smart-record mode is 0 (disabled).

    uri_decode_bin.set_property("smart-record", 2)
    os.makedirs(base_path, exist_ok=True)
    uri_decode_bin.set_property("smart-rec-dir-path", base_path)
    uri_decode_bin.set_property("smart-rec-cache", 20)
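As a side note on start-sr: with smart-record mode 2, nothing is actually recorded until the "start-sr" signal is emitted on nvurisrcbin. A minimal trigger sketch, modeled on the deepstream-testsr Python sample (the signal signature, i.e. session id, start offset into the cache, duration, user data, is our reading of that sample; please verify it against your DeepStream version):

```python
def start_recording(srcbin, session_id, start_offset_secs=0, duration_secs=10):
    """Emit the 'start-sr' signal on an nvurisrcbin to begin a recording.

    Modeled on the deepstream-testsr sample: a session id, a start
    offset in seconds back into the smart-rec-cache, a duration in
    seconds, and opaque user data. Verify the exact signature against
    your DeepStream version.
    """
    srcbin.emit("start-sr", session_id, start_offset_secs, duration_secs, None)


def stop_recording(srcbin, session_id):
    """Emit the 'stop-sr' signal to end the recording session."""
    srcbin.emit("stop-sr", session_id)
```

Typically start_recording is called from an event callback (e.g. when an object of interest is detected), with stop_recording or the duration ending the session.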

1 and 2. I think all the dependencies are as they should be; I am attaching a screenshot of the same.

  1. I think we are overlooking the fact that the crash happens the moment we set the smart-record properties in simple_pipeline.py. If we don't set the smart-record properties, i.e. if that code is not there (nvurisrcbin --> streammux --> queue --> infer --> tracker --> analytics --> queue --> fakesink/appsink), it does not fail.

Anyway, as you suggested, we will run the test to rule out smart record: remove that code from simple_pipeline.py and check whether the app still crashes (the default smart-record mode being 0, i.e. disabled).

  1. Are you also facing the same issue when you reproduce it?

This is a duplicate of this topic.
After you test "nvurisrcbin --> streammux --> fakesink" with smart record, please provide the kernel log and the docker service log so we can determine why the program was killed. Thanks!

sudo dmesg  > kernel.log
sudo journalctl -u docker.service > docker.log
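To quickly check whether the kernel's OOM killer terminated the processes (as opposed to the SIGSEGV alone), kernel.log can be scanned for the usual messages (a sketch; the exact message wording varies between kernel versions):

```python
def find_oom_kills(lines):
    """Return kernel-log lines that look like OOM-killer activity.

    Matches the usual messages emitted by recent Linux kernels
    ("Out of memory: Killed process ...", "oom-kill:", "invoked
    oom-killer"); exact wording differs between kernel versions.
    """
    markers = ("Out of memory", "oom-kill", "invoked oom-killer")
    return [line for line in lines if any(m in line for m in markers)]


if __name__ == "__main__":
    with open("kernel.log") as f:
        for hit in find_oom_kills(f):
            print(hit.rstrip())
```

If OOM messages show up around the timestamps of the docker memory drop, the kill is memory pressure rather than the decoder crash alone.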