Custom Gst-nvinferserver post-processing receives a wild pointer, resulting in signal 11

Can you send me the full logs of the test3 run? Which plug-in reports this error?

Here is the whole log:
./deepstream-test3-app 10 rtsp://xx
ERROR: Could not open lib: /opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/deepstream-test3/models/Runmodels/mutilModelsA/models, error string: /opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/deepstream-test3/models/Runmodels/mutilModelsA/models: cannot open shared object file: No such file or directory
0:00:00.281571616 251492 0xaaab0d440f30 ERROR nvinferserver gstnvinferserver.cpp:405:gst_nvinfer_server_logger: nvinferserver[UID 1]: Error in initialize() <infer_base_context.cpp:68> [UID = 1]: Could not open custom lib: (null)
0:00:00.281616673 251492 0xaaab0d440f30 WARN nvinferserver gstnvinferserver_impl.cpp:588:start: error: Failed to initialize InferTrtIsContext
0:00:00.281625985 251492 0xaaab0d440f30 WARN nvinferserver gstnvinferserver_impl.cpp:588:start: error: Config file path: models/Runmodels/mutilModelsA/config_inferserver_0.txt
0:00:00.281701282 251492 0xaaab0d440f30 WARN nvinferserver gstnvinferserver.cpp:507:gst_nvinfer_server_start: error: gstnvinferserver_impl start failed
startRunning…
ERROR from element primary-nvinference-engine: Failed to initialize InferTrtIsContext
Error details: /dvs/git/dirty/git-master_linux/deepstream/sdk/src/gst-plugins/gst-nvinferserver/gstnvinferserver_impl.cpp(588): start (): /GstPipeline:dstest3-pipeline/GstNvInferServer:primary-nvinference-engine:
Config file path: models/Runmodels/mutilModelsA/config_inferserver_0.txt
Returned, stopping playback
Deleting pipeline

I just put the code I sent earlier into test3; it compiled and ran successfully.
This is my directory structure:


In config_inferserver_0.txt, I configured absolute paths:

  custom_lib {
    path: "/opt/nvidia/deepstream/deepstream-6.1/sources/apps/sample_apps/deepstream-test3/models/plugins/libnvdsinfer_custom_impl_Yolo.so"
  }

    labelfile_path: "/opt/nvidia/deepstream/deepstream-6.1/sources/apps/sample_apps/deepstream-test3/models/Runmodels/mutilModelsA/labels.txt"
   
g_object_set (G_OBJECT (pgie),"config-file-path", "/opt/nvidia/deepstream/deepstream-6.1/sources/apps/sample_apps/deepstream-test3/models/Runmodels/mutilModelsA/config_inferserver_0.txt", NULL);
   

This is my log:

root@localhost:/opt/nvidia/deepstream/deepstream-6.1/bin# ./deepstream-test3-app 1 rtsp://

INFO: infer_grpc_backend.cpp:164 TritonGrpcBackend id:1 initialized for model: mutilModelsA
Decodebin child added: source
Running...
Decodebin child added: decodebin0
Decodebin child added: rtph264depay0
Decodebin child added: h264parse0
Decodebin child added: capsfilter0
Decodebin child added: nvv4l2decoder0
Frame Number = 0 
Frame Number = 1 
Frame Number = 2 
Frame Number = 3 
Frame Number = 4 
Frame Number = 5

It can run after modifying config_inferserver_0.txt. However, I can't run 50 RTSP sources because of an RTSP error. Here is the test report:
Using a physical RTSP camera, 10 streams are OK; with more sources there is an RTSP error, "Internal Server Error (500)".
Using a virtual RTSP camera, 20 RTSP sources are OK; with more sources the same RTSP error occurs.
Here are some questions:

  1. What is your Jetson model? What is your RTSP source info: physical or virtual, resolution, fps?
  2. Does it run OK with 10 or 20 RTSP sources?
  3. Please check whether the pointer address is valid; you can print the pointer address when accessing the pointer content.
  4. Does your model support dynamic batching? If yes, batch inference can improve performance.

  1. Jetson model: Jetson AGX Orin. RTSP source info: virtual, 1920×1080, 25 fps.
  2. The problem still occurs when I run 20 RTSP streams; I haven't tested 10 yet.
  3. I can print the pointer's address every time, but printing its content crashes; that is why I say it is a wild pointer rather than a null pointer.
  4. My model does not support dynamic batching, because I find the program unstable when max-batch is not 1.

Using an Orin and a 1080p 25 fps RTSP source, I can't reproduce this error. Here is the test result:
30sources.txt (523.1 KB)

  1. Can you print the pointer address like this: printf("p:%p\n", p + i)? That way we can check whether the address is valid.
  2. From "30sources.txt", I can run more than 800 frames; how many frames can the application run on your machine?
  3. Can you use gdb to debug it? What is the crashing stack?
  4. If you empty decodeYoloV5_3_Tensor, does it still crash? I want to know whether it is related to other modules.

This is the error log from running 25 channels; I printed out the stack when the error occurred. The problem happened at around frame 4180. It is only sporadic, and you need to run for a long time to reproduce it.
25 rtsp.txt (4.8 KB)

  • This is the code after I added the stack print

deepstream_test3_app.c (14.8 KB)

  1. After testing with ./deepstream-test3-app 25 rtsp://127.0.0.1:8554/test and
    ./deepstream-test3-app 30 rtsp://127.0.0.1:8554/test, I can't reproduce this issue. Here are the logs:
    25.txt (4.1 MB)
    30.txt (4.9 MB)
  2. From the stack, the other thread also received signal 11. If you empty decodeYoloV5_3_Tensor, does it still crash? I want to know whether it is related to other modules.

After the application crashes, can you provide the system log by running "sudo /usr/bin/nvidia-bug-report-tegra.sh -o system.log"?

  • This is the log file obtained by executing the command after the crash:
    system.log (67.9 MB)

  • The stack information is as follows:

Frame Number = 4290 
Frame Number = 4291 
Frame Number = 4288 
Frame Number = 4291 
Frame Number = 4207 
Frame Number = 4293 
Frame Number = 4293 
recv a signal 11 and exit
deepstream-test3-app(+0x2260) [0xaaaac9642260]
linux-vdso.so.1(__kernel_rt_sigreturn+0) [0xffffb6fc07c0]
/opt/nvidia/deepstream/deepstream-6.1/sources/apps/sample_apps/deepstream-test3/models/plugins/libnvdsinfer_custom_impl_Yolo.so(+0x13c18) [0xffffa37d8c18]
/opt/nvidia/deepstream/deepstream-6.1/sources/apps/sample_apps/deepstream-test3/models/plugins/libnvdsinfer_custom_impl_Yolo.so(NvDsInferParseCustomYoloV5_3_Out+0x24c) [0xffffa37da26c]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0x10a4d8) [0xffffa9b2b4d8]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0x10b754) [0xffffa9b2c754]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0x10ad8c) [0xffffa9b2bd8c]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0xe5ce8) [0xffffa9b06ce8]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0xe6490) [0xffffa9b07490]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0xe9ca8) [0xffffa9b0aca8]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0xff2bc) [0xffffa9b202bc]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0x100a00) [0xffffa9b21a00]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0x100d20) [0xffffa9b21d20]
/lib/aarch64-linux-gnu/libstdc++.so.6(+0xccfac) [0xffffb4dccfac]
/lib/aarch64-linux-gnu/libpthread.so.0(+0x7624) [0xffffb699a624]
/lib/aarch64-linux-gnu/libc.so.6(+0xd149c) [0xffffb6b8149c]

AIService exit

If I empty decodeYoloV5_3_Tensor and do not execute float* pBuf = (float*)(layer.buffer); std::cout << "*pBuf:" << *pBuf << std::endl;, the program runs for a long time without any errors.

  1. From the system log, there are some "nvidia-desktop CRON[1656741]: pam_unix(cron:session): session closed for user root" entries on 12/19. Did you observe the session being restarted?
  2. Can you print offset + SCORE and check whether it is a valid value before the crash? Does it crash at a fixed value?
  1. I did not observe a session restart.
  2. Because the SCORE enumeration value is fixed at 4, I just print the value of offset. I printed offset from the beginning, and every time the problem occurred it was fixed at 0. I tried skipping it when offset = 0, and the problem then occurred at the next value.

the log:

offset=80416
offset=80444
offset=80472
offset=80500
offset=80528
offset=80556
offset=80584
offset=80612
offset=0
recv a signal 11 and exit
./deepstream-test3-app(+0x2260) [0xaaaab3922260]
linux-vdso.so.1(__kernel_rt_sigreturn+0) [0xffff8341a7c0]
/opt/nvidia/deepstream/deepstream-6.1/sources/apps/sample_apps/deepstream-test3/models/plugins/libnvdsinfer_custom_impl_Yolo.so(+0x13c48) [0xffff73c55c48]
/opt/nvidia/deepstream/deepstream-6.1/sources/apps/sample_apps/deepstream-test3/models/plugins/libnvdsinfer_custom_impl_Yolo.so(NvDsInferParseCustomYoloV5_3_Out+0x24c) [0xffff73c5729c]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0x10a4d8) [0xffff75f854d8]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0x10b754) [0xffff75f86754]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0x10ad8c) [0xffff75f85d8c]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0xe5ce8) [0xffff75f60ce8]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0xe6490) [0xffff75f61490]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0xe9ca8) [0xffff75f64ca8]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0xff2bc) [0xffff75f7a2bc]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0x100a00) [0xffff75f7ba00]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0x100d20) [0xffff75f7bd20]
/lib/aarch64-linux-gnu/libstdc++.so.6(+0xccfac) [0xffff81226fac]
/lib/aarch64-linux-gnu/libpthread.so.0(+0x7624) [0xffff82df4624]
/lib/aarch64-linux-gnu/libc.so.6(+0xd149c) [0xffff82fdb49c]
  1. From your test, it only crashes at offset=0. Can you print the pointer address like this: printf("p:%p\n", p + i) when offset=0? That way we can check whether the address is valid before the crash.
  2. Can you use gdb to check the pointer value and the pointed-to content after the crash?

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one.
Thanks

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.