Custom Gst-nvinferserver post-processing receives a wild pointer, resulting in signal 11

Can you send me the full logs of the test3 run? Which plug-in reports this error?

Here is the whole log:
./deepstream-test3-app 10 rtsp://xx
ERROR: Could not open lib: /opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/deepstream-test3/models/Runmodels/mutilModelsA/models, error string: /opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/deepstream-test3/models/Runmodels/mutilModelsA/models: cannot open shared object file: No such file or directory
0:00:00.281571616 251492 0xaaab0d440f30 ERROR nvinferserver gstnvinferserver.cpp:405:gst_nvinfer_server_logger: nvinferserver[UID 1]: Error in initialize() <infer_base_context.cpp:68> [UID = 1]: Could not open custom lib: (null)
0:00:00.281616673 251492 0xaaab0d440f30 WARN nvinferserver gstnvinferserver_impl.cpp:588:start: error: Failed to initialize InferTrtIsContext
0:00:00.281625985 251492 0xaaab0d440f30 WARN nvinferserver gstnvinferserver_impl.cpp:588:start: error: Config file path: models/Runmodels/mutilModelsA/config_inferserver_0.txt
0:00:00.281701282 251492 0xaaab0d440f30 WARN nvinferserver gstnvinferserver.cpp:507:gst_nvinfer_server_start: error: gstnvinferserver_impl start failed
startRunning…
ERROR from element primary-nvinference-engine: Failed to initialize InferTrtIsContext
Error details: /dvs/git/dirty/git-master_linux/deepstream/sdk/src/gst-plugins/gst-nvinferserver/gstnvinferserver_impl.cpp(588): start (): /GstPipeline:dstest3-pipeline/GstNvInferServer:primary-nvinference-engine:
Config file path: models/Runmodels/mutilModelsA/config_inferserver_0.txt
Returned, stopping playback
Deleting pipeline

I just put the code I sent earlier into test3; it compiled and ran successfully.
This is my directory structure:


In config_inferserver_0.txt, I configured absolute paths:

  custom_lib {
    path: "/opt/nvidia/deepstream/deepstream-6.1/sources/apps/sample_apps/deepstream-test3/models/plugins/libnvdsinfer_custom_impl_Yolo.so"
  }

    labelfile_path: "/opt/nvidia/deepstream/deepstream-6.1/sources/apps/sample_apps/deepstream-test3/models/Runmodels/mutilModelsA/labels.txt"
   
g_object_set (G_OBJECT (pgie),"config-file-path", "/opt/nvidia/deepstream/deepstream-6.1/sources/apps/sample_apps/deepstream-test3/models/Runmodels/mutilModelsA/config_inferserver_0.txt", NULL);
   

This is my log:

root@localhost:/opt/nvidia/deepstream/deepstream-6.1/bin# ./deepstream-test3-app 1 rtsp://

INFO: infer_grpc_backend.cpp:164 TritonGrpcBackend id:1 initialized for model: mutilModelsA
Decodebin child added: source
Running...
Decodebin child added: decodebin0
Decodebin child added: rtph264depay0
Decodebin child added: h264parse0
Decodebin child added: capsfilter0
Decodebin child added: nvv4l2decoder0
Frame Number = 0 
Frame Number = 1 
Frame Number = 2 
Frame Number = 3 
Frame Number = 4 
Frame Number = 5

It can run after modifying config_inferserver_0.txt. However, I can't run 50 RTSP sources because of an RTSP error. Here is the test report:
Using a physical RTSP camera, 10 streams are OK; with more sources there is an RTSP error, "Internal Server Error (500)".
Using a virtual RTSP camera, 20 RTSP sources are OK; with more sources the same RTSP error occurs.
Here are some questions:

  1. What is your Jetson model? What is your RTSP source info: physical or virtual, resolution, fps?
  2. Does it run OK with 10 or 20 RTSP sources?
  3. Please check whether the pointer address is valid; you can print the pointer address when accessing the pointer content.
  4. Does your model support dynamic batching? If yes, batch inference can improve performance.

  1. Jetson model: Jetson AGX Orin. RTSP source info: virtual, 1920×1080, 25 fps.
  2. The problem still occurs when I run 20 RTSP streams; I haven't tested 10 yet.
  3. I can print the pointer's address every time, but printing its content crashes; that is why I say it is a wild pointer rather than a null pointer.
  4. My model does not support dynamic batching, because I find the program unstable when max-batch is not 1.

Using an Orin and a 1080p 25 fps RTSP source, I can't reproduce this error. Here is the test result:
30sources.txt (523.1 KB)

  1. Can you print the pointer address like this: printf("p:%p\n", p + i)? That way we can check whether the address is valid.
  2. From "30sources.txt", I can run more than 800 frames; how many frames can the application run on your machine?
  3. Can you use gdb to debug it? What is the crashing stack?
  4. If you empty decodeYoloV5_3_Tensor, does it still crash? I want to know whether it is related to other modules.

This is the error log from running 25 channels; I printed out the stack when the error occurred. The problem happened at around frame 4180. It is only sporadic, and you need to run for a long time to reproduce it.
25 rtsp.txt (4.8 KB)

  • This is the code after I added the stack print

deepstream_test3_app.c (14.8 KB)

  1. After testing with ./deepstream-test3-app 25 rtsp://127.0.0.1:8554/test and
    ./deepstream-test3-app 30 rtsp://127.0.0.1:8554/test, I can't reproduce this issue. Here are the logs:
    25.txt (4.1 MB)
    30.txt (4.9 MB)
  2. From the stack, the other thread also received signal 11. If you empty decodeYoloV5_3_Tensor, does it still crash? I want to know whether it is related to other modules.

After the application crashes, can you provide the system log by running "sudo /usr/bin/nvidia-bug-report-tegra.sh -o system.log"?

  • This is the log file obtained by executing the command after the crash:
    system.log (67.9 MB)

  • The stack information is as follows:

Frame Number = 4290 
Frame Number = 4291 
Frame Number = 4288 
Frame Number = 4291 
Frame Number = 4207 
Frame Number = 4293 
Frame Number = 4293 
recv a signal 11 and exit
deepstream-test3-app(+0x2260) [0xaaaac9642260]
linux-vdso.so.1(__kernel_rt_sigreturn+0) [0xffffb6fc07c0]
/opt/nvidia/deepstream/deepstream-6.1/sources/apps/sample_apps/deepstream-test3/models/plugins/libnvdsinfer_custom_impl_Yolo.so(+0x13c18) [0xffffa37d8c18]
/opt/nvidia/deepstream/deepstream-6.1/sources/apps/sample_apps/deepstream-test3/models/plugins/libnvdsinfer_custom_impl_Yolo.so(NvDsInferParseCustomYoloV5_3_Out+0x24c) [0xffffa37da26c]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0x10a4d8) [0xffffa9b2b4d8]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0x10b754) [0xffffa9b2c754]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0x10ad8c) [0xffffa9b2bd8c]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0xe5ce8) [0xffffa9b06ce8]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0xe6490) [0xffffa9b07490]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0xe9ca8) [0xffffa9b0aca8]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0xff2bc) [0xffffa9b202bc]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0x100a00) [0xffffa9b21a00]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0x100d20) [0xffffa9b21d20]
/lib/aarch64-linux-gnu/libstdc++.so.6(+0xccfac) [0xffffb4dccfac]
/lib/aarch64-linux-gnu/libpthread.so.0(+0x7624) [0xffffb699a624]
/lib/aarch64-linux-gnu/libc.so.6(+0xd149c) [0xffffb6b8149c]

AIService exit

If I empty decodeYoloV5_3_Tensor and do not execute float* pBuf = (float*)(layer.buffer); std::cout << "*pBuf:" << *pBuf << std::endl;, the program runs for a long time without any errors.

  1. From the system log, there are some "nvidia-desktop CRON[1656741]: pam_unix(cron:session): session closed for user root" entries on 12/19. Did you observe the session being restarted?
  2. Can you print offset + SCORE and check whether it is a valid value before the crash? Does it crash at a fixed value?
  1. I did not observe a session restart.
  2. Because the SCORE enumeration value is fixed at 4, I just print the value of offset. I printed offset from the beginning, and every time the problem occurred it was fixed at 0. I tried skipping it when offset = 0, and the problem then occurred at the next value.

the log:

offset=80416
offset=80444
offset=80472
offset=80500
offset=80528
offset=80556
offset=80584
offset=80612
offset=0
recv a signal 11 and exit
./deepstream-test3-app(+0x2260) [0xaaaab3922260]
linux-vdso.so.1(__kernel_rt_sigreturn+0) [0xffff8341a7c0]
/opt/nvidia/deepstream/deepstream-6.1/sources/apps/sample_apps/deepstream-test3/models/plugins/libnvdsinfer_custom_impl_Yolo.so(+0x13c48) [0xffff73c55c48]
/opt/nvidia/deepstream/deepstream-6.1/sources/apps/sample_apps/deepstream-test3/models/plugins/libnvdsinfer_custom_impl_Yolo.so(NvDsInferParseCustomYoloV5_3_Out+0x24c) [0xffff73c5729c]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0x10a4d8) [0xffff75f854d8]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0x10b754) [0xffff75f86754]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0x10ad8c) [0xffff75f85d8c]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0xe5ce8) [0xffff75f60ce8]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0xe6490) [0xffff75f61490]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0xe9ca8) [0xffff75f64ca8]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0xff2bc) [0xffff75f7a2bc]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0x100a00) [0xffff75f7ba00]
/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer_server.so(+0x100d20) [0xffff75f7bd20]
/lib/aarch64-linux-gnu/libstdc++.so.6(+0xccfac) [0xffff81226fac]
/lib/aarch64-linux-gnu/libpthread.so.0(+0x7624) [0xffff82df4624]
/lib/aarch64-linux-gnu/libc.so.6(+0xd149c) [0xffff82fdb49c]
  1. From your test, it only crashes at offset=0. Can you print the pointer address like this: printf("p:%p\n", p + i) when offset=0? That way we can check whether the address is valid before the crash.
  2. Can you use gdb to check the pointer value and the pointed-to content after the crash?

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one.
Thanks

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.