YOLO inference library issue

We are trying to run a DeepStream pipeline twice back-to-back in the same process. The first run completes successfully, but when the same pipeline starts a second time it fails with SIGSEGV, a segfault from within libnvinfer.so.

The call stack suggests that nvinfer is not able to reload the YOLO inference library. This issue might be related to reloading a library dynamically using dlopen (ref: StackOverflow). Any ideas on resolving this issue?

Call stack:

->libnvinfer.so.7![Unknown/Just-In-Time compiled code] (Unknown Source:0)

->libnvds_yolo.so!nvinfer1::PluginRegistrar<YoloLayerV3PluginCreator>::PluginRegistrar(nvinfer1::PluginRegistrar<YoloLayerV3PluginCreator> * const this) (/usr/include/x86_64-linux-gnu/NvInferRuntimeCommon.h:1309)

->libnvds_yolo.so!__static_initialization_and_destruction_0(int __initialize_p, int __priority) (/Workspace/geralt/src/inference/yoloPlugins.cpp:127)

->libnvds_yolo.so!_GLOBAL__sub_I_yoloPlugins.cpp(void)() (/Workspace/geralt/src/inference/yoloPlugins.cpp:127)

->ld-linux-x86-64.so.2!call_init(char ** env, char ** argv, int argc, struct link_map * l) (/build/glibc-2ORdQG/glibc-2.27/elf/dl-init.c:72)

->ld-linux-x86-64.so.2!_dl_init(struct link_map * main_map, int argc, char ** argv, char ** env) (/build/glibc-2ORdQG/glibc-2.27/elf/dl-init.c:119)

->ld-linux-x86-64.so.2!dl_open_worker(void * a) (/build/glibc-2ORdQG/glibc-2.27/elf/dl-open.c:522)

->libc.so.6!__GI__dl_catch_exception(struct dl_exception * exception, void (*)(void *) operate, void * args) (/build/glibc-2ORdQG/glibc-2.27/elf/dl-error-skeleton.c:196)

->ld-linux-x86-64.so.2!_dl_open(const char * file, int mode, const void * caller_dlopen, Lmid_t nsid, int argc, char ** argv, char ** env) (/build/glibc-2ORdQG/glibc-2.27/elf/dl-open.c:605)

->libdl.so.2!dlopen_doit(void * a) (/build/glibc-2ORdQG/glibc-2.27/dlfcn/dlopen.c:66)

->libc.so.6!__GI__dl_catch_exception(struct dl_exception * exception, void (*)(void *) operate, void * args) (/build/glibc-2ORdQG/glibc-2.27/elf/dl-error-skeleton.c:196)

->libc.so.6!__GI__dl_catch_error(const char ** objname, const char ** errstring, _Bool * mallocedp, void (*)(void *) operate, void * args) (/build/glibc-2ORdQG/glibc-2.27/elf/dl-error-skeleton.c:215)

->libdl.so.2!_dlerror_run(void (*)(void *) operate, void * args) (/build/glibc-2ORdQG/glibc-2.27/dlfcn/dlerror.c:162)

->libdl.so.2!__dlopen(const char * file, int mode) (/build/glibc-2ORdQG/glibc-2.27/dlfcn/dlopen.c:87)

->libnvds_infer.so!nvdsinfer::DlLibHandle::DlLibHandle(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int) (Unknown Source:0)

->libnvds_infer.so!std::_MakeUniq<nvdsinfer::DlLibHandle>::__single_object std::make_unique<nvdsinfer::DlLibHandle, char (&) [4096], int>(char (&) [4096], int&&) (Unknown Source:0)

->libnvds_infer.so!nvdsinfer::NvDsInferContextImpl::initialize(_NvDsInferContextInitParams&, void*, void (*)(INvDsInferContext*, unsigned int, NvDsInferLogLevel, char const*, void*)) (Unknown Source:0)

->libnvds_infer.so!createNvDsInferContext(INvDsInferContext**, _NvDsInferContextInitParams&, void*, void (*)(INvDsInferContext*, unsigned int, NvDsInferLogLevel, char const*, void*)) (Unknown Source:0)

->libnvdsgst_infer.so!gst_nvinfer_start(_GstBaseTransform*) (Unknown Source:0)

->libgstbase-1.0.so.0![Unknown/Just-In-Time compiled code] (Unknown Source:0)

Environment:
• Hardware Platform: T4
• DeepStream Version: 5.0 GA
• TensorRT Version: 7.0
• NVIDIA GPU Driver Version (valid for GPU only): 440.33.01


Hi @geralt_of_rivia,
According to the failure log below, I guess it’s because you registered the same YoloV3 TRT plugin twice.
If you dlopen the YoloV3 plugin only once, does this error still happen?

->libnvinfer.so.7![Unknown/Just-In-Time compiled code] (Unknown Source:0)

->libnvds_yolo.so!nvinfer1::PluginRegistrar<YoloLayerV3PluginCreator>::PluginRegistrar(

Thanks!

We’re not opening the library ourselves; it’s happening from within nvinfer.

I mean the library libnvds_yolo.so. From the log, dlopen() was called to load it.
Can you load it only once?

How do I load it only once? I’m using the same code as the objectDetector_Yolo sample provided by DeepStream. The problem seems to be this line in the yoloPlugins.cpp file:

REGISTER_TENSORRT_PLUGIN(YoloLayerV3PluginCreator);

I read the documentation for it, but it doesn’t mention anywhere how to load it only once.

It turns out that REGISTER_TENSORRT_PLUGIN defines a static variable, which errors out if it is loaded twice (for two pipelines back to back, for instance). I solved this problem by moving that line out of the yoloPlugins.h file (which was loaded twice) and into my main.cpp file.
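For anyone hitting the same thing, here is a minimal sketch of the general pattern as I understand it, not TensorRT's actual implementation. A static registrar object runs its constructor every time the shared library is loaded, while the registry it writes to lives for the whole process. All names below (`Registry`, `Registrar`, `globalRegistry`, `runPipelineOnce`, and the plugin name `DemoYoloLayer`) are hypothetical stand-ins, and the "library load" is simulated by constructing the registrar once per run. The sketch makes registration idempotent so a second load is harmless. Moving the registration into main.cpp has the same effect, because the static object is then constructed only once per process.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a plugin creator; this is not a TensorRT type.
struct Creator {
    std::string name;
};

// Hypothetical process-wide registry that outlives any single pipeline run.
class Registry {
public:
    // Returns true if the creator was added, false if the name already exists.
    bool registerCreator(const std::string& name) {
        assert(!name.empty());
        auto result = creators_.emplace(name, nullptr);
        if (!result.second) {
            return false;  // already registered: ignore the repeat
        }
        result.first->second = std::unique_ptr<Creator>(new Creator{name});
        return true;
    }

    std::size_t size() const { return creators_.size(); }

private:
    std::map<std::string, std::unique_ptr<Creator>> creators_;
};

// Single registry instance shared by every pipeline in the process.
Registry& globalRegistry() {
    static Registry registry;
    return registry;
}

// Mimics a static registrar object: its constructor runs on every load.
struct Registrar {
    explicit Registrar(const std::string& name)
        : added(globalRegistry().registerCreator(name)) {}
    bool added;
};

// Simulates one pipeline run that loads the plugin library once.
bool runPipelineOnce() {
    Registrar registrar("DemoYoloLayer");  // hypothetical plugin name
    return registrar.added;
}
```

Calling `runPipelineOnce()` twice registers the creator on the first call and skips it on the second, so the registry ends up with a single entry.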
