I’m porting some multithreaded code from Python to C/C++ and I’m running into an issue. In the main thread, the code captures an image from a VideoSource object and then passes it to one or more worker threads, each doing a different kind of processing on the image. The first thread I’m porting runs the captured image through a PoseNet object for pose detection. The Python code works fine, but in C/C++ I get a segmentation fault whenever I call Process on the PoseNet object.
I ran it through gdb and got the following stack trace from the crash:
Thread 14 "uatu_george" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f6111fa10 (LWP 12272)]
0x0000007fb7da0e2c in tensorNet::PROFILER_BEGIN(profilerQuery) ()
   from /usr/local/lib/libjetson-inference.so
(gdb) where
#0  0x0000007fb7da0e2c in tensorNet::PROFILER_BEGIN(profilerQuery) ()
    at /usr/local/lib/libjetson-inference.so
#1  0x0000007fb7dacab0 in poseNet::Process(void*, unsigned int, unsigned int, imageFormat, std::vector<poseNet::ObjectPose, std::allocator<poseNet::ObjectPose> >&, unsigned int) ()
    at /usr/local/lib/libjetson-inference.so
#2  0x0000005555562fd0 in poseNet::Process(float4*, unsigned int, unsigned int, std::vector<poseNet::ObjectPose, std::allocator<poseNet::ObjectPose> >&, unsigned int) (this=0x0, image=0x100e60000, width=1280, height=720, poses=std::vector of length 0, capacity 0, overlay=4)
    at /usr/local/include/jetson-inference/poseNet.h:230
#3  0x0000005555562874 in clsWorkerPoseDetection::execute() (this=0x55555e2060 )
    at /home/marc/src/george/device_jetson/cpp/src/workerPoseDetection.cpp:111
#4  0x00000055555623c8 in baseEntry(void*) (arg=0x55555e2060 )
    at /home/marc/src/george/device_jetson/cpp/src/workerBase.cpp:5
#5  0x0000007fb7f6b088 in start_thread (arg=0x7fffffeb8f)
    at pthread_create.c:463
#6  0x0000007fb77770cc in thread_start ()
    at …/sysdeps/unix/sysv/linux/aarch64/clone.S:78
Is there something special I have to do to make this work in multiple threads? I thought perhaps the float4* wasn’t safe to use between threads so I tried cudaMemcpy() to copy it into a buffer owned by the thread object but that didn’t work either.
Hi @mjasner, that call to PROFILER_BEGIN() is the first thing really run inside poseNet::Process(). Are you sure your poseNet object is valid? Frame #2 of your trace shows this=0x0, and I don’t see where the pNet pointer is created.
Well it turns out I’m apparently an idiot. The pNet pointer is allocated in the constructor but sure enough it’s not working and so the pNet pointer is NULL. I’ll fix that and hide my head in shame at missing the obvious… sorry about that. Thanks for all the help.
I don’t see where that sets the pNet pointer. It sets a local variable (net) that goes out of scope after the constructor ends. Regardless, these multi-threaded issues are difficult to debug just from looking at the source. I would recommend stripping things down: start by running everything single-threaded, then slowly introduce the threaded aspects, or add additional logging to try to detect the issues.
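For anyone who lands here with the same crash, the mistake described above can be sketched roughly as follows. This is only an illustration, not the poster’s actual code: FakeNet is a hypothetical stand-in for poseNet so that the sketch builds without jetson-inference, and BuggyWorker, FixedWorker, ready() and checkWorkers() are made-up names.

```cpp
#include <cassert>
#include <cstdio>
#include <memory>

// Hypothetical stand-in for poseNet; the real Process() takes an image,
// dimensions, a pose vector and overlay flags.
struct FakeNet {
    static FakeNet* Create() { return new FakeNet(); }
    bool Process() { return true; }
};

// Buggy pattern: the constructor fills a local variable, so the member
// pointer is never set and stays null.
class BuggyWorker {
public:
    BuggyWorker() {
        FakeNet* net = FakeNet::Create();  // local, NOT the member pNet
        delete net;                        // freed here only to keep the sketch leak-free
    }
    bool ready() const { return pNet != nullptr; }
private:
    FakeNet* pNet = nullptr;
};

// Fixed pattern: initialise the member itself and check it before use.
class FixedWorker {
public:
    FixedWorker() : pNet(FakeNet::Create()) {}
    bool ready() const { return pNet != nullptr; }
    bool execute() {
        if (!pNet) {
            std::fprintf(stderr, "network was not created, skipping Process()\n");
            return false;
        }
        return pNet->Process();
    }
private:
    std::unique_ptr<FakeNet> pNet;
};

// Small self-check that can be called from main().
inline void checkWorkers() {
    BuggyWorker buggy;
    FixedWorker fixed;
    assert(!buggy.ready());  // member was never assigned
    assert(fixed.ready());
    assert(fixed.execute());
}
```

Calling a member function through the null pointer in BuggyWorker is undefined behaviour, which is consistent with the this=0x0 seen in frame #2 of the trace. A guard like the one in execute() turns that into a logged error instead of a segfault.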
Yep, when I moved things into the class structure I made copy/paste mistakes… Sorry. When I was debugging I checked EVERY pointer object apparently EXCEPT pNet. shakes head sadly
OK, no worries at all! A couple of other things I noticed:
- You are processing imgBuf with poseNet, but presumably that is a blank image buffer because the input frame is never copied into it. I think poseNet should be able to process the input frame directly, without needing imgBuf.
- The frames that come from videoSource are stored in a ring buffer, so they get re-used. If your processing takes too long, videoSource can begin overwriting a frame while you are still using it (IIRC the default number of buffers is 4). If necessary, you can increase the number of frames in the ring buffer in the videoOptions struct.
- I think overlayFlags = poseNet::OverlayFlagsFromStr("overlay links,keypoints"); should be overlayFlags = poseNet::OverlayFlagsFromStr("links,keypoints"); instead.
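The ring buffer point can be illustrated with a small CPU-only simulation. It is a sketch under stated assumptions: FakeRingSource and staleRead are hypothetical names, each frame is a single int holding its frame index, and none of this is the real videoSource API (real frames live in CUDA-mapped memory).

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a capture source that recycles N frame buffers.
template <std::size_t N>
class FakeRingSource {
public:
    // Returns a pointer into an internal slot; the slot is reused every N captures.
    int* Capture() {
        int* slot = &slots_[next_ % N];
        *slot = static_cast<int>(next_);  // "frame contents" = frame index
        ++next_;
        return slot;
    }
private:
    std::array<int, N> slots_{};
    std::size_t next_ = 0;
};

// Holds on to frame 0, lets `extraCaptures` more frames arrive, then reads
// through the held pointer, as a slow worker thread would.
template <std::size_t N>
int staleRead(std::size_t extraCaptures) {
    FakeRingSource<N> src;
    int* held = src.Capture();  // frame 0
    for (std::size_t i = 0; i < extraCaptures; ++i) {
        src.Capture();
    }
    return *held;
}

// Small self-check that can be called from main().
inline void checkRing() {
    assert(staleRead<4>(3) == 0);  // frame 0 still intact after 3 more captures
    assert(staleRead<4>(4) == 4);  // the 4th extra capture reused slot 0
    assert(staleRead<8>(4) == 0);  // a larger ring keeps the frame alive longer
}
```

With 4 buffers, the held frame survives three further captures and is overwritten by the fourth. A larger ring buys the worker more time. Copying the frame into a buffer owned by the worker removes the dependency on the ring entirely.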
Thanks for the tips. The imgBuf thing is because initially I thought there was some issue processing the float4* and so I was going to copy the float4 into it. I realized that wasn’t the issue so stopped there. I have cleanup work to do.