Pthreads and cudaMalloc()

Hello,

I am using the pthread library to implement multithreading.

I am in the process of moving my functions that use detectNet to threads so I can handle network communication tasks in the background. The threads do not need to share memory allocated by cudaMalloc().

In the process of doing so, I am receiving a segmentation fault on cudaMalloc().

I’ve read a bit online and found that cudaMalloc() is thread safe.

Below is a snippet showing the cudaMalloc() call. Sorry if this is a naive question, but should I be doing something different to avoid the segmentation fault? The function works just fine when I do not use threads.

In this snippet, I am allocating the memory during initialization so that I can get performance benefits.

uchar3 * img_buffer;
int capture_width = 1280;
int capture_height = 720;
cudaError_t cuda_rtn;

cuda_rtn = cudaMalloc((void**) &img_buffer, (size_t) capture_width * sizeof(uchar3) * capture_height);

Any advice is appreciated, thanks.

Up.

I think this is a stack issue but I’ve checked the stack size and it’s 8 megabytes. I have a bunch of other variables being initialized so it may be possible I’m going over the limit but I would be surprised.

Thanks.

Hi,

Would you mind sharing a simple reproducible source so we can check it directly?

Thanks.

@AastaLLL
Hi,
I can share the source code. Do you need the cmake file too? I’ll make a file with code up to the point where it’s seg faulting.

Thank you!

Hi @AastaLLL ,

I have sent you a message with a project reproducing the result.

Thank you!

Hi,

Confirmed that we have received the source code.
We will try to reproduce this internally and share more information with you later.

Thanks.

Hi @AastaLLL ,

Thank you, sounds good!

Hi,

We have found the issue that causes this segmentation fault.

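Please note that the main process must stay alive while the thread is accessing the GPU. pthread_create() returns immediately, so without a join, main() can return and end the process while the thread is still running.
For example, we can run your sample without the segmentation fault after adding the join call.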

rc = pthread_create(...);

cout << rc << endl;

pthread_join(...);

return 0;

Thanks.

Hi @AastaLLL

Sorry for taking up your time with this simple mistake.

I should’ve known pthread_create() won’t wait for the thread to finish.

Thank you for investigating.