Segmentation fault when using cudaMemcpy


Today I encountered a very strange error. Here is the description of my program:

Hardware: Three GTX 280 cards, quad-core Xeon, 8 GB RAM.

Software: Three threads, each running on its own card. The program launches 1000 jobs on each thread. Each job copies an image to its assigned GPU with cudaMemcpy.
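To make the setup concrete, here is a minimal sketch of the pattern described above. It is not the actual program: the function names, the placeholder image, and the use of std::thread (which assumes a C++11 toolchain) are all made up for illustration. Each worker binds to its GPU with cudaSetDevice before any other CUDA call and allocates the device buffer it copies into.

```cuda
#include <cuda_runtime.h>
#include <cstddef>
#include <cstdio>
#include <thread>
#include <vector>

// Hypothetical per-thread job: bind to one GPU, then copy one image to it.
// Returns true only if every CUDA call succeeded.
static bool copy_image_to_device(int device, const unsigned char* host_img,
                                 std::size_t bytes)
{
    // Select the GPU for this host thread before any other CUDA call.
    cudaError_t err = cudaSetDevice(device);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "device %d: %s\n", device, cudaGetErrorString(err));
        return false;
    }

    // The device buffer is allocated by the same thread that copies into it.
    void* dev_img = nullptr;
    err = cudaMalloc(&dev_img, bytes);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "device %d: %s\n", device, cudaGetErrorString(err));
        return false;
    }

    err = cudaMemcpy(dev_img, host_img, bytes, cudaMemcpyHostToDevice);
    if (err != cudaSuccess)
        std::fprintf(stderr, "device %d: %s\n", device, cudaGetErrorString(err));

    cudaFree(dev_img);
    return err == cudaSuccess;
}

// Starts one host thread per visible GPU; returns how many copies succeeded.
static int copy_on_all_devices(std::size_t bytes)
{
    int device_count = 0;
    if (cudaGetDeviceCount(&device_count) != cudaSuccess)
        device_count = 0;  // treat "no usable device" as zero workers

    std::vector<unsigned char> image(bytes, 0x7f);  // placeholder image data
    std::vector<char> ok(device_count, 0);          // one result slot per thread
    std::vector<std::thread> workers;

    for (int d = 0; d < device_count; ++d)
        workers.emplace_back([&, d] {
            ok[d] = copy_image_to_device(d, image.data(), image.size());
        });
    for (auto& w : workers)
        w.join();

    int succeeded = 0;
    for (char flag : ok)
        succeeded += flag;
    return succeeded;
}
```

On a machine like the one above, copy_on_all_devices(640 * 480) would be expected to return 3 if every copy goes through. Checking the cudaError_t of each call also shows which step fails when one does.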

Observations: The first CPU thread works fine, but when the second thread reaches its cudaMemcpy, the application crashes with a segmentation fault most of the time. Here is the gdb trace:

[codebox]Starting program: /data/home/smekens/workspace/sandbox/bin/mx4_prepare2 test.xconf desc

[Thread debugging using libthread_db enabled]

[New Thread 0x7fadd1ade6f0 (LWP 9729)]

[New Thread 0x41925950 (LWP 9730)]

[New Thread 0x42126950 (LWP 9731)]

[New Thread 0x42927950 (LWP 9732)]

[New Thread 0x43128950 (LWP 9733)]

[New Thread 0x43929950 (LWP 9734)]

[New Thread 0x4412a950 (LWP 9735)]

[New Thread 0x4492b950 (LWP 9736)]

Device = 0

Device = 1

Notice : (Kernel2::search) kernel_0 run task = IMG2DESC (0)

Program received signal SIGSEGV, Segmentation fault.

[Switching to Thread 0x42927950 (LWP 9732)]

0x00007fadcfd7c6f5 in ?? () from /usr/lib/

(gdb) bt

#0 0x00007fadcfd7c6f5 in ?? () from /usr/lib/

#1 0x00007fadcfd69a8c in ?? () from /usr/lib/

#2 0x00007fadcfd615b8 in ?? () from /usr/lib/

#3 0x00007fadcfb288d4 in ?? () from /data/home/smekens/local/cuda/lib/

#4 0x00007fadcfb161a7 in cudaMemcpy () from /data/home/smekens/local/cuda/lib/

#5 0x000000000046f059 in pirix_cuda_img2dsc (obsidian=0x5e5e6c0, src=0x7082bf0, det_thresh=200, desc_T=0.00200000009) at cuda/

#6 0x000000000042f03a in pirix_img2dsc (pirix=0x2f6d520, src=0x7082bf0) at pirix.c:188

#7 0x000000000040b21b in mx4::Kernel2::runTask (this=0x2f68c20, type=, ptr=) at core/kernel2.cpp:183

#8 0x000000000040e1e0 in kernelMainThread (t=0x2f68c20) at core/dispatcher.cpp:24

#9 0x00007fadd01e4fc7 in start_thread () from /lib/

#10 0x00007fadcec2c5ad in clone () from /lib/

#11 0x0000000000000000 in ?? ()

[/codebox]



Should I put a mutex around the cudaMemcpy to prevent this crash from happening?

Have I forgotten to do something important for a multithreaded application?
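For reference, the mutex idea would look roughly like the sketch below. The names g_copy_mutex and locked_memcpy are hypothetical, std::mutex assumes a C++11 toolchain, and I do not know whether this would actually prevent the crash.

```cuda
#include <cuda_runtime.h>
#include <cstddef>
#include <mutex>

// Hypothetical process-wide lock shared by all worker threads.
static std::mutex g_copy_mutex;

// Hypothetical wrapper: only one host thread at a time can be inside cudaMemcpy.
static cudaError_t locked_memcpy(void* dst, const void* src, std::size_t bytes,
                                 cudaMemcpyKind kind)
{
    std::lock_guard<std::mutex> lock(g_copy_mutex);  // released on return
    return cudaMemcpy(dst, src, bytes, kind);
}
```

Each worker would then call locked_memcpy(dev_img, host_img, bytes, cudaMemcpyHostToDevice) instead of cudaMemcpy. The drawback is that copies to the three cards are serialized, so they can no longer overlap.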

I hope this problem will be fixed soon.