My first test on CUDA and some questions sync, thread with CUDA

soloman · November 13, 2007, 3:02pm

Hi, I’m new to CUDA. Today I designed a test to check whether multiple threads or processes can share a single GPU. Once the number of threads reaches 8 (that is, 8 threads sharing one GPU device), it becomes very unstable: sometimes the memory copy from device back to host fails, and sometimes it hangs my computer!

The same problem happens when I run multiple processes that use CUDA to do some calculations. Of course, I have only one GPU card!

I wrote up my test in a Word document, which describes the test design and the problematic output. Please read it if you have time and are willing to help me :-) Thanks a lot!

Can anybody provide some suggestions on this usage?

My system is a notebook with an 8600GM, 256MB of video memory, running Red Hat Enterprise Linux 5.
CUDATest.doc (208 KB)

soloman · November 13, 2007, 3:03pm

…

seb · November 13, 2007, 3:21pm

To make sure the global function (i.e. the kernel) has finished, you can call cudaThreadSynchronize() on the host. This function returns once the device has finished all outstanding computations.
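A minimal sketch of that pattern, assuming a hypothetical kernel named scaleKernel and an arbitrary array size (neither comes from the original test). The launch returns to the host right away, and the synchronize call is what actually waits for the device:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, for illustration only: doubles each element.
__global__ void scaleKernel(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] *= 2.0f;
}

int main()
{
    const int n = 1024;
    float host[n];
    for (int i = 0; i < n; ++i)
        host[i] = (float)i;

    float *dev = NULL;
    if (cudaMalloc((void **)&dev, n * sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }
    cudaMemcpy(dev, host, n * sizeof(float), cudaMemcpyHostToDevice);

    // The launch is asynchronous: control returns to the host immediately.
    scaleKernel<<<(n + 255) / 256, 256>>>(dev, n);

    // Block the host until the device has finished all outstanding work.
    // Later toolkits name this call cudaDeviceSynchronize().
    cudaError_t err = cudaThreadSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        cudaFree(dev);
        return 1;
    }

    // Check every return code; a failed copy back is one of the reported symptoms.
    err = cudaMemcpy(host, dev, n * sizeof(float), cudaMemcpyDeviceToHost);
    if (err != cudaSuccess)
        fprintf(stderr, "copy back failed: %s\n", cudaGetErrorString(err));
    else
        printf("host[10] = %f\n", host[10]);

    cudaFree(dev);
    return err == cudaSuccess ? 0 : 1;
}
```

Checking the value returned by the synchronize call is also a convenient way to catch errors raised by the kernel itself.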

When using one device in a multithreaded program, you have to be careful if you want to share device memory references across threads. It has been reported that this doesn’t work, so the best way to handle it is to dedicate a single thread to all GPU-related functions.
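One way to arrange that is a small job queue served by a single worker thread. The sketch below is only an illustration under stated assumptions: GpuWorker is a hypothetical helper class, not part of CUDA, and it relies on C++11 threading, which requires a much newer toolkit and compiler than the one discussed in this thread. Eight host threads submit jobs, but every CUDA call runs on the one worker thread:

```cuda
#include <condition_variable>
#include <cstdio>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: a single worker thread that executes every GPU job.
class GpuWorker {
public:
    GpuWorker() : stop_(false), thread_(&GpuWorker::run, this) {}

    ~GpuWorker()
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stop_ = true;
        }
        cv_.notify_one();
        thread_.join();  // the worker drains the remaining jobs first
    }

    // Called from any host thread; the job itself runs on the worker thread.
    void submit(std::function<void()> job)
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push(std::move(job));
        }
        cv_.notify_one();
    }

private:
    void run()
    {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stop_ || !jobs_.empty(); });
                if (jobs_.empty())
                    return;  // stop requested and nothing left to do
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();  // all CUDA calls happen here, on this one thread
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void()> > jobs_;
    bool stop_;
    std::thread thread_;  // declared last so it starts after the other members
};

int main()
{
    GpuWorker gpu;
    std::vector<std::thread> producers;
    for (int t = 0; t < 8; ++t) {
        producers.emplace_back([&gpu, t] {
            gpu.submit([t] {
                float *dev = NULL;
                cudaError_t err = cudaMalloc((void **)&dev, 1024 * sizeof(float));
                if (err == cudaSuccess) {
                    cudaMemset(dev, 0, 1024 * sizeof(float));
                    err = cudaFree(dev);
                }
                printf("job from thread %d: %s\n", t, cudaGetErrorString(err));
            });
        });
    }
    for (std::size_t i = 0; i < producers.size(); ++i)
        producers[i].join();
    return 0;  // ~GpuWorker runs any queued jobs, then joins the worker
}
```

Because the jobs run one after another, this also matches the idea of feeding the GPU work sequentially rather than from several threads at once.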

I have no experience with multiprocess GPU usage - someone else might be better suited to answer those questions.

soloman · November 13, 2007, 3:26pm

Thanks very much, but I still have problems when running multiple processes! For example, I have a program named syncTest, which uses CUDA to do calculations, and I wrote a script like this:

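# truncate the log file
> output
# start 5 copies of syncTest in the background, all appending to the same log
for ((i=1;i<=5;i++)); do
  ./syncTest >> output 2>&1 &
done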

Very strange: as the number of processes increases, the system becomes unstable, the CPU gets very busy, and sometimes the result is wrong!

If it helps, I can provide my code.

soloman · November 13, 2007, 3:55pm

Conclusions and Questions

From this test, I learned:

1. All calls from the host to the device are asynchronous, except the memory copy functions; this is what makes parallel computing possible.
2. The memory available on the video card is very important. You should check it, or change the algorithm that uses CUDA, to avoid huge or unmanaged memory usage.
3. It’s better to feed the GPU jobs one after another, not in parallel.

I still have some questions:

1. Is there a way to know how much memory is available on the device?
2. Is there a way to just check whether the device job has finished, without blocking?
3. Why can’t multiple processes that use CUDA run at the same time? If they can’t, how can I avoid the problem?
4. Why does it cost so much CPU time to run multiple processes that use CUDA?
5. Is there a better way to run CUDA code in multiple threads that share one GPU? Or can the GPU only be used by one task after another?

AndreiB · November 13, 2007, 5:33pm

The GPU cannot run several kernels in parallel; they are serialized before starting.

Answers:

1. Yes. The Programming Manual contains the answer.
2. No.
3. They can, or at least they should.
4. This is probably a limitation of CUDA 1.0, caused by some synchronization issues.
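For readers on later toolkits, the first two questions can be sketched with the runtime API, assuming a toolkit that provides cudaMemGetInfo and events (neither is confirmed for the CUDA 1.0 release discussed here). busyKernel is a hypothetical kernel that only keeps the device occupied long enough for the polling loop to be meaningful:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, for illustration only: burns some time per element.
__global__ void busyKernel(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float v = data[i];
        for (int k = 0; k < 10000; ++k)
            v = v * 1.0001f + 0.5f;
        data[i] = v;
    }
}

int main()
{
    // Question 1: how much device memory is free right now?
    size_t freeBytes = 0, totalBytes = 0;
    if (cudaMemGetInfo(&freeBytes, &totalBytes) != cudaSuccess) {
        fprintf(stderr, "cudaMemGetInfo failed\n");
        return 1;
    }
    printf("free: %lu of %lu bytes\n",
           (unsigned long)freeBytes, (unsigned long)totalBytes);

    const int n = 1 << 16;
    float *dev = NULL;
    if (cudaMalloc((void **)&dev, n * sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }
    cudaMemset(dev, 0, n * sizeof(float));

    // Question 2: record an event after the kernel, then poll it.
    cudaEvent_t done;
    cudaEventCreate(&done);
    busyKernel<<<(n + 255) / 256, 256>>>(dev, n);
    cudaEventRecord(done, 0);

    unsigned long polls = 0;
    cudaError_t status;
    while ((status = cudaEventQuery(done)) == cudaErrorNotReady) {
        ++polls;  // the host is free to do other work here instead of blocking
    }
    if (status != cudaSuccess)
        fprintf(stderr, "device error: %s\n", cudaGetErrorString(status));
    else
        printf("kernel finished after %lu polls\n", polls);

    cudaEventDestroy(done);
    cudaFree(dev);
    return status == cudaSuccess ? 0 : 1;
}
```

The free-memory figure is only a snapshot: when several processes share one card, another process may allocate between the query and the next cudaMalloc, so the allocation result still has to be checked.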
