Does OpenMP slow down CUDA?

lrdikysa · May 25, 2019, 12:06pm

Hello. I am using MVAPICH2 2.3 (built with --enable-cuda), CUDA 9.0, and a GTX 680.

My code:

***

double time = MPI_Wtime();
cudaMemcpy(*****);
cout << " time = " << MPI_Wtime() - time << endl;

I compile with nvcc + mpicxx using the flags -fopenmp -O3, and I set the environment variables OMP_NUM_THREADS=K and MV2_USE_CUDA=1.

Results:
K = 1 => time(1)
K = 2 => time(2) > time(1)
…
K = 8 => time(8) > time(7) > … > time(1)

I get a similar result when using cudaMemcpyAsync.

Why is this happening, and how can I avoid it?

tera · May 27, 2019, 10:28am

That might depend on ***, *****, the hardware you are running this on, and how you launch it / what else is running on the node.
