Why is multi-GPU slower than a single GPU?

ifly · September 13, 2011, 3:46am

When I train a neural network using cuBLAS on Ubuntu 10.10 with CUDA 4.0,
I find it strange that using only one GPU (a single GTX 590) is faster
than using two GPUs (two GTX 590s in one PC) with the same configuration.
Why are two GPUs, with more cores, beaten by a single GPU?

ARom_nsk · September 13, 2011, 9:37am

Hi!

It depends on your implementation. Could you provide us with your code?

See also http://forums.nvidia.com/index.php?showtopic=197764

ifly · September 14, 2011, 3:32am

Thanks for your reply. My code is very long, but the main part is as follows.

/////
void GPU_forward_bunch(size_t frames_this_bunch, QN_MLP_BunchFl3 *mlp)
{
    /// first layer
    cublasSgemm('T', 'N', frames_this_bunch, mlp->n_hidden, mlp->n_input + 1,
                1.0f, d_input, mlp->n_input + 1, d_in2hid, mlp->n_input + 1,
                0.0f, d_hidden, frames_this_bunch);

    int grid_size = (frames_this_bunch * (1 + mlp->n_hidden)) / 256 + 1;
    GPU_sigmoid<<<grid_size, 256>>>(d_hidden, frames_this_bunch * mlp->n_hidden,
                                    frames_this_bunch * (mlp->n_hidden + 1));

    /// second layer
    cublasSgemm('N', 'N', frames_this_bunch, mlp->n_output, mlp->n_hidden + 1,
                1.0f, d_hidden, frames_this_bunch, d_hid2out, mlp->n_hidden + 1,
                0.0f, d_output, frames_this_bunch);

    grid_size = frames_this_bunch / 256 + 1;
    GPU_softmax<<<grid_size, 256>>>(d_output, mlp->n_output, frames_this_bunch);
}
////////
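
A side note on the grid-size arithmetic in the snippet above, separate from the multi-GPU question: `n / 256 + 1` always adds one block, so an element count that is an exact multiple of 256 gets one block more than it needs. That is harmless only if the kernels bounds-check their thread index, which the snippet does not show. A minimal sketch of the usual ceiling-division alternative follows; `blocks_for` is a hypothetical helper name, not something from the posted code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of the posted code): smallest number of
   blocks such that blocks * threads_per_block >= n_elements. */
static size_t blocks_for(size_t n_elements, size_t threads_per_block)
{
    assert(threads_per_block > 0);  /* guard against division by zero */
    return (n_elements + threads_per_block - 1) / threads_per_block;
}
```

With this, `grid_size = blocks_for(frames_this_bunch, 256)` would replace `frames_this_bunch / 256 + 1`. The helper returns 0 for an empty input, and a kernel launch with a zero-sized grid is rejected as an invalid configuration, so a caller would need to skip the launch in that case.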
