GPU is Slower than CPU!

Hello,

I have 1000 equations like the following one, computed on the CPU in C++:

[codebox]h_F[j] = (delta[j] - h*slip[j]) - pre_delta[j] - h*pre_slip[j];[/codebox]

I was evaluating these equations on the host and transferring the vector h_F to the GPU at each time step. To reduce the CPU-GPU communication traffic, I want to evaluate all 1000 equations entirely on the GPU using CUBLAS. I wrote the following code to implement the above equation:

[codebox]cublasScopy(N_detail, d_delta, 1, d_F1, 1);            // d_F1 = delta
cublasSaxpy(N_detail, -1, d_pre_delta, 1, d_F1, 1);    // d_F1 = delta - pre_delta
cublasScopy(N_detail, d_slip, 1, d_RESULT1, 1);        // d_RESULT1 = slip
cublasSaxpy(N_detail, 1, d_pre_slip, 1, d_RESULT1, 1); // d_RESULT1 = slip + pre_slip
cublasSaxpy(N_detail, -h, d_RESULT1, 1, d_F1, 1);      // d_F1 = (delta - pre_delta) - h*(slip + pre_slip)[/codebox]

However, this code takes significantly more time to run on the GPU than its twin does on the CPU! I have a large set of data, so I expected this code to run faster on the GPU. Why does it take more time than the CPU version? Do you have any suggestions for making this GPU code more efficient?

I really need to reduce the computation time as much as possible, so please let me know your ideas or suggestions.

Thanks.

The quick answer is that your problem is dominated by host-device memory transfers and per-call overhead, not by math. Each of your arrays is only 1000 single-precision values (about 4 KB), and the code issues five separate CUBLAS operations, so the overhead of merely sending the data to the GPU and launching that work is more than the time the CPU takes to do the whole computation.

GPU computing wins when you have many complex math operations to perform on each piece of data, ideally keeping all the data resident on the device and sending very little back and forth to the CPU.

You may have a lot of data, but you don't have much work for the GPU to do on that data.
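That said, if the inputs can stay on the device (for example, because they are produced by an earlier GPU computation each time step), one easy improvement is to replace the five CUBLAS calls with a single fused kernel, so there is only one launch and each element is read and written once. Here is a minimal sketch, not a drop-in implementation; the names d_delta, d_pre_delta, d_slip, d_pre_slip, d_F1, N_detail, and h follow your post, and everything else (kernel name, launch configuration) is assumed:

[codebox]// Fused elementwise kernel:
// F[j] = (delta[j] - pre_delta[j]) - h * (slip[j] + pre_slip[j])
__global__ void computeF(const float *delta, const float *pre_delta,
                         const float *slip,  const float *pre_slip,
                         float h, float *F, int n)
{
    int j = blockIdx.x * blockDim.x + threadIdx.x;
    if (j < n)
        F[j] = (delta[j] - pre_delta[j]) - h * (slip[j] + pre_slip[j]);
}

// One kernel launch per time step instead of five CUBLAS calls:
int threads = 256;
int blocks  = (N_detail + threads - 1) / threads;
computeF<<<blocks, threads>>>(d_delta, d_pre_delta, d_slip, d_pre_slip,
                              h, d_F1, N_detail);[/codebox]

Even then, with only 1000 elements the arithmetic itself takes a few microseconds at most; launch overhead and any remaining host-device copies will still dominate. The real speedup only appears if the rest of your time-stepping loop also moves onto the GPU so the data never has to leave the device.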