FLOPS, Hz, cycles: am I missing something in my calculations?

kartik3vv · January 13, 2009, 7:02am

I want to do a matrix multiplication on a GPU, but before that, please clear up some supercomputing terms for me, or tell me if I'm missing anything.

I have a Xeon E 5470 system: dual socket, 4 cores per processor, at 3 GHz.

According to an AnandTech article, the E5472 achieves 0.3 flops/cycle with GCC at -O3 optimisation.

That makes 3 * 10^9 * 0.3 = 9 * 10^8 FLOPS (floating point operations per second).

Keeping this in mind, I ran a simple matrix multiplication program on the CPU.

For a 1000x1000 matrix:
this would require 3 * 10^3 * 10^3 (for the 3 matrices involved in the multiplication),
and 2 floating point instructions per operation, since an addition and a multiplication are involved.
This makes it 2 * 3 * 10^6 = 6 * 10^6.

So the total time it should have taken is 6 * 10^6 / (9 * 10^8) ≈ 0.007 sec.

But it actually takes 4.4 seconds with GCC at -O3 optimisation.

Am I missing something? Please help soon.

Thanks in advance.

I've attached my code.

ColinS · January 20, 2009, 12:36am

Hi. Yes, there is something missing. I'm not sure about the Xeon processor specifically, but on many processors a floating point multiplication takes considerably longer than an addition. Also, unless you specifically write your code to use all the cores in your machine, only one core is going to be used. The article may also have been talking about how many flops can be achieved by using SSE extensions; code that uses them is generally up to four times as fast as code written without them. There are many, many other things which affect CPU performance, which I won't go into :P
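
One more thing worth checking is the operation count itself. A naive N x N matrix multiplication computes N^2 output elements, and each one needs N multiplications and N additions, so the total is about 2 * N^3 floating point operations rather than something proportional to N^2. Below is a minimal sketch of that estimate in Python. The function names are made up for illustration, and the 3 GHz clock and 0.3 flops/cycle figures are simply the numbers quoted in the question, assumed to hold for this code on a single core.

```python
def naive_matmul_flops(n: int) -> int:
    """FLOPs for a naive (n x n) * (n x n) multiply: n^3 multiplies + n^3 adds."""
    return 2 * n ** 3


def sustained_flops(clock_hz: float, flops_per_cycle: float, cores: int = 1) -> float:
    """Rough sustained rate in FLOP/s, assuming perfect scaling across cores."""
    return clock_hz * flops_per_cycle * cores


def estimated_seconds(n: int, clock_hz: float, flops_per_cycle: float, cores: int = 1) -> float:
    """Estimated run time: total operations divided by the sustained rate."""
    return naive_matmul_flops(n) / sustained_flops(clock_hz, flops_per_cycle, cores)


if __name__ == "__main__":
    n = 1000
    # Assumption: 3 GHz and 0.3 flops/cycle, as quoted in the question,
    # with single-threaded code running on one core.
    t = estimated_seconds(n, clock_hz=3e9, flops_per_cycle=0.3)
    print(f"{naive_matmul_flops(n):.1e} FLOPs, about {t:.1f} s on one core")
```

With those assumptions the estimate comes out at roughly 2.2 seconds for N = 1000, which is the same order of magnitude as the 4.4 seconds measured. The remaining gap could plausibly come from memory access patterns and cache misses, but that is a guess without seeing the attached code.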
