Hello everyone,
I’m trying to measure the performance (GFLOP/s) of my vector addition. I have already found this:
float msecPerVectAdd = ms / nIter;                                      // average time of one vector addition, in ms
double gigaFlops = (numElements * 1.0e-9) / (msecPerVectAdd / 1000.0);  // one add (one FLOP) per element
NB:
ms = total execution time of all iterations (in ms)
nIter = number of iterations, used to get longer and more stable runs
numElements = the number of elements in each vector
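For reference, here is a minimal sketch of how these quantities fit together with CUDA event timing. The kernel name vectorAdd, the launch configuration, and the specific values of nIter and numElements are just placeholders for illustration, not my real setup:

#include <cstdio>
#include <cuda_runtime.h>

// Placeholder vector-addition kernel: one add (one FLOP) per element.
__global__ void vectorAdd(const float *a, const float *b, float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main()
{
    const int numElements = 1 << 24;   // assumed vector length
    const int nIter = 100;             // repeat to get a longer, more stable run
    size_t bytes = numElements * sizeof(float);

    float *a, *b, *c;
    cudaMalloc((void**)&a, bytes);
    cudaMalloc((void**)&b, bytes);
    cudaMalloc((void**)&c, bytes);
    cudaMemset(a, 0, bytes);           // contents don't matter for timing, but keep them defined
    cudaMemset(b, 0, bytes);

    dim3 block(256);
    dim3 grid((numElements + block.x - 1) / block.x);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    for (int it = 0; it < nIter; ++it)
        vectorAdd<<<grid, block>>>(a, b, c, numElements);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;                   // total execution time of all iterations, in ms
    cudaEventElapsedTime(&ms, start, stop);

    float msecPerVectAdd = ms / nIter; // average time of one vector addition, in ms
    double gigaFlops = (numElements * 1.0e-9) / (msecPerVectAdd / 1000.0);
    printf("%.3f ms per launch, %.2f GFLOP/s\n", msecPerVectAdd, gigaFlops);

    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}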
But I still want to be sure about it.
Your help is appreciated.
Dorra
Yes, your methodology should give sensible results. By that I mean it should be an accurate measure of the number of floating-point operations per second achieved by the code.
However, vector addition is not likely to be compute-bound, so you are effectively measuring memory bandwidth rather than the actual compute performance of the GPU you are running on.
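For example, the same timing data can be converted into an effective bandwidth number, which is the more meaningful metric for this kernel. A sketch, reusing the ms, nIter and numElements quantities from your post and assuming float data with two reads and one write per element:

// Sketch: effective memory bandwidth of the vector-add kernel, in GB/s.
// Assumes float data and two reads plus one write per element.
double effectiveBandwidthGBs(float ms, int nIter, int numElements)
{
    float msecPerVectAdd = ms / nIter;                       // average time of one vector addition, in ms
    double bytesMoved = 3.0 * numElements * sizeof(float);   // read a, read b, write c
    return (bytesMoved * 1.0e-9) / (msecPerVectAdd / 1000.0);
}

Comparing that number against the published peak memory bandwidth of your GPU tells you how close the kernel comes to the memory-bandwidth limit.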
You may want to understand the analysis methodology described here:
cuda - Nvidia Jetson Tx1 against jetson NANO (Benchmarking) - Stack Overflow
That is roughly one aspect of “roofline analysis”, which determines the limiting factor in your code. The performance “roofline” for this type of code is set by the memory bandwidth of the GPU, not by its compute performance.
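As a rough illustration of that roofline bound (the 900 GB/s figure below is just an assumed peak memory bandwidth, not a measurement from your GPU):

// Sketch: roofline-style upper bound on GFLOP/s for vector addition,
// using an ASSUMED peak memory bandwidth; substitute your own GPU's figure.
double peakBandwidthGBs = 900.0;                        // assumed, not measured
double flopsPerByte = 1.0 / (3.0 * sizeof(float));      // 1 add per 12 bytes moved
double maxGigaFlops = peakBandwidthGBs * flopsPerByte;  // ~75 GFLOP/s at 900 GB/s

Whatever GFLOP/s number you measure will sit at or below that ceiling, regardless of how high the GPU’s compute peak is.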