 # How to compute GFlops for GEMM BLAS

Hi All,
What is the formula for computing GFLOPS for GEMM? I have used the following formulas; please check them.

DGEMM and SGEMM = (2MNK) (timeInSec)/ (1024^3) // factor 2 : 1 mult + 1 addition
CGEMM and ZGEMM = (6MNK) (timeInSec)/ (1024^3) // factor 6 : 4 (complex mult) + 2 (complex addition)

Regards,
Sachin

The standard BLAS gemm operation is

```
C <- alpha * AB + beta*C
```

so off the top of my head, the total flop count for the scalar version should be M(2NK) + MN + 2MN = MN(2K+3). This means the throughput of the operation should be computed as

```
MN(2K+3) / (1000^3 * time in seconds)
```

for the result to be in Giga flop per second. So by my reckoning, your formulas are incorrect in several places.
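As a rough illustration of the formula above, here is a minimal Python sketch. The function names and the example sizes and timing are hypothetical, chosen only for illustration; the count MN(2K+3) and the 1000^3 divisor are taken from this post.

```python
def gemm_flops_estimate(m: int, n: int, k: int) -> int:
    """Flop count M*N*(2K+3) for C <- alpha*A*B + beta*C with real data."""
    return m * n * (2 * k + 3)


def gflops(flop_count: int, seconds: float) -> float:
    """Convert a flop count and an elapsed time into GFLOP/s (giga = 1000^3)."""
    return flop_count / (1000**3 * seconds)


if __name__ == "__main__":
    # Hypothetical run: M = N = K = 1000, measured time 0.5 s (made-up numbers).
    flops = gemm_flops_estimate(1000, 1000, 1000)
    print(f"{gflops(flops, 0.5):.3f} GFLOP/s")
```

Note that the divisor is 1000^3, not 1024^3, because "giga" in GFLOPS is a decimal prefix.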

EDITED for mixing up K and N in the original post.

Your point being? I posted what I believe to be the correct operations count for the scalar sgemm/dgemm case. Is there an error?

Let's do a detailed analysis for both scalar and complex inputs.

Count Total Operations

C <- alpha * AB + beta*C

Operations(AB) = MNK (mult) + MN(K-1) (add)

Operations(alpha*AB) = MNK (mult) + MN(K-1) (add) + MN (mult)

Operations(alpha*AB + beta*C) = MNK (mult) + MN(K-1) (add) + MN (mult) + MN (mult) + MN (add)

Total = MN(K+2) (mult) + MNK (add)
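The totals above can be sanity-checked by tallying operations one output element at a time. This is only a sketch: `count_gemm_ops` is a hypothetical helper that counts what a naive, unoptimized GEMM would perform, not what any particular BLAS library does internally.

```python
def count_gemm_ops(m: int, n: int, k: int) -> tuple[int, int]:
    """Tally multiplies and adds for a naive C <- alpha*A*B + beta*C."""
    mults = 0
    adds = 0
    for _ in range(m * n):   # one pass per element of C
        mults += k           # k products A[i][p] * B[p][j]
        adds += k - 1        # summing k products takes k-1 additions
        mults += 1           # scale the dot product by alpha
        mults += 1           # scale the old C entry by beta
        adds += 1            # add the two scaled terms
    return mults, adds


if __name__ == "__main__":
    m, n, k = 4, 5, 6        # arbitrary small sizes for the check
    mults, adds = count_gemm_ops(m, n, k)
    assert mults == m * n * (k + 2)   # MN(K+2) multiplies
    assert adds == m * n * k          # MNK additions
    print(mults, adds)
```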

GFLOPS for DGEMM and SGEMM

Total Operations = MN(K+2) + MNK = MN(2K+2)

GFLOPS = MN(2K+2) / (1000^3 * (timeInSec))

GFLOPS for CGEMM and ZGEMM

Total Operations = MN(K+2)*6 + MNK*2 = 8MNK + 12MN // 6 flops per complex mult, 2 flops per complex addition

GFLOPS = (8MNK + 12MN) / (1000^3 * (timeInSec))
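Both results can be wrapped in one small helper. This is a sketch under the counts derived in this post; the names `gemm_flops`, `gemm_gflops` and the `complex_data` flag are assumptions made for illustration. It assumes a complex multiply costs 6 real flops (4 multiplies + 2 adds) and a complex add costs 2.

```python
def gemm_flops(m: int, n: int, k: int, complex_data: bool = False) -> int:
    """Flop count for C <- alpha*A*B + beta*C using the counts in this post."""
    mults = m * n * (k + 2)   # MN(K+2) multiplies
    adds = m * n * k          # MNK additions
    if complex_data:
        # Assumption: 6 real flops per complex multiply, 2 per complex add.
        return 6 * mults + 2 * adds   # = 8MNK + 12MN
    return mults + adds               # = MN(2K+2)


def gemm_gflops(m: int, n: int, k: int, seconds: float,
                complex_data: bool = False) -> float:
    """GFLOP/s for a GEMM call that took `seconds` (giga = 1000^3)."""
    return gemm_flops(m, n, k, complex_data) / (1000**3 * seconds)


if __name__ == "__main__":
    # Made-up timing: 0.25 s for a 512 x 512 x 512 problem.
    print(gemm_gflops(512, 512, 512, 0.25))                     # SGEMM/DGEMM
    print(gemm_gflops(512, 512, 512, 0.25, complex_data=True))  # CGEMM/ZGEMM
```

For large K the MN terms become negligible next to MNK, so these counts approach 2MNK for the real routines and 8MNK for the complex ones.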