How to compute GFlops for GEMM BLAS

Hi All,
What is the formula for computing GFLOPS for GEMM? I have used the following formulas; please
give your feedback.

DGEMM and SGEMM = (2MNK) (timeInSec) / (1024^3) // factor 2: 1 mult + 1 addition
CGEMM and ZGEMM = (6MNK) (timeInSec) / (1024^3) // factor 6: 4 (complex mult) + 2 (complex addition)

Regards,
Sachin

The standard BLAS gemm operation is

C <- alpha * AB + beta*C

so off the top of my head, the total flop count for the scalar version should be M(2NK) + MN + 2MN = MN(2K+3). This means the throughput of the operation should be computed as

MN(2K+3) / (1000^3 * time in seconds)

for the result to be in gigaflops per second (GFLOP/s). So by my reckoning, your formulas are incorrect in several places.
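
As a rough sketch of that formula in Python (the function names, matrix sizes and timing below are made up for illustration and do not come from any BLAS library):

```python
def gemm_flops_scalar(m, n, k):
    """Flop count for C <- alpha*A*B + beta*C using the MN(2K+3) estimate."""
    return m * n * (2 * k + 3)


def gflops(flop_count, time_in_sec):
    """Convert a flop count and a wall-clock time to GFLOP/s (giga = 1000^3)."""
    return flop_count / (1000**3 * time_in_sec)


# Hypothetical example: a 1000 x 1000 x 1000 dgemm that took 0.5 s.
flops = gemm_flops_scalar(1000, 1000, 1000)
print(f"{gflops(flops, 0.5):.3f} GFLOP/s")
```

Note that the time divides the flop count, and the scale factor is 1000^3 rather than 1024^3. For large K the 3MN term is small next to 2MNK, so reporting 2MNK / time is a common simplification.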

EDITED for mixing up K and N in the original post.

(a+bi)(c+di) = ac - bd + (ad + bc)i

Your point being? I posted what I believe to be the correct operations count for the scalar sgemm/dgemm case. Is there an error?

I am sorry, I meant to reply to the thread author.

Let's do a detailed analysis for both scalar and complex inputs.

Count Total Operations

C <- alpha * AB + beta*C

Operations(AB) = MNK (mult) + MN(K-1) (add)

Operations(alpha*AB) = MNK (mult) + MN(K-1) (add) + MN (mult)

Operations(alpha*AB + beta*C) = MNK (mult) + MN(K-1) (add) + MN (mult) + MN (mult) + MN (add)

Total = MN(K+2) (mult) + MNK (add)
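
One way to sanity-check these totals is to run a naive triple-loop GEMM and count the operations as they happen. This is only a sketch: `gemm_with_counts` is a hypothetical helper written for this check, it assumes K >= 1 and plain Python lists, and it says nothing about how an optimized BLAS actually schedules the work.

```python
def gemm_with_counts(alpha, a, b, beta, c):
    """Naive C <- alpha*A*B + beta*C that also counts multiplies and adds."""
    m, k, n = len(a), len(b), len(b[0])
    mults = 0
    adds = 0
    result = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            # Dot product of length k: k multiplies and k - 1 additions.
            acc = a[i][0] * b[0][j]
            mults += 1
            for p in range(1, k):
                acc += a[i][p] * b[p][j]
                mults += 1
                adds += 1
            # Scale by alpha, scale C by beta, add the two terms.
            result[i][j] = alpha * acc + beta * c[i][j]
            mults += 2
            adds += 1
    return result, mults, adds


m, n, k = 2, 3, 4
a = [[1.0] * k for _ in range(m)]
b = [[1.0] * n for _ in range(k)]
c = [[1.0] * n for _ in range(m)]
_, mults, adds = gemm_with_counts(2.0, a, b, 0.5, c)
# Compare against MN(K+2) multiplies and MNK additions.
print(mults == m * n * (k + 2), adds == m * n * k)
```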

GFLOPS for DGEMM and SGEMM

Total Operations = MN(2K+2)

GFLOPS = MN(2K+2) / (1000^3 * (timeInSec))

GFLOPS for CGEMM and ZGEMM

Total Operations = MN(K+2)*6 + MNK*2 = 8MNK + 12MN // 6 per complex mult, 2 per complex addition

GFLOPS = (8MNK + 12MN) / (1000^3 * (timeInSec))
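
The same counts can be put into a small Python helper. This is just a sketch under the assumptions above: `gemm_flops`, the 512-cubed problem size and the 0.25 s timing are all invented for illustration, and a complex multiply is assumed to cost 6 real flops and a complex add 2.

```python
def gemm_flops(m, n, k, is_complex=False):
    """Real flop count for C <- alpha*A*B + beta*C.

    Uses MN(K+2) multiplies and MNK additions; in the complex case each
    multiply is weighted as 6 real flops and each addition as 2.
    """
    mults = m * n * (k + 2)
    adds = m * n * k
    if is_complex:
        return 6 * mults + 2 * adds  # = 8MNK + 12MN
    return mults + adds              # = MN(2K+2)


time_in_sec = 0.25  # hypothetical measured time
for name, is_complex in (("DGEMM/SGEMM", False), ("CGEMM/ZGEMM", True)):
    flops = gemm_flops(512, 512, 512, is_complex)
    print(name, flops / (1000**3 * time_in_sec), "GFLOP/s")
```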

Waiting for your feedback.