# Floating point operations by nvprof

Reading the nvprof documentation, I see the following metrics:

| Metric Name | Description | Scope |
| --- | --- | --- |
| flop_count_sp | Number of single-precision floating-point operations executed by non-predicated threads (add, multiply, and multiply-accumulate). Each multiply-accumulate operation contributes 2 to the count. The count does not include special operations. | Multi-context |
| flop_count_sp_fma | Number of single-precision floating-point multiply-accumulate operations executed by non-predicated threads. Each multiply-accumulate operation contributes 1 to the count. | Multi-context |
| flop_count_sp_mul | Number of single-precision floating-point multiply operations executed by non-predicated threads. | Multi-context |
| flop_count_sp_special | Number of single-precision floating-point special operations executed by non-predicated threads. | |

For one kernel, I get the following numbers:

Floating Point Operations(Single Precision)      1150884804
Floating Point Operations(Single Precision Add)      150665905
Floating Point Operations(Single Precision FMA)      428604161
Floating Point Operation(Single Precision Mul)      143010575
Floating Point Operations(Single Precision Special)      23763835

I don’t understand the first metric, Single Precision. What is that exactly? What is accounted there that is not accounted in the rest?

Any thoughts?

Conversions integer<->floating point, for example. Also floating point comparisons, possibly.

Based on the description, the SP counter doesn’t count specials. According to my calculations:

SP = Add + 2*FMA + Mul

However,

left side = 1150884804
right side = 1150884802

That difference of 2 looks like an error. Still, I don’t understand the purpose of that counter.
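For anyone who wants to repeat the check, here is a minimal Python sketch of the same sum. The variable names are just my labels for the nvprof lines quoted earlier in the thread, and the formula is only what the `flop_count_sp` description implies (FMA counted twice, specials excluded), not something confirmed by the profiler source.

```python
# Counter values copied from the nvprof output quoted earlier in the thread.
flop_count_sp = 1150884804      # Floating Point Operations(Single Precision)
flop_count_sp_add = 150665905   # ... (Single Precision Add)
flop_count_sp_fma = 428604161   # ... (Single Precision FMA)
flop_count_sp_mul = 143010575   # ... (Single Precision Mul)

# Per the metric description, each FMA contributes 2 to flop_count_sp
# and special operations are not included.
expected = flop_count_sp_add + 2 * flop_count_sp_fma + flop_count_sp_mul

print("reported  :", flop_count_sp)
print("recomputed:", expected)
print("difference:", flop_count_sp - expected)
```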

How does this minute difference (in the ppb range!) matter? The purpose of this synthetic counter is presumably as a convenience function for people who like to talk about application performance in terms of FLOPS. While I consider this questionable and quite meaningless, I still see papers published using such metrics.
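To put a number on "ppb range", a quick sketch of the relative discrepancy, using only the two totals quoted above (the variable names are mine):

```python
reported = 1150884804    # flop_count_sp as printed by nvprof
recomputed = 1150884802  # Add + 2*FMA + Mul from the individual counters

# Relative discrepancy, expressed in parts per billion.
relative = (reported - recomputed) / reported
ppb = relative * 1e9

print(f"discrepancy: {ppb:.2f} ppb")
```

That works out to a bit under 2 parts per billion of the reported total.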