Hello,
I’m wondering why nvprof reports separate min, max, and average values for the flop_count_xx metrics. The number of floating-point operations should be the same each time the kernel is called, no? I can understand that the kernel runtime may vary between calls, but I don’t see why the number of operations would vary.
Could you please clarify this?
Br,