The product itself is always 0.8 x 0.7 x 4 (because k=4) at each output location. Computed in fp16, with all the rounding that implies, that value comes out as 2.24023 rather than exactly 2.24.

At each step, that value is added to the running sum from the previous iterations. Once the running sum gets large relative to what fp16 can represent, the addition no longer behaves as you might expect: 8192 + 2.24023 does not give you 8194.24023. Between 8192 and 16384 the representable fp16 values are 8 apart, so the result rounds back to 8192 and the sum stops growing.

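This problem is due to the limited number of bits in the mantissa/significand, which applies to any modern floating-point number representation. fp16 has 11 significand bits (10 stored plus an implicit leading bit), so when two operands differ in magnitude by more than a factor of roughly 2^11 = 2048, the smaller one is lost entirely in the sum. How large a difference you can tolerate between the numbers being combined depends on the accuracy you expect.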
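
To see the stall concretely, here is a minimal Python sketch that emulates fp16 rounding on the CPU using the standard library's half-precision `struct` format. It is an illustration under stated assumptions, not a reproduction of the actual kernel: the helper names (`to_fp16`, `fp16_add`, `fp16_term`, `accumulate_until_stall`) are made up for this example, rounding is assumed to be round-to-nearest-even, and the k=4 products are assumed to be summed exactly and rounded to fp16 once.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 binary16 (fp16) value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def fp16_add(a: float, b: float) -> float:
    """Add two fp16 values and round the result back to fp16."""
    return to_fp16(to_fp16(a) + to_fp16(b))


def fp16_term(a: float = 0.8, b: float = 0.7, k: int = 4) -> float:
    # Assumption: inputs are rounded to fp16 first, the k identical
    # products are summed exactly, and the total is rounded once.
    return to_fp16(k * to_fp16(a) * to_fp16(b))


def accumulate_until_stall(term: float, max_steps: int = 100_000) -> float:
    """Keep adding term in fp16 until the running sum stops changing."""
    acc = 0.0
    for _ in range(max_steps):
        nxt = fp16_add(acc, term)
        if nxt == acc:  # the addend was rounded away entirely
            break
        acc = nxt
    return acc


term = fp16_term()
print("per-step value:", term)
print("8192 + term in fp16:", fp16_add(8192.0, term))
print("running sum stalls at:", accumulate_until_stall(term))
```

Under these assumptions the per-step value is 2.240234375, adding it to 8192 returns 8192, and the running sum stops growing once it reaches 8192. A real GPU kernel may round at different points, so the exact intermediate values can differ.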