I have some matrices of 1s and 0s that I want to multiply together and get an exact result. No element of the product will exceed what the float mantissa can represent exactly, i.e. we don't have to worry about overflow in that sense. I don't have any special determinism mode turned on; things are generally non-deterministic and set for speed. Is my result matrix guaranteed to be accurate, i.e. holding the correct integers as 32-bit floats?
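I can't offer any guarantees, but as long as no intermediate product or sum exceeds 2^24 = 16,777,216, the largest value up to which a 32-bit float represents every integer exactly, you should have predictable results. With 0/1 inputs, every partial sum is an integer no larger than the inner dimension of the multiply, so the order in which the terms are accumulated should not change the result.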
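To make the float limit concrete, here is a minimal CPU-side sketch in Python. It is not CUDA code. It emulates 32-bit float accumulation by rounding through `struct`, and the helper names `to_f32` and `add_ones_f32` are hypothetical, made up for this illustration.

```python
import struct

def to_f32(x: float) -> float:
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def add_ones_f32(start: int, count: int) -> float:
    """Add 1.0 to `start` `count` times, rounding to float32 after each add.

    This mimics accumulating 1*1 products in a float32 accumulator.
    """
    acc = to_f32(float(start))
    for _ in range(count):
        acc = to_f32(acc + 1.0)
    return acc

LIMIT = 2 ** 24  # 16,777,216

# Below the limit every step is exact.
print(add_ones_f32(LIMIT - 3, 3))  # 16777216.0

# At the limit, adding 1.0 no longer changes the accumulator.
print(add_ones_f32(LIMIT, 5))      # 16777216.0, not 16777221.0
```

So for 0/1 matrices the sums stay exact as long as the inner dimension is at most 2^24, whatever order the GPU happens to add the terms in.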
By the way, cuBLAS offers 8-bit integer multiplication with a 32-bit integer result (for example, `cublasGemmEx` with `CUDA_R_8I` inputs and a `CUDA_R_32I` output).
The 32-bit float path will not take advantage of Tensor Cores. The 8-bit integer path can. Using TF32 might be another option.
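On the TF32 option: TF32 keeps float32's 8-bit exponent but only 10 explicit mantissa bits for the inputs, so the question is whether the inputs survive that conversion. The sketch below is a rough CPU-side emulation in Python, assuming simple truncation of the low 13 mantissa bits (real hardware may round instead). The helper name `truncate_to_tf32` is hypothetical.

```python
import struct

def truncate_to_tf32(x: float) -> float:
    """Emulate TF32 input precision by truncation (an assumption).

    Keeps float32's sign bit, 8 exponent bits and top 10 mantissa bits,
    and clears the low 13 of the 23 mantissa bits.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits &= 0xFFFFE000
    return struct.unpack("<f", struct.pack("<I", bits))[0]

# 0 and 1 need no mantissa bits at all, so they pass through unchanged.
print(truncate_to_tf32(0.0), truncate_to_tf32(1.0))  # 0.0 1.0

# Larger integer inputs are not always safe: 2049 needs 11 mantissa bits.
print(truncate_to_tf32(2049.0))  # 2048.0
```

Since the inputs here are only 0 and 1, they are unchanged by the conversion. Whether the accumulated sums stay exact then depends on the accumulator precision, which is worth checking against the documentation for the GPU in use.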