When I use the fast math compiler option, what exactly are the changes the compiler makes to the code?
- Some math functions are replaced by their faster device functions. But these device functions have lower accuracy. Are they 24 bit?
- When using fast math in some of my kernels, I see a drop in register usage. Why?
- Does using fast math convert all the ops into 24-bit ops?
Thanks…
-use_fast_math basically takes the math functions that have a hardware intrinsic counterpart (like sinf) and replaces them with that intrinsic (like __sinf). The amount of error introduced differs from function to function. See the programming guide for the full list of error bounds and the input ranges over which they hold. Functions like sinf perform argument reduction steps while intrinsics like __sinf do not, which explains the reduced register usage.
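To see the difference described above for yourself, something like the following minimal sketch may help. It calls sinf and __sinf side by side on a few inputs and prints both results. The kernel name compare_sin, the file name and the sample inputs are made up for illustration, and error checking is reduced to a single check after the launch.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: evaluates the standard and the intrinsic sine per element.
__global__ void compare_sin(const float* x, float* accurate, float* fast, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        accurate[i] = sinf(x[i]);   // replaced by __sinf when built with -use_fast_math
        fast[i]     = __sinf(x[i]); // always the hardware intrinsic
    }
}

int main()
{
    const int n = 4;
    // Sample inputs chosen arbitrarily; larger magnitudes stress argument reduction.
    const float h_x[n] = {0.5f, 3.0f, 100.0f, 10000.0f};
    float h_acc[n], h_fast[n];

    float *d_x = nullptr, *d_acc = nullptr, *d_fast = nullptr;
    cudaMalloc(&d_x,    n * sizeof(float));
    cudaMalloc(&d_acc,  n * sizeof(float));
    cudaMalloc(&d_fast, n * sizeof(float));
    cudaMemcpy(d_x, h_x, n * sizeof(float), cudaMemcpyHostToDevice);

    compare_sin<<<1, n>>>(d_x, d_acc, d_fast, n);
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        printf("kernel launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // These blocking copies also wait for the kernel to finish.
    cudaMemcpy(h_acc,  d_acc,  n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaMemcpy(h_fast, d_fast, n * sizeof(float), cudaMemcpyDeviceToHost);

    for (int i = 0; i < n; ++i) {
        printf("x = %10.2f  sinf = % .8f  __sinf = % .8f  diff = % .3e\n",
               h_x[i], h_acc[i], h_fast[i], h_acc[i] - h_fast[i]);
    }

    cudaFree(d_x);
    cudaFree(d_acc);
    cudaFree(d_fast);
    return 0;
}
```

Build it once with "nvcc compare_sin.cu -o compare_sin" and once with -use_fast_math added. In the second build the two columns should match, since sinf is then compiled to the intrinsic as well. Adding -Xptxas -v to either build prints the per-kernel register count, so you can compare register usage directly.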