Thanks a lot.