There is a lengthy discussion about Blackwell integer performance here: