For my GT 650M card, CUDA-Z shows:
- Single-precision float performance: 374 Gflop/s
- Double-precision float performance: 26 Gflop/s
- 32-bit integer performance: 106 Giop/s
- 24-bit integer performance: 107 Giop/s
If I just want to do add and max calculations on short values, does that mean I should cast them to float to get the best performance?
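To make the question concrete, here is a minimal sketch of the two variants I have in mind. The function and kernel names (`add_max_int`, `add_max_float`, `kernel_int`, `kernel_float`, `time_ms`) are hypothetical, and the workload (`max(a + b, c)` per element) is only an assumed stand-in for the real calculation. Results are returned as `int` so that the sum of two shorts cannot overflow in either variant.

```cuda
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Variant A (hypothetical): integer arithmetic; short operands are widened to int.
__host__ __device__ inline int add_max_int(short a, short b, short c) {
    int sum = int(a) + int(b);
    return sum > int(c) ? sum : int(c);
}

// Variant B (hypothetical): cast to float, compute, cast back.
// Every 16-bit value and every sum of two of them is exactly representable in float.
__host__ __device__ inline int add_max_float(short a, short b, short c) {
    float sum = float(a) + float(b);
    return int(fmaxf(sum, float(c)));
}

__global__ void kernel_int(const short* a, const short* b, const short* c, int* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = add_max_int(a[i], b[i], c[i]);
}

__global__ void kernel_float(const short* a, const short* b, const short* c, int* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = add_max_float(a[i], b[i], c[i]);
}

typedef void (*KernelFn)(const short*, const short*, const short*, int*, int);

// Time a single launch with CUDA events (error checking omitted for brevity).
float time_ms(KernelFn k, const short* a, const short* b, const short* c, int* out, int n) {
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    cudaEventRecord(start);
    k<<<blocks, threads>>>(a, b, c, out, n);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);
    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms;
}

int main() {
    const int n = 1 << 24;  // assumed problem size
    std::vector<short> h(n);
    for (int i = 0; i < n; ++i) h[i] = short(i % 1000);

    short *a, *b, *c;
    int* out;
    cudaMalloc(&a, n * sizeof(short));
    cudaMalloc(&b, n * sizeof(short));
    cudaMalloc(&c, n * sizeof(short));
    cudaMalloc(&out, n * sizeof(int));
    cudaMemcpy(a, h.data(), n * sizeof(short), cudaMemcpyHostToDevice);
    cudaMemcpy(b, h.data(), n * sizeof(short), cudaMemcpyHostToDevice);
    cudaMemcpy(c, h.data(), n * sizeof(short), cudaMemcpyHostToDevice);

    time_ms(kernel_int, a, b, c, out, n);  // warm-up launch, result discarded
    printf("int   variant: %.3f ms\n", time_ms(kernel_int, a, b, c, out, n));
    printf("float variant: %.3f ms\n", time_ms(kernel_float, a, b, c, out, n));

    cudaFree(a);
    cudaFree(b);
    cudaFree(c);
    cudaFree(out);
    return 0;
}
```

I realise a kernel this simple does very little arithmetic per byte loaded, so the timings may reflect memory bandwidth more than the Gflop/s and Giop/s figures above.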
Regards,
zlf