Speed of processing double, float, short

For my GT 650M card, CUDA-Z shows:

  • single-precision float: 374 Gflop/s
  • double-precision float: 26 Gflop/s
  • 32-bit integer: 106 Giop/s
  • 24-bit integer: 107 Giop/s

If I just want to do add and max calculations on short values, does that mean I should cast them to float to get the best performance?



As far as I know, CUDA hardware is designed for high floating-point throughput, so the cast may well pay off — but you should benchmark it on your workload. It only makes sense for small integer values: a 32-bit float has just a 24-bit significand, so integers larger in magnitude than 2^24 cannot all be represented exactly. A 16-bit short always fits, so casting short to float is lossless for add and max, as long as intermediate sums stay within ±2^24.
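A minimal sketch of the precision argument, using Python's standard library to emulate IEEE-754 binary32 (the same format as a CUDA `float`) — the helper `to_float32` is mine, not part of any CUDA API:

```python
import struct

def to_float32(x):
    # Round-trip through IEEE-754 binary32 to emulate a CUDA float.
    return struct.unpack('f', struct.pack('f', x))[0]

# Every 16-bit short fits in float32's 24-bit significand,
# so casting short -> float is exact:
assert all(to_float32(s) == s for s in range(-32768, 32768))

# Beyond 2**24 the significand runs out: 2**24 + 1 rounds to 2**24,
# so large int accumulations in float would silently lose precision.
assert to_float32(2**24) == to_float32(2**24 + 1)
```

So for pure short arithmetic the cast is safe; the only remaining question is whether it is actually faster, which depends on your kernel and should be measured.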