I am currently examining the possibility of moving certain types of integer-based, game-related simulation code to the GPU, so I have been looking at integer performance across a range of NVIDIA GPUs.
From my googling, integer performance still seems to lag significantly behind float32 performance, despite some sources claiming the two are equal these days (which I have yet to find proof of). I ran the CUDA-Z test across a range of GPUs and got the following relevant results:
                 GTX1080   GTX980Ti   GTX970   GTX680
float32 GFLOP/s     8450       7285     3387     2525
int64   GIOP/s       606        489      248      142
int32   GIOP/s      2840       2457     1005      567
I am not sure how good a benchmark CUDA-Z is, but the results don't surprise me.
Some of the code I am looking to move to the GPU uses 64-bit integers on the CPU, as it benefits greatly from wide integers. It appears that even on the GTX1080, 64-bit integer operations are emulated with 32-bit ones, so I would probably be better off manually re-organising around more complicated 32-bit integer implementations for more control? A rough sketch of what I mean is below.
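To illustrate, here is a minimal sketch of the kind of 32-bit decomposition I have in mind: a 64-bit add expressed as a 32-bit add plus a 32-bit add-with-carry, which is roughly what the compiler emits anyway on hardware without native 64-bit integer ALUs. The function names here are my own for illustration, not NVIDIA intrinsics.

```
#include <cstdint>

// Illustrative only: a 64-bit add split into the two 32-bit operations
// the hardware actually issues (add, then add-with-carry).
__device__ uint64_t add64_via_32(uint64_t a, uint64_t b)
{
    uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
    uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);

    uint32_t lo    = a_lo + b_lo;
    uint32_t carry = (lo < a_lo) ? 1u : 0u;   // detect wrap-around of the low word
    uint32_t hi    = a_hi + b_hi + carry;

    return ((uint64_t)hi << 32) | lo;
}

__global__ void add_kernel(const uint64_t* a, const uint64_t* b,
                           uint64_t* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = add64_via_32(a[i], b[i]);
}
```

The point being that once you write it out by hand like this, you can see where the extra instructions go and decide lane widths yourself, rather than paying the emulation cost blindly.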
I imagine this is a bit of a chicken-and-egg situation going forward, though: developers won't use 64-bit integers on GPUs because of the performance problems, so NVIDIA won't improve 64-bit integer performance because nobody is targeting it. I imagine only people like cryptography researchers currently do, and they are a very small minority influence in the grand scheme of things.
Personally, I would really like to see things pushed so that we had stellar 64-bit and 32-bit integer performance, along with intrinsics that encourage 64-bit-wide SWAR techniques and the like. A sketch of the sort of thing I mean follows.
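For anyone unfamiliar with the term, here is a rough sketch of the kind of SWAR trick I mean (this is the standard masked-add technique; the function name is my own). It adds eight packed 8-bit lanes inside one 64-bit word in a handful of instructions, without carries leaking between lanes, which is exactly the sort of thing that fast 64-bit integer ops and intrinsics would make attractive on the GPU.

```
#include <cstdint>

// SWAR: add eight 8-bit lanes packed into a 64-bit word, per-lane modulo 256,
// without letting carries cross lane boundaries.
__host__ __device__ uint64_t swar_add_u8x8(uint64_t x, uint64_t y)
{
    const uint64_t H = 0x8080808080808080ULL;  // top bit of each 8-bit lane
    // Add the low 7 bits of every lane; no carry can escape a lane because
    // the top bits were masked off in both operands.
    uint64_t sum = (x & ~H) + (y & ~H);
    // Fold the top bits back in with XOR (addition modulo 2), so they also
    // cannot propagate into the neighbouring lane.
    return sum ^ ((x ^ y) & H);
}
```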
The float16 improvements currently being pushed through by neural network applications, which will also be useful in the gaming sector, are great, but it seems that integer performance will continue to be left by the wayside for a long time yet?