Half Float and Fermi

Hi folks, I’m currently investigating half floats to reduce memory consumption in my application.
However, I’m afraid Fermi could be significantly slower for kernels working with half floats.
The current trend seems to be moving toward double precision rather than float16.
So, should I go for it?

A fact that will never change is that a float16 is half the size of a float32…

So if your kernel is bandwidth-bound, moving float16s to memory instead of float32s will provide a significant improvement, regardless of whether the conversion is done in hardware or software.
Also, you can fit more numbers inside the caches, which is a good thing…
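
To make that concrete, here’s a minimal sketch of the usual pattern (the kernel name and arguments are just placeholders I made up): store the data in global memory as 16-bit halves packed into unsigned short, and use the __half2float / __float2half_rn device intrinsics to convert to and from float32 registers. The arithmetic still runs at full float32 precision, but every load and store moves half the bytes.

```
// Sketch only: names are illustrative. Data is kept in global memory as
// IEEE binary16 packed into unsigned short; computation is done in float32.
__global__ void scale_half(const unsigned short *in, unsigned short *out,
                           float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float x = __half2float(in[i]);   // 16-bit load, convert to float32
        x *= factor;                     // all arithmetic in float32 registers
        out[i] = __float2half_rn(x);     // round to nearest even, 16-bit store
    }
}
```

In a bandwidth-bound kernel the extra conversion instructions are cheap compared to the global memory traffic you save.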

About Fermi, David Patterson’s whitepaper mentions “a 16-bit floating-point memory format”.
Anyway, float16 (aka Binary16) is now a fully-featured IEEE-754 format, just like Binary32 and Binary64, and the flops-to-memory-bandwidth ratio of GPUs keeps increasing. I personally think it would not make any sense to drop float16 support now…
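
For reference, Binary16 packs 1 sign bit, 5 exponent bits (bias 15) and 10 mantissa bits into 16 bits, with the same subnormal/infinity/NaN machinery as the larger formats. A rough host-side decoder (purely illustrative, the function name is made up) shows the layout:

```
#include <stdio.h>
#include <stdint.h>
#include <math.h>

/* Illustrative decoder for IEEE-754 binary16: 1 sign bit, 5 exponent bits
 * (bias 15), 10 mantissa bits. Handles normals, subnormals, zero, inf/NaN. */
float half_bits_to_float(uint16_t h)
{
    int sign = (h >> 15) & 0x1;
    int exp  = (h >> 10) & 0x1F;
    int mant =  h        & 0x3FF;
    float s  = sign ? -1.0f : 1.0f;

    if (exp == 0)                       /* zero or subnormal: mant * 2^-24 */
        return s * ldexpf((float)mant, -24);
    if (exp == 31)                      /* infinity or NaN */
        return mant ? NAN : s * INFINITY;
    return s * ldexpf(1.0f + mant / 1024.0f, exp - 15);   /* normal numbers */
}

int main(void)
{
    printf("%g\n", half_bits_to_float(0x3C00));  /* 1 */
    printf("%g\n", half_bits_to_float(0xC000));  /* -2 */
    printf("%g\n", half_bits_to_float(0x7BFF));  /* 65504, largest finite half */
    return 0;
}
```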