Are there any 16-bit float datatypes or calculations available in CUDA? It would be a small convenience for me to have them.
I’m currently using short2’s for some coordinates. The thing is that I’d like to have some stuff after the dot: x: 0.25, y:234.43. Accuracy isn’t that important. Right now I’m simply multiplying/dividing by 64 (or shifting by 6) into and out of global mem. It feels a little like using a shoehorn - so are there 16 bit floats anywhere?
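For reference, the multiply/divide-by-64 scheme described above could look roughly like the sketch below. It is written as plain C++ so it can be checked on the host; the names `Short2`, `to_fixed` and `from_fixed` are made up for illustration (`Short2` stands in for CUDA's `short2`), and in a kernel the functions would presumably be marked `__device__`.

```cpp
#include <cstdint>
#include <cmath>

// Illustrative names only; 6 fractional bits matches the "multiply/divide by 64"
// (shift by 6) described in the post.
constexpr float kScale = 64.0f;

// Stand-in for CUDA's short2 so the sketch compiles without the CUDA headers.
struct Short2 { int16_t x, y; };

// float -> signed 16-bit fixed point: round to the nearest 1/64 and clamp
// to the int16_t range (about -512.0 .. +511.98 after scaling).
inline int16_t to_fixed(float v) {
    float scaled = std::floor(v * kScale + 0.5f);
    if (scaled > 32767.0f)  scaled = 32767.0f;
    if (scaled < -32768.0f) scaled = -32768.0f;
    return static_cast<int16_t>(scaled);
}

// fixed point -> float: exact, since every int16_t value divided by 64
// is representable as a float.
inline float from_fixed(int16_t q) {
    return static_cast<float>(q) / kScale;
}

// Pack and unpack a coordinate pair on the way into and out of global memory.
inline Short2 pack(float x, float y) { return Short2{ to_fixed(x), to_fixed(y) }; }
inline float unpack_x(Short2 p) { return from_fixed(p.x); }
inline float unpack_y(Short2 p) { return from_fixed(p.y); }
```

With this layout the step size is a constant 1/64 over the whole range, so a round trip is off by at most 1/128, but coordinates beyond roughly ±512 saturate.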
The hardware should be able to read 16-bit floats from textures just fine, and possibly from gmem. HDR graphics makes use of them extensively, since 32-bit floats are overkill.
Half-precision floating point is not exposed in CUDA as a type. You can store 16-bit floats in a texture using the driver API to save space; however, fetches will return 32-bit floats to the kernel.
I suspect that my program would benefit from an ugly struct-type 16-bit float and a function to massage 32-bit floats into and out of it. It would halve the needed memory bandwidth without adding much to the kernel's instruction cost.
Consider this my request for a CUDA feature, at line three thousand million or so in the TODO list.