Hi,

I’ve noticed in the CUDA programming guide that it is possible to do texture reads with 8- or 16-bit signed and unsigned integers and have them automatically converted to 32-bit floating-point numbers in the range [0, 1] for unsigned and [-1, 1] for signed integers. This seems like a really useful read mode for my application, but only if one can also do the converse, i.e., take a 32-bit floating-point number in this range and write out an integer. From looking at the CUDA guide it would appear that this isn’t currently supported, but does the underlying hardware support this type of conversion?

For those who are interested: I am reading and writing unitary matrices, i.e., the squares of the entries in each row and each column sum to one, and I am bandwidth bound. Since the magnitudes of my elements are guaranteed to lie in this range, bandwidth is wasted reading and writing the exponents. Switching to reading and writing 16-bit integers instead of 32-bit floats would nicely boost my performance at a small cost in accuracy (23-bit mantissa -> 15-bit mantissa).

Cheers,

Mike.