Write support for cudaReadModeNormalizedFloat ?


I’ve noticed in the CUDA programming guide that it is possible to do texture reads from 8- or 16-bit (un)signed integers and have the hardware automatically convert them to 32-bit floating-point numbers in the range 0…1 for unsigned and -1…1 for signed integers. This seems like a really useful read mode for my application, but only if one can also do the converse, i.e., take a 32-bit floating-point number in this range and write out an integer. From looking at the CUDA guide it would appear that this isn’t currently supported, but does the underlying hardware support this type of conversion?
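For reference, the read-side conversion can be sketched on the host. This is an illustrative emulation, not a CUDA API: the formulas below follow the usual UNORM/SNORM convention for 16-bit values, which as far as I know is what the texture unit applies under cudaReadModeNormalizedFloat.

```c
#include <assert.h>

/* Emulate cudaReadModeNormalizedFloat for 16-bit texels (host-side sketch).
   Unsigned: [0, 65535] maps to [0.0, 1.0].
   Signed:   [-32768, 32767] maps to [-1.0, 1.0], with the single
   out-of-range value -32768 clamped to -1.0 (SNORM convention). */
static float unorm16_to_float(unsigned short u)
{
    return u / 65535.0f;
}

static float snorm16_to_float(short s)
{
    float f = s / 32767.0f;
    return f < -1.0f ? -1.0f : f;
}
```

The 8-bit cases are identical with divisors 255 and 127.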

For those who are interested: I am reading and writing unitary matrices, i.e., the sums of the squares of the rows and columns all add up to one, and I am bandwidth bound. Since the magnitudes of my elements are guaranteed to lie in this range, bandwidth is wasted reading and writing the exponents. Switching from reading and writing 32-bit floats to 16-bit integers would nicely boost my performance at a small reduction in accuracy (23-bit mantissa -> 15-bit mantissa).



Well, you can’t write directly to cudaArray memory anyway, so the point is moot. Just perform the additions, divisions, and type casts needed for the conversion yourself and write the value out to global memory; those operations aren’t expensive. Coalescing would be a little tricky with 16-bit values, though. If you need to get those values back into a cudaArray for reading through the texture again, you have to cudaMemcpyToArray that global-memory region into your array.