The only support for the 16-bit float type that I am aware of in the standard CUDA SDK is in the 'Type Casting Intrinsics' section of the CUDA math API.
It has a few functions that convert unsigned short values (holding the 16-bit float bit pattern) to 32-bit float and back.
I would like to perform 16-bit floating-point multiplication, addition and subtraction (or an FMA, if possible), and I am not sure whether CUDA already provides this functionality.
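To make the request concrete, here is a minimal sketch of the kind of kernel I mean. It assumes a toolkit recent enough to ship the `cuda_fp16.h` header with the `__half` type and the `__hfma` intrinsic; as far as I can tell, native half arithmetic needs compute capability 5.3 or higher, so the sketch widens to float on older devices. The kernel names and the `half_fma_on_device` helper are made up for illustration.

```cuda
#include <cassert>
#include <cuda_runtime.h>
#include <cuda_fp16.h>

// Hypothetical helper kernels: convert between float and half on the device,
// so the host only ever handles ordinary float arrays.
__global__ void float_to_half_kernel(const float* in, __half* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = __float2half(in[i]);
}

__global__ void half_to_float_kernel(const __half* in, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = __half2float(in[i]);
}

// out[i] = a[i] * b[i] + c[i], with every operand stored as a 16-bit float.
__global__ void half_fma_kernel(const __half* a, const __half* b,
                                const __half* c, __half* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 530
    // Native half-precision fused multiply-add.
    out[i] = __hfma(a[i], b[i], c[i]);
#else
    // Fallback: widen to float, do the FMA in 32 bits, round back to half.
    // The extra rounding step can differ from a true half FMA in the last bit.
    out[i] = __float2half(fmaf(__half2float(a[i]), __half2float(b[i]),
                               __half2float(c[i])));
#endif
}

// Hypothetical host wrapper: takes float arrays of length n, runs the FMA in
// half precision on the device, and writes the results back as floats.
cudaError_t half_fma_on_device(const float* a, const float* b, const float* c,
                               float* out, int n)
{
    assert(n > 0);
    const size_t fbytes = n * sizeof(float);
    float* d_f = nullptr;   // layout: a | b | c | out
    __half* d_h = nullptr;  // same layout, in half precision

    cudaError_t err = cudaMalloc(&d_f, 4 * fbytes);
    if (err != cudaSuccess) return err;
    err = cudaMalloc(&d_h, 4 * n * sizeof(__half));
    if (err != cudaSuccess) { cudaFree(d_f); return err; }

    cudaMemcpy(d_f,         a, fbytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_f + n,     b, fbytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_f + 2 * n, c, fbytes, cudaMemcpyHostToDevice);

    const int threads = 256;
    float_to_half_kernel<<<(3 * n + threads - 1) / threads, threads>>>(d_f, d_h, 3 * n);
    half_fma_kernel<<<(n + threads - 1) / threads, threads>>>(
        d_h, d_h + n, d_h + 2 * n, d_h + 3 * n, n);
    half_to_float_kernel<<<(n + threads - 1) / threads, threads>>>(
        d_h + 3 * n, d_f + 3 * n, n);

    err = cudaGetLastError();  // catches launch failures
    if (err == cudaSuccess)
        err = cudaMemcpy(out, d_f + 3 * n, fbytes, cudaMemcpyDeviceToHost);

    cudaFree(d_f);
    cudaFree(d_h);
    return err;
}
```

Calling `half_fma_on_device(a, b, c, out, n)` from a `main` with small host arrays should be enough to try it out. Plain `__hadd`, `__hsub` and `__hmul` could presumably replace `__hfma` in the same pattern if separate operations are wanted.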
I think texture objects have a built-in interpolation ability, but I have not found any examples. Can anyone point me to some examples or documentation on this topic?
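From what I can piece together, the texture route would look roughly like the sketch below: store the data as 16-bit floats in a CUDA array, bind it to a texture object with linear filtering, and let the hardware interpolate, with the fetch returning an ordinary float. This assumes `cudaCreateChannelDescHalf()` and the texture object API (compute capability 3.0 or higher); the kernel names and the `sample_half_texture` helper are hypothetical, and error handling is kept to a bare minimum.

```cuda
#include <cassert>
#include <cuda_runtime.h>
#include <cuda_fp16.h>

// Return from the enclosing function on the first CUDA error.
// Cleanup on the error path is omitted to keep the sketch short.
#define CHECK(call) do { cudaError_t e_ = (call); if (e_ != cudaSuccess) return e_; } while (0)

// Hypothetical helper kernel: convert float input to half on the device.
__global__ void float_to_half_kernel(const float* in, __half* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = __float2half(in[i]);
}

// Sample the texture at unnormalized positions x[i].
// Texel k is centered at k + 0.5, hence the offset.
__global__ void sample_kernel(cudaTextureObject_t tex, const float* x,
                              float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = tex1D<float>(tex, x[i] + 0.5f);
}

// Hypothetical host wrapper: stores `data` as a half-precision 1D texture and
// returns linearly interpolated values at the positions in `x`.
cudaError_t sample_half_texture(const float* data, int n_data,
                                const float* x, float* out, int n_x)
{
    assert(n_data > 0 && n_x > 0);
    float *d_data = nullptr, *d_x = nullptr, *d_out = nullptr;
    __half* d_half = nullptr;
    CHECK(cudaMalloc(&d_data, n_data * sizeof(float)));
    CHECK(cudaMalloc(&d_half, n_data * sizeof(__half)));
    CHECK(cudaMalloc(&d_x, n_x * sizeof(float)));
    CHECK(cudaMalloc(&d_out, n_x * sizeof(float)));
    CHECK(cudaMemcpy(d_data, data, n_data * sizeof(float), cudaMemcpyHostToDevice));
    CHECK(cudaMemcpy(d_x, x, n_x * sizeof(float), cudaMemcpyHostToDevice));

    const int threads = 128;
    float_to_half_kernel<<<(n_data + threads - 1) / threads, threads>>>(d_data, d_half, n_data);
    CHECK(cudaGetLastError());

    // 1D CUDA array whose single channel is a 16-bit float.
    cudaChannelFormatDesc desc = cudaCreateChannelDescHalf();
    cudaArray_t arr = nullptr;
    CHECK(cudaMallocArray(&arr, &desc, n_data, 0));
    const size_t row_bytes = n_data * sizeof(__half);
    CHECK(cudaMemcpy2DToArray(arr, 0, 0, d_half, row_bytes, row_bytes, 1,
                              cudaMemcpyDeviceToDevice));

    cudaResourceDesc res = {};
    res.resType = cudaResourceTypeArray;
    res.res.array.array = arr;

    cudaTextureDesc td = {};
    td.addressMode[0] = cudaAddressModeClamp;   // clamp reads past either end
    td.filterMode = cudaFilterModeLinear;       // hardware linear interpolation
    td.readMode = cudaReadModeElementType;      // half texels are returned as float
    td.normalizedCoords = 0;                    // coordinates in texel units

    cudaTextureObject_t tex = 0;
    CHECK(cudaCreateTextureObject(&tex, &res, &td, nullptr));

    sample_kernel<<<(n_x + threads - 1) / threads, threads>>>(tex, d_x, d_out, n_x);
    CHECK(cudaGetLastError());
    CHECK(cudaMemcpy(out, d_out, n_x * sizeof(float), cudaMemcpyDeviceToHost));

    CHECK(cudaDestroyTextureObject(tex));
    CHECK(cudaFreeArray(arr));
    cudaFree(d_data); cudaFree(d_half); cudaFree(d_x); cudaFree(d_out);
    return cudaSuccess;
}
```

The interpolation weights in the texture unit are low-precision fixed point, so this is probably only suitable where approximate interpolation is acceptable; it is not a substitute for general half-precision arithmetic.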
I have already searched, and this was the best thing I found so far;
but I wonder if anyone can point me to a code example of half-precision operations in CUDA.