The only 'support' for the 16-bit float type that I am aware of in the standard CUDA SDK is the 'Type Casting Intrinsics' section of the CUDA math API.

It has a few functions, such as __float2half_rn() and __half2float(), that convert between 32-bit float and half-precision values stored as unsigned short.

I would like to perform 16-bit floating-point multiplication, addition and subtraction (or an FMA, if possible), and I am not sure whether CUDA already provides this functionality.
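To make the question concrete, here is a minimal sketch of the fallback I have in mind: keep the data in half precision, widen to float for the arithmetic, and round the result back to half. It assumes a toolkit that ships cuda_fp16.h with the __half type, __float2half() and __half2float() (an assumption on my part; the intrinsics I mentioned above work on unsigned short instead). The kernel and helper names are made up for illustration, and error checking is left out.

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <cuda_fp16.h>   // assumed available: __half, __float2half, __half2float

// Hypothetical helper kernel: round float inputs to half for storage.
__global__ void float_to_half(const float* in, __half* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = __float2half(in[i]);
}

// Hypothetical helper kernel: widen half values back to float for readback.
__global__ void half_to_float(const __half* in, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = __half2float(in[i]);
}

// out[i] = a[i] * b[i] + c[i], with half storage and float arithmetic.
// Only the loads and the final store are 16-bit; the FMA itself is 32-bit.
__global__ void half_fma_via_float(const __half* a, const __half* b,
                                   const __half* c, __half* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float r = fmaf(__half2float(a[i]), __half2float(b[i]), __half2float(c[i]));
        out[i] = __float2half(r);   // round the float result back to half
    }
}

// Host wrapper (hypothetical): computes a*b+c through half storage.
// Error checking is omitted for brevity.
float run_half_fma(float a, float b, float c)
{
    float h_in[3] = {a, b, c};
    float h_out = 0.0f;
    float*  d_f = NULL;
    __half* d_h = NULL;
    cudaMalloc(&d_f, 3 * sizeof(float));
    cudaMalloc(&d_h, 4 * sizeof(__half));   // a, b, c, result

    cudaMemcpy(d_f, h_in, 3 * sizeof(float), cudaMemcpyHostToDevice);
    float_to_half<<<1, 3>>>(d_f, d_h, 3);
    half_fma_via_float<<<1, 1>>>(d_h, d_h + 1, d_h + 2, d_h + 3, 1);
    half_to_float<<<1, 1>>>(d_h + 3, d_f, 1);
    cudaMemcpy(&h_out, d_f, sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(d_f);
    cudaFree(d_h);
    return h_out;
}

int main()
{
    printf("1.5 * 2.0 + 0.25 via half storage = %f\n",
           run_half_fma(1.5f, 2.0f, 0.25f));
    return 0;
}
```

This only saves memory and bandwidth, since the arithmetic still runs at 32 bits, so it is not quite what I am after.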

I think texture objects have a built-in interpolation ability, but I have not found any examples. Can anyone point me to some examples or documentation on this topic?
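My rough understanding of how this would be set up is sketched below. It assumes a device that supports texture objects, a toolkit with cuda_fp16.h, and that a 16-bit float channel read with cudaReadModeElementType is returned as a 32-bit float, so linear filtering is allowed. The helper names are hypothetical, error checking is omitted, and I have not confirmed that this is the recommended approach.

```cuda
#include <cstdio>
#include <cstring>
#include <cuda_runtime.h>
#include <cuda_fp16.h>   // assumed available: __half, __float2half

// Hypothetical helper kernel: round float inputs to half for storage.
__global__ void float_to_half(const float* in, __half* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = __float2half(in[i]);
}

// Fetch one linearly filtered sample; the half texel is returned as float.
__global__ void sample_texture(cudaTextureObject_t tex, float x, float* out)
{
    *out = tex1D<float>(tex, x);
}

// Host wrapper (hypothetical): stores {0, 1, 2, 3} as a 1D half texture
// and samples it at unnormalized coordinate x. Error checking omitted.
float sample_half_texture(float x)
{
    const int n = 4;
    const float h_in[n] = {0.0f, 1.0f, 2.0f, 3.0f};
    float h_out = 0.0f;
    float*  d_f = NULL;
    __half* d_h = NULL;
    cudaMalloc(&d_f, n * sizeof(float));
    cudaMalloc(&d_h, n * sizeof(__half));
    cudaMemcpy(d_f, h_in, n * sizeof(float), cudaMemcpyHostToDevice);
    float_to_half<<<1, n>>>(d_f, d_h, n);

    // One 16-bit floating-point channel per texel.
    cudaChannelFormatDesc desc =
        cudaCreateChannelDesc(16, 0, 0, 0, cudaChannelFormatKindFloat);
    cudaArray_t arr = NULL;
    cudaMallocArray(&arr, &desc, n, 0);   // height 0 gives a 1D array
    cudaMemcpy2DToArray(arr, 0, 0, d_h, n * sizeof(__half),
                        n * sizeof(__half), 1, cudaMemcpyDeviceToDevice);

    cudaResourceDesc res;
    memset(&res, 0, sizeof(res));
    res.resType = cudaResourceTypeArray;
    res.res.array.array = arr;

    cudaTextureDesc td;
    memset(&td, 0, sizeof(td));
    td.addressMode[0]   = cudaAddressModeClamp;
    td.filterMode       = cudaFilterModeLinear;     // hardware interpolation
    td.readMode         = cudaReadModeElementType;  // half texel read as float
    td.normalizedCoords = 0;

    cudaTextureObject_t tex = 0;
    cudaCreateTextureObject(&tex, &res, &td, NULL);

    sample_texture<<<1, 1>>>(tex, x, d_f);
    cudaMemcpy(&h_out, d_f, sizeof(float), cudaMemcpyDeviceToHost);

    cudaDestroyTextureObject(tex);
    cudaFreeArray(arr);
    cudaFree(d_f);
    cudaFree(d_h);
    return h_out;
}

int main()
{
    // With unnormalized coordinates texel centers sit at i + 0.5, so x = 1.0
    // falls halfway between texel 0 and texel 1.
    printf("sample at x = 1.0: %f\n", sample_half_texture(1.0f));
    return 0;
}
```

If this is right, the interpolation weights come from the texture unit rather than from my own code, which is why I would like to see a documented example before relying on it.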

I have already searched, and this is the best thing I have found so far:

http://www.informit.com/articles/article.aspx?p=2103809&seqNum=3

but I wonder whether anyone can point me to a code example of half-precision operations in CUDA.