read half2 directly from 2D texture

Is there a way to read half2 values directly from a 2D texture with half precision data?

I am converting CUDA code to use half precision and would like to use:

half2 val = tex2D<half2>(tex, x, y);

instead of the current:

float2 val = tex2D<float2>(tex, x, y);

The half2 version does not compile. The documentation seems to indicate that this is not possible, but I wanted to make sure I'm not missing something. I would like to avoid converting to FP32 and then back to FP16.

The compile error:
/usr/local/cuda/include/texture_indirect_functions.h(262): error: no instance of overloaded function “tex2D” matches the argument list
argument types are: (half2 *, cudaTextureObject_t, float, float)
detected during:
instantiation of “T tex2D(cudaTextureObject_t, float, float) [with T=half2]”

Thanks, Troy.

The texture interpolator (if you intend to use it) always returns float/float2/float4.

You could try cudaReadModeElementType for point sampling, and maybe it’s possible to obtain the half2 as ushort2 (earlier versions of CUDA certainly supported it). Then apply the following casting function to the ushort2 vector components. It will not actually perform any costly conversion but simply reinterpret the ushort value as a half float.

__device__ __half __ushort_as_half(const unsigned short int i)
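Put together, the point-sampling route might look like the sketch below (assuming a texture object bound to 16-bit-per-channel data with cudaReadModeElementType; the function and parameter names are illustrative, not from the thread):

```cuda
#include <cuda_fp16.h>

// Point-sampled fetch (no hardware filtering): read the raw 16-bit
// pairs as ushort2, then reinterpret the bits as half values.
// No numeric conversion happens, only a bit-level reinterpretation.
__device__ half2 fetch_half2(cudaTextureObject_t tex, float x, float y)
{
    ushort2 raw = tex2D<ushort2>(tex, x, y);
    return __halves2half2(__ushort_as_half(raw.x),
                          __ushort_as_half(raw.y));
}
```

Note this only works with cudaFilterModePoint; with linear filtering enabled, the hardware interpolates and the raw-bits trick no longer applies.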



The binary type should be ushort or ushort2

Note that you can derive from the '__half' struct in order to gain access to the protected member variable '__x' (I hope this does not get changed in future CUDA Toolkits).

Thank you Christian and HannesF99 for the information. Unfortunately I do need to use interpolation during the texture read so the ushort approaches will not help.

It appears that reading out a half2 from a texture with interpolation is not currently supported. The CUDA documentation could use some improvement in this area, and with half precision in general.


You can easily implement the bilinear interpolation (and also the border-mode handling) yourself instead of relying on the texture hardware. Unless your kernel is really compute bound (which is usually not the case), the few additional arithmetic operations will not noticeably affect the kernel runtime.
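A minimal sketch of that suggestion, assuming row-major half2 data in global memory and clamp-to-edge border handling (the function name, layout, and parameters are illustrative assumptions):

```cuda
#include <cuda_fp16.h>

// Manual bilinear interpolation of half2 data, done in FP32.
// Mimics the texture convention of sample centers at integer + 0.5
// and cudaAddressModeClamp border handling.
__device__ half2 bilerp_half2(const half2* src, int width, int height,
                              float x, float y)
{
    float fx = x - 0.5f, fy = y - 0.5f;
    int   x0 = __float2int_rd(fx), y0 = __float2int_rd(fy);
    float wx = fx - (float)x0,     wy = fy - (float)y0;

    // Clamp the four tap coordinates to the image edges
    int x0c = min(max(x0,     0), width  - 1);
    int x1c = min(max(x0 + 1, 0), width  - 1);
    int y0c = min(max(y0,     0), height - 1);
    int y1c = min(max(y0 + 1, 0), height - 1);

    // Widen the four taps to FP32, interpolate, round back to FP16
    float2 p00 = __half22float2(src[y0c * width + x0c]);
    float2 p10 = __half22float2(src[y0c * width + x1c]);
    float2 p01 = __half22float2(src[y1c * width + x0c]);
    float2 p11 = __half22float2(src[y1c * width + x1c]);

    float2 r;
    r.x = (1.0f - wy) * ((1.0f - wx) * p00.x + wx * p10.x)
        +         wy  * ((1.0f - wx) * p01.x + wx * p11.x);
    r.y = (1.0f - wy) * ((1.0f - wx) * p00.y + wx * p10.y)
        +         wy  * ((1.0f - wx) * p01.y + wx * p11.y);
    return __float22half2_rn(r);
}
```

Because the weights wx/wy are full FP32 here, this also delivers better interpolation quality than the texture unit's fixed-point weights.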

Based on my experience with various use cases, I will boldly claim that this should be the default approach these days. On modern GPUs, FP32 floating-point operations are “too cheap to meter”, and the quality of the interpolation is much better when done with FP32 versus the 9-bit (that is, 1.8) fixed-point arithmetic utilized by the texture units.

Only where this approach is not fast enough AND the quality degradation from use of hardware interpolation is acceptable should interpolation via the texture units be chosen.

To follow up on my original question and the suggestions made: I implemented the linear interpolation in device code using half-precision intrinsics to replace the 2D texture read, but I observed significantly longer kernel runtimes. The interpolation itself is very simple, a subtraction and a fused multiply-add, but properly handling the border/edge conditions makes the algorithm more complicated. Also, in my case the input data does not fit in constant memory or shared memory, forcing the interpolation to read from global memory. So it appears that, in my case, using half2 val = __float22half2_rn(tex2D<float2>(tex, x, y)) is the best solution for interpolating half2 data.
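For completeness, a sketch of that final approach, assuming a cudaArray created with cudaCreateChannelDescHalf2() (all names here are illustrative; error checking is omitted):

```cuda
#include <cuda_fp16.h>
#include <cuda_runtime.h>

// The texture hardware filters the FP16x2 data and returns FP32,
// which we round back to half2 with a single intrinsic.
__global__ void interp_kernel(cudaTextureObject_t tex, half2* out,
                              float x, float y)
{
    float2 f = tex2D<float2>(tex, x, y);   // hardware bilinear, FP32 result
    out[blockIdx.x * blockDim.x + threadIdx.x] = __float22half2_rn(f);
}

// Host-side texture object setup for a half2 cudaArray
cudaTextureObject_t make_half2_texture(cudaArray_t arr)
{
    cudaResourceDesc resDesc = {};
    resDesc.resType = cudaResourceTypeArray;
    resDesc.res.array.array = arr;  // array from cudaCreateChannelDescHalf2()

    cudaTextureDesc texDesc = {};
    texDesc.filterMode     = cudaFilterModeLinear;   // enable interpolation
    texDesc.readMode       = cudaReadModeElementType; // FP16 still reads as float
    texDesc.addressMode[0] = cudaAddressModeClamp;
    texDesc.addressMode[1] = cudaAddressModeClamp;

    cudaTextureObject_t tex = 0;
    cudaCreateTextureObject(&tex, &resDesc, &texDesc, nullptr);
    return tex;
}
```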