How to use RT_FORMAT_HALF*

yashiz · January 5, 2018, 10:46am

Hi,

What should the element type of rtBuffer<> be if the buffer is created with RT_FORMAT_HALF4? Do I need cuda_fp16.h to use it? For RT_FORMAT_FLOAT4, rtBuffer<float4> is supposed to be used, and accessing and operating on float4 is straightforward.

As there aren’t any OptiX examples or documentation about RT_FORMAT_HALF*, I would like to ask for help here.

Thank you very much,

Yashiz

droettger · January 8, 2018, 9:21am

Depends on what you’re trying to do and on which GPU.

If you’re planning to use half precision calculations in your kernel, you need to be careful about which GPUs you’re actually running on to gain a performance benefit. I would not recommend using half float calculations inside OptiX kernels if you target all OptiX-supported GPUs.

Using half precision input/output buffers is straightforward and only needs the two CUDA intrinsic functions __half2float() and __float2half_rn() to convert to and from 32-bit float values. These exist as overloads for unsigned short and __half formats which are both 16-bit in size.

Here is an example using the unsigned short overloads, which are implemented in the CUDA device_functions.hpp and should be available in OptiX. There is no need to include cuda_fp16.h then.

rtBuffer<ushort4, 2> rgba16f; // e.g. RGBA16F input-output buffer.

// Load half values contained inside the ushort bits.
const ushort4 us4 = rgba16f[theLaunchIndex];

// Convert the half values to float data 
float r = __half2float(us4.x);
float g = __half2float(us4.y);
float b = __half2float(us4.z);
float a = __half2float(us4.w);

// ... Do something with the 32-bit floating point rgba data.

// Convert float values into half values with rounding and store them as RGBA16F into the ushort bits. 
rgba16f[theLaunchIndex] = make_ushort4(__float2half_rn(r), __float2half_rn(g), __float2half_rn(b), __float2half_rn(a));
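
The same half values can also be produced or inspected on the host, for example when filling a mapped RT_FORMAT_HALF4 input buffer from 32-bit float data. The following is only a minimal sketch in plain C++ that does not depend on any CUDA header. The names Half4, floatToHalfBits, halfBitsToFloat and packRGBA16F are hypothetical helpers made up for this illustration, not part of OptiX or CUDA. The conversion rounds to nearest even, which is meant to mirror what __float2half_rn() does on the device.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical host-side mirror of a ushort4 element holding RGBA16F bits.
struct Half4 { std::uint16_t x, y, z, w; };

// Convert a 32-bit float to IEEE 754 binary16 bits (round to nearest even).
inline std::uint16_t floatToHalfBits(float f)
{
  std::uint32_t x;
  std::memcpy(&x, &f, sizeof(x));
  const std::uint32_t sign = (x >> 16) & 0x8000u;
  const std::uint32_t absx = x & 0x7FFFFFFFu;

  if (absx >= 0x7F800000u) // Inf stays Inf, NaN becomes a quiet NaN.
    return static_cast<std::uint16_t>(sign | (absx > 0x7F800000u ? 0x7E00u : 0x7C00u));
  if (absx >= 0x477FF000u) // Rounds to 65520 or more: overflow to Inf.
    return static_cast<std::uint16_t>(sign | 0x7C00u);
  if (absx < 0x33000000u)  // Smaller than 2^-25: rounds to signed zero.
    return static_cast<std::uint16_t>(sign);

  std::uint32_t half, rem, halfway;
  if (absx < 0x38800000u)  // Below 2^-14: result is a subnormal half.
  {
    const std::uint32_t mant  = (absx & 0x007FFFFFu) | 0x00800000u; // Add implicit 1.
    const std::uint32_t shift = 126u - (absx >> 23);                // In [14, 24].
    half    = mant >> shift;
    rem     = mant & ((1u << shift) - 1u);
    halfway = 1u << (shift - 1u);
  }
  else                     // Normal half: rebias exponent by 112, drop 13 bits.
  {
    half    = (absx - 0x38000000u) >> 13;
    rem     = absx & 0x1FFFu;
    halfway = 0x1000u;
  }
  if (rem > halfway || (rem == halfway && (half & 1u)))
    ++half; // A carry into the exponent field is the correct result.
  return static_cast<std::uint16_t>(sign | half);
}

// Convert IEEE 754 binary16 bits back to a 32-bit float (always exact).
inline float halfBitsToFloat(std::uint16_t h)
{
  const std::uint32_t sign = (h & 0x8000u) << 16;
  const std::uint32_t exp  = (h >> 10) & 0x1Fu;
  std::uint32_t mant = h & 0x3FFu;
  std::uint32_t x;

  if (exp == 0x1Fu)      x = sign | 0x7F800000u | (mant << 13);      // Inf or NaN.
  else if (exp != 0u)    x = sign | ((exp + 112u) << 23) | (mant << 13);
  else if (mant == 0u)   x = sign;                                   // Signed zero.
  else
  {
    std::uint32_t e = 113u; // Biased float exponent of 2^-14.
    while ((mant & 0x400u) == 0u) { mant <<= 1; --e; }               // Normalize.
    x = sign | (e << 23) | ((mant & 0x3FFu) << 13);
  }
  float f;
  std::memcpy(&f, &x, sizeof(f));
  return f;
}

// Pack four floats into one RGBA16F element.
inline Half4 packRGBA16F(float r, float g, float b, float a)
{
  assert(sizeof(Half4) == 8); // Four 16-bit components, no padding expected.
  return Half4{ floatToHalfBits(r), floatToHalfBits(g), floatToHalfBits(b), floatToHalfBits(a) };
}
```

With a host pointer obtained from mapping the buffer, each element could then be written as something like dst[i] = packRGBA16F(r, g, b, a), assuming dst is that pointer cast to Half4*.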

Using RGBA16F data is going to reduce the bandwidth required to transfer such image data.
If you want to upload such buffer data to an OpenGL texture, that happens the same way as with RGBA32F data.
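
To put a number on the bandwidth saving, here is a small sketch of the per-pixel and per-frame sizes. The struct names and the frameBytes helper are hypothetical and only serve this illustration; the 1920x1080 resolution is just an assumed example size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical host-side layouts of one pixel in each format.
struct PixelRGBA16F { std::uint16_t r, g, b, a; }; // Four 16-bit half values.
struct PixelRGBA32F { float r, g, b, a; };         // Four 32-bit float values.

static_assert(sizeof(PixelRGBA16F) == 8,  "RGBA16F is expected to take 8 bytes per pixel");
static_assert(sizeof(PixelRGBA32F) == 16, "RGBA32F is expected to take 16 bytes per pixel");

// Number of bytes that have to be transferred for one full frame.
constexpr std::size_t frameBytes(std::size_t width, std::size_t height, std::size_t bytesPerPixel)
{
  return width * height * bytesPerPixel;
}

// Ratio check for an assumed 1920x1080 frame: RGBA16F moves half the data.
inline bool halfFormatHalvesTraffic()
{
  const std::size_t bytes16 = frameBytes(1920, 1080, sizeof(PixelRGBA16F));
  const std::size_t bytes32 = frameBytes(1920, 1080, sizeof(PixelRGBA32F));
  assert(bytes16 < bytes32);
  return bytes16 * 2 == bytes32;
}
```

For that frame size this works out to 16,588,800 bytes for RGBA16F versus 33,177,600 bytes for RGBA32F.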

Here’s some example code from my ray tracers which can switch the buffer format between RGBA32F and RGBA16F at compile time with the USE_FLOAT_32_BIT_OUTPUT define, either with OpenGL interoperability using pixel buffer objects or without:

if (m_interop) 
{
  glBindBuffer(GL_PIXEL_UNPACK_BUFFER, m_bufferOutput->getGLBOId());
#if USE_FLOAT_32_BIT_OUTPUT
  glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F, (GLsizei) m_width, (GLsizei) m_height, 0, GL_RGBA, GL_FLOAT, (void*) 0); // RGBA32F data from PBO byte offset 0.
#else
  glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA16F, (GLsizei) m_width, (GLsizei) m_height, 0, GL_RGBA, GL_HALF_FLOAT, (void*) 0); // RGBA16F data from PBO byte offset 0.
#endif
  glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);
}
else
{
  const void* data = m_bufferOutput->map();
#if USE_FLOAT_32_BIT_OUTPUT
  glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F, (GLsizei) m_width, (GLsizei) m_height, 0, GL_RGBA, GL_FLOAT, data); // RGBA32F
#else
  glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA16F, (GLsizei) m_width, (GLsizei) m_height, 0, GL_RGBA, GL_HALF_FLOAT, data); // RGBA16F
#endif
  m_bufferOutput->unmap();
}

yashiz · January 8, 2018, 10:36am

Thank you Detlef. Very helpful and clear.

Cheers :)
