OptiX denoiser output buffer data range

I’m using the OptiX denoiser in our OpenGL render application as the last step after ray tracing. The data format we have is 8-bit unsigned char. As stated in the OptiX 7.3 release notes, this format isn’t directly supported by the denoiser, so I’m manually converting the image to float by dividing by 255. So the input data range is 0.0 to 1.0.

However, on retrieving the output buffer, I see the data range has been enlarged to beyond 1.0. For some test images, the range seems to be 0.0 to 2.0. I need to convert this data back to unsigned char. If there is a way to find out the min and max of the output buffer, then I could simply cast the output buffer like this: (unsigned char)(255 * (image_pixels[i]-min)/(max-min)).
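For reference, here is a minimal sketch of the two conversions I mean (plain host code, assuming tightly packed data; the helper names are just illustrative):

    #include <algorithm>
    #include <cstdint>
    #include <vector>

    // 8-bit unsigned char -> float in [0.0, 1.0] for the denoiser input.
    std::vector<float> toFloat(const std::vector<uint8_t>& src)
    {
        std::vector<float> dst(src.size());
        for (size_t i = 0; i < src.size(); ++i)
            dst[i] = src[i] / 255.0f;
        return dst;
    }

    // float -> 8-bit unsigned char, remapping an arbitrary [min, max] range to [0, 255].
    std::vector<uint8_t> toUchar(const std::vector<float>& src, float minVal, float maxVal)
    {
        std::vector<uint8_t> dst(src.size());
        const float range = std::max(maxVal - minVal, 1e-6f); // avoid division by zero
        for (size_t i = 0; i < src.size(); ++i)
        {
            const float t = std::clamp((src[i] - minVal) / range, 0.0f, 1.0f);
            dst[i] = static_cast<uint8_t>(t * 255.0f + 0.5f);
        }
        return dst;
    }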

Could you tell me how to find out the output buffer’s expected data range? Or is there another way to do this conversion from float to unsigned char on the output buffer?

Thanks very much!

Could you tell me how to find out the output buffer’s expected data range?

There is no functionality inside the OptiX denoiser to calculate the minimum or maximum value of a color buffer.
It only has entry point functions that calculate the HDR intensity and the average color, which are needed to produce better results, especially for very dark or very bright inputs.
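For illustration, using those entry points looks roughly like this (a sketch only; error checking abbreviated, and the buffer/variable names are placeholders):

    // Sketch: compute the HDR intensity and average color of the noisy beauty layer.
    // inputLayer is the noisy OptixImage2D; d_scratch and sizes come from
    // optixDenoiserComputeMemoryResources.
    CUdeviceptr d_intensity = 0; // one float
    CUdeviceptr d_avgColor  = 0; // three floats
    cudaMalloc(reinterpret_cast<void**>(&d_intensity), sizeof(float));
    cudaMalloc(reinterpret_cast<void**>(&d_avgColor), 3 * sizeof(float));

    OPTIX_CHECK(optixDenoiserComputeIntensity(denoiser, stream, &inputLayer, d_intensity,
                                              d_scratch, sizes.withoutOverlapScratchSizeInBytes));
    OPTIX_CHECK(optixDenoiserComputeAverageColor(denoiser, stream, &inputLayer, d_avgColor,
                                                 d_scratch, sizes.withoutOverlapScratchSizeInBytes));

    // The results are handed to optixDenoiserInvoke via OptixDenoiserParams.
    OptixDenoiserParams params = {};
    params.hdrIntensity    = d_intensity;
    params.hdrAverageColor = d_avgColor; // only used by the AOV model kind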

Which denoiser mode did you use? (LDR, HDR, AOV)
https://raytracing-docs.nvidia.com/optix7/guide/index.html#ai_denoiser#nvidia-ai-denoiser

As stated in the OptiX 7.3 release notes, this format isn’t directly supported by the denoiser, so I’m manually converting the image to float by dividing by 255

Right. You mean 32-bit float? It’s recommended to use 16-bit half instead for better performance.
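For example, switching the denoiser input to half only changes the OptixImage2D description (a sketch, assuming a tightly packed RGBA buffer; converting your data to 16-bit half on the device is up to you):

    // Sketch: describing a half-precision RGBA buffer for the denoiser.
    // d_halfPixels is assumed to hold width * height * 4 16-bit half values on the device.
    OptixImage2D image = {};
    image.data               = d_halfPixels;
    image.width              = width;
    image.height             = height;
    image.pixelStrideInBytes = 4 * sizeof(unsigned short);   // 4 x 16-bit half
    image.rowStrideInBytes   = width * image.pixelStrideInBytes;
    image.format             = OPTIX_PIXEL_FORMAT_HALF4;     // OPTIX_PIXEL_FORMAT_FLOAT4 for 32-bit float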

Please always provide the following system configuration information when asking about OptiX issues:
OS version, installed GPU(s), VRAM amount, display driver version, OptiX (major.minor.micro) version, CUDA toolkit version (major.minor) used to generate the input PTX, host compiler version.

OS version: Windows 10
GPU: NVIDIA Quadro P4000
Driver: 466.11
CUDA: 11.3
OptiX: 7.3

Thanks for your answers. I’m sorry, the initial question I asked was actually caused by my own mistake. The input images I passed to the OptiX denoiser were not in the 0 to 1 range, therefore the output was out of range as well. I only found that out through debugging. But your answer did help: at the time I didn’t even realize I was using HDR mode even though my data is LDR.

Now I’ve fixed that problem by using the LDR denoiser mode and making sure the input data is inside the 0 to 1 range. I’ve also fixed the data type problem I had: we are now using floats as input and output. I will look into the 16-bit half option as you suggested.

I just ran into a few other questions; I hope you can help me again:

  1. In LDR mode, does the input noisy image (the beauty layer) need to be gamma corrected? How about the corresponding albedo image, does it need to be gamma corrected as well? It would be better for our code if gamma correction could happen after denoising, but if doing it the other way is better for denoising then I’ll make the effort to change that.

  2. For the optixDenoiserComputeMemoryResources() method, I’m only passing in the width and height of the framebuffer. It doesn’t seem to care about the 3rd dimension (i.e. the color channels)? In our case, there are 4 color channels (RGBA), so the sizes of the beauty/normal/albedo layers are all calculated as width * height * 4 * sizeof(float), while this method only takes width and height. I just want to double check that I’m using it correctly:
    memset(&_sizesDenoiser, 0, sizeof(OptixDenoiserSizes));
    OPTIX_CHECK(_api.optixDenoiserComputeMemoryResources(_denoiser, _width, _height, &_sizesDenoiser));

  3. If I choose to use the CUDA driver API instead of the CUDA runtime API, does that mean our customers will not need to install the CUDA 11.3 toolkit? I understand I still need to install it in my development environment, but in production can they go without installing the toolkit? They would only need the latest GPU driver to run the OptiX denoiser if I use the CUDA driver API. Correct?

Please read the OptiX Programming Guide chapter I linked to above again. Some of your questions are already answered in that. I will only answer the remaining ones.

How about the corresponding albedo image, does it need to be gamma corrected as well?

That’s a good question. It’s not mentioned inside the programming guide. The albedo should always be in linear color space.

It would be better for our code if gamma correction could happen after denoising, but if doing it the other way is better for denoising then I’ll make the effort to change that.

Then do not use the LDR denoiser; use the HDR denoiser, which works in linear color space. There actually hasn’t been a separate AI network for the LDR denoiser inside OptiX for quite some time anyway.
Please have a look into other posts about the OptiX denoiser here:
https://forums.developer.nvidia.com/search?q=Denoiser
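In OptiX 7.3 the model kind is selected when creating the denoiser; a minimal sketch (the guide layer flags are just an example):

    // Sketch: create an HDR-model denoiser in OptiX 7.3.
    OptixDenoiserOptions options = {};
    options.guideAlbedo = 1; // set to 0 if you do not provide an albedo guide layer
    options.guideNormal = 0;

    OptixDenoiser denoiser = nullptr;
    OPTIX_CHECK(optixDenoiserCreate(context, OPTIX_DENOISER_MODEL_KIND_HDR, &options, &denoiser));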

  1. For the optixDenoiserComputeMemoryResources() method, I’m only passing in the width and height of the framebuffer. It doesn’t seem to care about the 3rd dimension (i.e. the color channels)?

Right, that calculates the internal state and scratch memory sizes, which depend on the input dimensions and on the model and options that are part of the denoiser handle created with optixDenoiserCreate.
Please follow the examples inside the programming guide
https://raytracing-docs.nvidia.com/optix7/guide/index.html#ai_denoiser#allocating-denoiser-memory
and the optixDenoiser SDK example or the open-source examples you find in the sticky posts of this sub-forum.
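Sketched out, the usual sequence around optixDenoiserComputeMemoryResources looks like this (error checking abbreviated; variable names are placeholders):

    // Sketch: query the sizes, allocate state/scratch, and set up the denoiser
    // for the given input dimensions.
    OptixDenoiserSizes sizes = {};
    OPTIX_CHECK(optixDenoiserComputeMemoryResources(denoiser, width, height, &sizes));

    CUdeviceptr d_state   = 0;
    CUdeviceptr d_scratch = 0;
    cudaMalloc(reinterpret_cast<void**>(&d_state),   sizes.stateSizeInBytes);
    cudaMalloc(reinterpret_cast<void**>(&d_scratch), sizes.withoutOverlapScratchSizeInBytes);

    OPTIX_CHECK(optixDenoiserSetup(denoiser, stream, width, height,
                                   d_state,   sizes.stateSizeInBytes,
                                   d_scratch, sizes.withoutOverlapScratchSizeInBytes));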

  1. If I choose to use the CUDA driver API instead of the CUDA runtime API, does that mean our customers will not need to install the CUDA 11.3 toolkit?

Running OptiX applications does not require the installation of the CUDA Toolkit on the target system!
That is mentioned inside all OptiX SDK Release Notes.

If you’re using the CUDA runtime API and link dynamically, you need to ship the respective CUDA version’s runtime DLL with your application. If you’re linking it statically, it’s part of the executable.
If you use the CUDA driver API, you always link dynamically and the DLL ships with the display driver.

You would only need a CUDA development environment (the CUDA Toolkit) and a host compiler on the target system if you do anything with the CUDA compiler (NVCC) at runtime, like generating CUDA program code in some material editor and compiling it to PTX input source for OptiX. That in turn can be handled with the CUDA Runtime Compiler NVRTC, which can be shipped as two DLLs (compiler and built-ins), though that would still need the CUDA headers, which implies a CUDA Toolkit installation on the target system when compiling things at runtime.

Again, no CUDA Toolkit is required on the target system whatsoever. OptiX 7 is a header-only API and the implementation ships with the display drivers. The input language is PTX, which you can compile upfront and must ship with your application. The necessary DLLs for the CUDA runtime environment, if you use any, also need to be shipped with your application. The customer only needs a display driver version installed which supports the OptiX 7 API version you used to build your application.
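For illustration, the whole CUDA side of a denoiser application can run on the driver API alone, which is provided by the display driver (a minimal sketch):

    // Sketch: minimal CUDA driver API initialization. No CUDA Toolkit is needed on the
    // target system; the driver library (nvcuda.dll / libcuda.so) ships with the display driver.
    #include <cuda.h>

    CUdevice  device  = 0;
    CUcontext context = nullptr;
    CUstream  stream  = nullptr;

    cuInit(0);
    cuDeviceGet(&device, 0);
    cuCtxCreate(&context, 0, device);
    cuStreamCreate(&stream, CU_STREAM_DEFAULT);
    // ... create the OptixDeviceContext from this CUDA context and run the denoiser ...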

You have clarified so much for me! Thanks!

Can I double check my understanding of the normal buffer (as a guide layer besides albedo)? If I understand the programming guide correctly, the normals we pass to OptiX should be in the [-1.0, 1.0] range. When we visualize the normals in images, we can choose to either convert or clamp to [0.0, 1.0], but not for input to OptiX. Correct? (I think I’m doing it wrong at the moment: I’m passing the visualized image in [0.0, 1.0] to OptiX as the normal layer. I’ll need to fix that if this is not right.)

Yes, as the programming guide chapter 13.11.1 says, the surface normal vector components have values in the range [-1.0, 1.0] in camera space.

The explanation after that is just about the following two images inside the programming guide, which cannot visualize that correctly because the negative components cannot be shown. The brighter of the two is effectively how a scaled and biased normal map texture would look. Neither is the correct input for the denoiser.
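If your current buffer holds such scaled and biased values (components in [0.0, 1.0]), undoing that for the denoiser input is just a remap back to [-1.0, 1.0], for example:

    // Sketch: remap a visualized normal (n * 0.5 + 0.5, components in [0, 1])
    // back to camera-space components in [-1, 1]. Uses CUDA's float3/make_float3 vector types.
    float3 visualizedToDenoiserNormal(const float3 v)
    {
        return make_float3(v.x * 2.0f - 1.0f,
                           v.y * 2.0f - 1.0f,
                           v.z * 2.0f - 1.0f);
    }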

Example code calculating that from the usual pinhole UVW projection vectors can be found in one of my OptiX examples:
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/intro_denoiser/shaders/raygeneration.cu#L145

The normal buffer might not always help. It should improve the results for highly detailed geometry.
The albedo buffer is the more important channel to improve the denoising result over just the noisy RGB buffer.
Try all three combinations of RGB, RGB+albedo, RGB+albedo+normal buffers in your application and pick the one which works best.
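Wiring the layers for optixDenoiserInvoke is the same in all three cases; you simply leave out the guide images you do not use (and create the denoiser with the matching guideAlbedo/guideNormal options). A rough sketch:

    // Sketch: guide layers and the beauty layer for optixDenoiserInvoke (OptiX 7.3).
    // beautyIn, beautyOut, albedoImage, normalImage are OptixImage2D descriptions and
    // params is the OptixDenoiserParams filled elsewhere; leave albedo/normal
    // zero-initialized for the combinations that do not use them.
    OptixDenoiserGuideLayer guideLayer = {};
    guideLayer.albedo = albedoImage;   // RGB+albedo and RGB+albedo+normal
    guideLayer.normal = normalImage;   // RGB+albedo+normal only

    OptixDenoiserLayer layer = {};
    layer.input  = beautyIn;
    layer.output = beautyOut;

    OPTIX_CHECK(optixDenoiserInvoke(denoiser, stream, &params,
                                    d_state,   sizes.stateSizeInBytes,
                                    &guideLayer, &layer, 1,
                                    0, 0, // input offset x, y
                                    d_scratch, sizes.withoutOverlapScratchSizeInBytes));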

Fantastic advice! Thanks for the insights! Really appreciate it!