Black box rendering errors

Howdy, I’m getting black boxes in my renders. They only happen occasionally and can appear in different locations on the screen. I hit a similar problem a while back that was caused by passing in vertex normals that were NaN, but I trap those now and don’t think that’s the cause. The boxes don’t appear immediately after the render starts, but usually after minutes of accumulated frames. Can anybody tell me what might be the problem and how I might track it down?

GeForce RTX 3090 driver version 545.84
OptiX 8.0
Cuda 12.3
Microsoft Visual Studio Community 2022 (64-bit)
Version 17.7.6
Windows 11 Pro

If this happens when using the OptiX denoiser, a similar issue has been reported before which was caused by incorrect values for the OptiX denoiser:

If that’s not it, the usual debugging steps would need to be tried first.

1.) Add an exception program to your OptiX pipeline and enable all available exception flags.
Check if there are any exceptions.

2.) Check if any of the output values inside your output buffers are NaN, negative, or INF.
Example code doing that:

3.) Enable debug level 4 and the validation mode inside the OptixDeviceContextOptions and check if OptiX complains about anything.
Enabling validation mode adds synchronization calls (and makes things slower). If that solves the problem there might be some synchronization call missing in your application, like something copying data which isn’t finished rendering yet.
Make sure your CUDA streams are used consistently with asynchronous OptiX API calls (== all functions which take a CUDA stream argument).

4.) Make sure all your per-ray data is correctly initialized before calling optixTrace.

5.) Try isolating a launch index where this happens and try to find what value the corruption actually is by adding CUDA printf() instructions for only that launch index into your device code which leads to that faulty value.
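
As a rough illustration of steps 2 and 5, the following is a minimal host-side sketch, assuming the output buffer has already been copied from the device into a row-major RGBA float array. The names BadPixel and findFirstBadPixel are hypothetical and not part of OptiX or CUDA. It returns the first pixel containing a NaN, INF, or negative component, which gives a launch index to target with device-side printf().

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Result of scanning a host copy of the output buffer (hypothetical helper type).
struct BadPixel
{
    bool     found   = false; // true if any invalid component was seen
    unsigned x       = 0;     // launch index x of the first invalid pixel
    unsigned y       = 0;     // launch index y of the first invalid pixel
    int      channel = 0;     // 0..3 == R, G, B, A
    float    value   = 0.0f;  // the offending value
};

// Scans a row-major RGBA float image (4 floats per pixel) for NaN, INF,
// or negative components and returns the first one found.
inline BadPixel findFirstBadPixel(const std::vector<float>& rgba,
                                  unsigned width, unsigned height)
{
    // The buffer must hold exactly width * height RGBA pixels.
    assert(rgba.size() == static_cast<std::size_t>(width) * height * 4);

    BadPixel result;
    for (std::size_t i = 0; i < rgba.size(); ++i)
    {
        const float v = rgba[i];
        if (std::isnan(v) || std::isinf(v) || v < 0.0f)
        {
            const std::size_t pixel = i / 4;
            result.found   = true;
            result.x       = static_cast<unsigned>(pixel % width);
            result.y       = static_cast<unsigned>(pixel / width);
            result.channel = static_cast<int>(i % 4);
            result.value   = v;
            return result;
        }
    }
    return result;
}
```

Running this on each of the denoiser inputs (beauty, albedo, normal) as well as on the denoised output should help narrow down which buffer goes bad first.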

OptixDenoiserParams denoiserParams = {} seems to have fixed the problem. Thanks for the quick help!

Oops. I spoke too soon. They are still there, I just didn’t wait long enough for them to appear. I updated to driver version 546.17 and it did not help. I’ll try your debugging steps.

Have you isolated first if they come from the denoiser or the renderer?
That is, does the corruption disappear when not using the OptiX denoiser?

If you have occurrences where you didn’t default-initialize OptiX structures, please go through your whole source code and verify that all OptiX structures are default initialized. Otherwise there can always be issues when changing OptiX SDK versions which added new fields to API structures.
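
To show why the empty-brace initialization matters, here is a small sketch using a hypothetical stand-in structure. ExampleParams and makeParams are made up for illustration and are not OptiX types or functions. The point is that `= {}` zeroes every member, including ones the application never touches because they were added in a later SDK version.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an SDK structure. This is NOT a real OptiX type.
struct ExampleParams
{
    std::uint64_t hdrIntensity;      // device pointer, 0 == "not provided"
    float         blendFactor;
    std::uint64_t hdrAverageColor;   // imagine this field was added in a newer SDK
    unsigned      usePreviousLayers; // imagine this field was added in a newer SDK
};

// Zeroes the whole structure first, then sets only the field the
// application knows about. Fields it does not know about stay zero.
inline ExampleParams makeParams(float blendFactor)
{
    assert(blendFactor >= 0.0f && blendFactor <= 1.0f);

    ExampleParams params = {}; // value-initialization zeroes all members
    params.blendFactor = blendFactor;
    return params;
}
```

Without the `= {}`, the fields the application never assigns would hold indeterminate values, which is the kind of input that can produce intermittent corruption.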

No black boxes with the denoiser disabled. I only need to initialize OptixDenoiserParams once before creating the denoiser, correct? With the denoiser enabled, if I wait long enough the black boxes will almost completely cover the render.

OptixDenoiserModelKind denoiserModel = OPTIX_DENOISER_MODEL_KIND_HDR;
OptixDenoiserParams denoiserParams = {};
OptixDenoiserOptions denoiserOptions = {};

optixu::GuideAlbedo useAlbedo = optixu::GuideAlbedo::Yes;
optixu::GuideNormal useNormal = optixu::GuideNormal::Yes;
optixu::Denoiser denoiser = ctx->optCtx.createDenoiser(
    denoiserModel, useAlbedo, useNormal, OPTIX_DENOISER_ALPHA_MODE_COPY);

Yes, they need to be initialized once before calling optixDenoiserCreate().

Other than that, I have no idea what your code is doing. I neither know your optixu namespace functions nor can I see if your denoiserParams and denoiserOptions are even used.
Your createDenoiser() function is not matching the optixDenoiserCreate() function signature, so that isn’t a direct wrapper.
You have the parameters there which you would normally set inside the OptixDenoiserOptions directly, so does that happen inside the createDenoiser() function and does that initialize a potentially local structure again correctly?

How do you initialize the denoiserParams fields hdrIntensity, hdrAverageColor, blendFactor, temporalModeUsePreviousLayers?

The OptiX SDK optixDenoiser example should show the correct setup.

You might also want to change your implementation to the newer AOV denoiser modes once you have found the problem.

Looking inside the original wrapper library code you’re using, the structures seem to be correctly initialized before the optixDenoiserCreate() and optixDenoiserInvoke() calls, respectively.

OptixDenoiserOptions options = {};
OptixDenoiserParams params = {};

So that would make your local version of the structures unused?

OptixDenoiserParams denoiserParams = {};
OptixDenoiserOptions denoiserOptions = {};

Please verify that you’re on a source code commit version of that wrapper library which contains all necessary denoiser initialization steps.

It would be interesting to see if adding a synchronization after the optixDenoiserInvoke() inside the wrapper library fixes the issue.

Please check if the corrupted rectangles match a tiled denoising partitioning.
I have not followed how that library is handling that splitting into DenoisingTasks.
Check if that calls optixDenoiserInvoke() more than once per image, although it might not be necessary for your image resolution.
If it does, try letting it denoise the whole image at once.
If that solves the black rectangles, check the tiling approach inside the wrapper library for synchronization errors.

Yes, sorry I forgot to mention that I’m using the OptiX_Utility wrapper. I checked and I am using the latest release on GitHub. Adding a synchronization after the optixDenoiserInvoke() inside the wrapper library did not fix the corruption although it seems to take longer to appear now. My last test took 5 minutes of rendering before the corruption appeared.

My output resolution is 1423 x 800 and I am not using the denoiser tiling options. Thanks for the help.

Yes, there must be a problem in my code. I am getting a bunch of NaNs for payload->contribution in my closest hit program.

The NaN rendering error is fixed but the corruption still happens if the denoiser is enabled. I switched to the AOV denoiser model but that didn’t help, nor did compiling to PTX instead of OptiX-IR. Next I’m going to try the temporal denoiser since that’s probably best for my project anyway.

The original error in the linked post was caused by uninitialized hdrIntensity or hdrAverageColor values.

I would start by analyzing all OptixDenoiserParams values first before changing the denoiser mode.
Try hardcoding the hdrIntensity and hdrAverageColor values vs. letting the denoiser calculate these (CUdeviceptr == 0).
For the HDR denoiser, hardcode the hdrIntensity (single float) and set the hdrAverageColor pointer to zero.
For the AOV denoiser, set the hdrIntensity pointer to zero and hardcode hdrAverageColor (float3) values.
If that changes the behavior, check if the calculated input values are reasonable during the run.

The HDR denoiser expects input beauty images with color components in the range [0, 10,000].
The albedo buffer components must be in the range [0, 1].
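
The two ranges above can be checked on the host with something like the following sketch. It assumes the beauty and albedo buffers have been copied from the device into plain float arrays. The helper names countOutOfRange, countBadBeauty, and countBadAlbedo are hypothetical and not part of OptiX.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Counts components that are NaN or fall outside the closed range [lo, hi].
inline std::size_t countOutOfRange(const std::vector<float>& values,
                                   float lo, float hi)
{
    assert(lo <= hi);

    std::size_t count = 0;
    for (float v : values)
    {
        if (std::isnan(v) || v < lo || v > hi)
            ++count;
    }
    return count;
}

// Beauty components are expected in [0, 10000] for the HDR denoiser.
inline std::size_t countBadBeauty(const std::vector<float>& beauty)
{
    return countOutOfRange(beauty, 0.0f, 10000.0f);
}

// Albedo components are expected in [0, 1].
inline std::size_t countBadAlbedo(const std::vector<float>& albedo)
{
    return countOutOfRange(albedo, 0.0f, 1.0f);
}
```

A non-zero count on either buffer during a run that later shows black boxes would point at the renderer inputs rather than at the denoiser itself.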

If you’re using multiple inputs like RGB+albedo+normal try reducing the number of inputs to RGB+albedo and RGB to see if the behavior changes.

I found the problem. Pilot error as usual. :)

I hardcoded the hdrIntensity but any value I set had no effect. But when I stopped inputting the first normal, the corruption stopped. That made me examine my code more closely and I discovered that I was setting the “first hit normal” to the most recent normal by mistake.
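
For anyone hitting the same thing, the fix boils down to writing the guide normal only on the first hit of a path. Below is a simplified host-side sketch of that logic. Float3, PerRayData, recordHit, and guideNormalForPath are hypothetical names for illustration, not my actual device code and not part of OptiX.

```cpp
#include <cassert>
#include <vector>

struct Float3 { float x, y, z; };

// Hypothetical per-ray data; only the fields relevant to the bug are shown.
struct PerRayData
{
    Float3 firstHitNormal = {0.0f, 0.0f, 0.0f};
    bool   hasFirstHit    = false;
};

// Called once per surface hit along a path. The bug was the equivalent of
// overwriting firstHitNormal on every hit instead of only on the first one.
inline void recordHit(PerRayData& prd, const Float3& shadingNormal)
{
    // A zero-length normal would indicate a problem further upstream.
    assert(shadingNormal.x != 0.0f || shadingNormal.y != 0.0f || shadingNormal.z != 0.0f);

    if (!prd.hasFirstHit)
    {
        prd.firstHitNormal = shadingNormal;
        prd.hasFirstHit    = true;
    }
}

// Walks a path given the normal at each bounce and returns the normal
// that would be written to the denoiser's normal guide buffer.
inline Float3 guideNormalForPath(const std::vector<Float3>& hitNormals)
{
    PerRayData prd;
    for (const Float3& n : hitNormals)
        recordHit(prd, n);
    return prd.firstHitNormal;
}
```

With the guard in place, later bounces no longer change the normal guide, so it stays consistent with the first-hit surface.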

Thanks for the excellent support as usual. Sorry to waste your time.
