Shader breaks when adding float array of size 1 to trace ray program in debug mode with OPTIX_EXCEPTION_FLAG_STACK_OVERFLOW

I recently went through Ingo’s optix7course GitHub repo. My version is here: https://github.com/loshjawrence/optix

It has a few updates to make the CMake build work out of the box (i.e. just open the root directory with MSVC and go).

One issue I had was with the last example, example 12. With a working devicePrograms.cu (basically in the same state as the version from example 11), adding a variable like float jawn[1]; to the trace ray program broke it: it still compiled, but the rendered result was a black screen.

I tracked it down to OPTIX_EXCEPTION_FLAG_STACK_OVERFLOW getting OR’d into pipelineCompileOptions.exceptionFlags when compiling for x64-Debug.

Not sure why this might be. The code should be in a reproducible state (obviously not a minimal repro example, however). The line to comment/uncomment to make it work/not work is line 314 in SampleRenderer.cpp. OR’ing in the OPTIX_EXCEPTION_FLAG_STACK_OVERFLOW flag makes it break.

If you uncomment line 314 in SampleRenderer.cpp and get it to break, you can get it working again by commenting out the float jawn[1]; on line 255 of devicePrograms.cu, or by changing it from float jawn[1]; to just float jawn;

I’m using MSVC 2022 version 17.9.7, OptiX 8, CUDA 12.5, RTX 3090, GPU driver 555.85, Windows 11 Pro 23H2 22631.3672.

EDIT: If you end up trying to repro, you need to do a full rebuild of example 12 every time you change its .cu files.

When you hit a stack overflow exception with OPTIX_EXCEPTION_FLAG_STACK_OVERFLOW enabled, the first thing to look at is the optixPipelineSetStackSize call.
Your code uses hardcoded values there: https://github.com/loshjawrence/optix/blob/main/example12_denoiseSeparateChannels/SampleRenderer.cpp#L439

That is simply the wrong approach and is similarly broken inside the original code.

The correct approach is to query optixProgramGroupGetStackSize at runtime for every program group in the current pipeline and then compute the required stack space from those results.

The OptiX SDK provides helper functions for this inside the optix_stack_size.h header and shows how to use them inside various OptiX SDK examples. You could also program that yourself.
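
To give a rough idea of what those helpers do, here is a standalone C++ sketch that does not need the OptiX headers. StackSizes is a stand-in that mirrors the fields of OptixStackSizes, and accumulate, computeStackSizes and PipelineStack are hypothetical names made up for this illustration. The arithmetic is modeled on optixUtilAccumulateStackSizes and optixUtilComputeStackSizes as I understand them, so treat it as an approximation and check optix_stack_size.h in your SDK version for the authoritative formula.

```cpp
#include <algorithm>
#include <cassert>

// Stand-in for OptixStackSizes (same field names); all values are in bytes.
struct StackSizes {
    unsigned cssRG = 0;  // continuation stack: raygen
    unsigned cssMS = 0;  // continuation stack: miss
    unsigned cssCH = 0;  // continuation stack: closest hit
    unsigned cssAH = 0;  // continuation stack: any hit
    unsigned cssIS = 0;  // continuation stack: intersection
    unsigned cssCC = 0;  // continuation stack: continuation callable
    unsigned dssDC = 0;  // direct stack: direct callable
};

// Keep the per-field maximum over all program groups. In real code each
// `group` would be filled in by optixProgramGroupGetStackSize.
void accumulate(StackSizes& total, const StackSizes& group) {
    total.cssRG = std::max(total.cssRG, group.cssRG);
    total.cssMS = std::max(total.cssMS, group.cssMS);
    total.cssCH = std::max(total.cssCH, group.cssCH);
    total.cssAH = std::max(total.cssAH, group.cssAH);
    total.cssIS = std::max(total.cssIS, group.cssIS);
    total.cssCC = std::max(total.cssCC, group.cssCC);
    total.dssDC = std::max(total.dssDC, group.dssDC);
}

// The three sizes that would be passed on to optixPipelineSetStackSize.
struct PipelineStack {
    unsigned directFromTraversal;
    unsigned directFromState;
    unsigned continuation;
};

// Upper bound for a pipeline with the given maximum trace depth and
// maximum continuation/direct callable depths (assumed formula, see above).
PipelineStack computeStackSizes(const StackSizes& s, unsigned maxTraceDepth,
                                unsigned maxCCDepth, unsigned maxDCDepth) {
    // Continuation stack used by a call tree of continuation callables.
    const unsigned ccTree = maxCCDepth * s.cssCC;
    // Closest hit or miss program plus the callables it may invoke.
    const unsigned chOrMs = std::max(s.cssCH, s.cssMS) + ccTree;

    const unsigned continuation =
        s.cssRG + ccTree
        + (std::max(maxTraceDepth, 1u) - 1) * chOrMs
        + std::min(maxTraceDepth, 1u) * std::max(chOrMs, s.cssIS + s.cssAH);

    const unsigned direct = maxDCDepth * s.dssDC;
    return {direct, direct, continuation};
}
```

With made-up numbers (raygen 128 bytes, miss 32, closest hit 64, any hit 16, max trace depth 2, no callables) this gives a continuation stack of 256 bytes. If a debug build pushes the closest hit program to 96 bytes, the same calculation gives 320, which is how a hardcoded value that happened to work in release can overflow in debug. In the real pipeline the resulting sizes go into optixPipelineSetStackSize together with the maximum traversable graph depth.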

The OptiX stack size affects the memory requirements and performance of your OptiX pipeline, which is why it should be calculated as precisely as possible.
Note that debug device code and the OptixModuleCompileOptions debugLevel influence the required stack size: debug settings require more!

What probably happened is that the float array you added pushed the stack usage over the defined stack size in debug mode because it was allocated in local memory, while the simple float variable might have been kept in a register.

In any case, whenever you run into OPTIX_EXCEPTION_FLAG_STACK_OVERFLOW, correct your OptiX stack size.

I don’t have a development environment set up to build your project, and your example12_denoiseSeparateChannels code is vastly different from the original, so I didn’t try merging things into the original example code which I had running. There was also an unresolved ImGui dependency.

If you experience problems with any of the original optix7course examples, please raise them as issues in the GitHub repository. These are Ingo Wald’s personal examples (note the copyright).

Awesome, thank you very much, and thanks for the thorough explanation. I should have searched for “stack” in the program; then I would have seen the call you mentioned and put 2 and 2 together. I’m glad I asked, though, since now I know the best-practice way of getting those values, as opposed to bumping the upper limit every time there’s an issue.