Inconsistency when using getStackSize / setStackSize functions

Hi,

I’m using OptiX 4.1.0 with CUDA 8 and NVIDIA driver 382.05 on a GTX 980 Ti, and have noticed the following issue:

I have to force the stack size setting, otherwise I get problems during rendering: by default, not all geometry is traversed. However, if I do:

context->setStackSize( context->getStackSize() );

I don’t have the problem anymore!

I further noticed that if I query the stack size before setting it, I get “5120”. However, the rendering result looks similar to the result when I manually set the stack size to 1024, so I assume there is a problem with the default stack size handling.

What is most annoying is that this problem appears only on my GTX 980 Ti. I have tried a GTX 970, a Maxwell-based Titan X, a GTX 1060, a GTX 1080 and even a mobile 1070, and didn’t get the issue. My original test measured free CUDA VRAM after the OptiX launch call (cudaMemGetInfo). I originally had a pretty large stack size (24000), and on the 980 Ti there was 0 free VRAM left after the OptiX launch; on all the other GPUs I mentioned, there was still VRAM left after the launch with the same settings. So I think there might be another problem with memory allocation during launch, not only the incorrect default stack size.

Finally, I solved the issue by reducing the stack size to ~4000, but I’m still puzzled whether I misunderstand something or whether there really is an issue with the stack set/get and memory allocation functions (btw, I didn’t have the issue with OptiX 3.8.0).

Any explanations would be helpful!

Thanks,
Maxim

Right, the OptiX 4.x stack size behavior has changed compared to previous versions, and this is a known issue.
The background was to allow replacing the optix.1.dll in existing applications with a newer 4.x version and still have them run. For that to work, the developer-defined stack size needed to be increased internally, and that is what you’re seeing in the getStackSize() result.

Doing context->setStackSize( context->getStackSize() ); will most likely result in a stack size that is too big.
Instead, the recommended solution is to just call setStackSize() with the minimal required size for each application and OptiX version you develop with.

Please have a look at this thread on how to determine a minimal working stack size for your application:
https://devtalk.nvidia.com/default/topic/1004649/?comment=5130084
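
As a rough illustration of what "minimal working stack size" means in practice, here is a small C++ sketch that bisects for the smallest size that still works. It is not necessarily the procedure from the linked thread. The function name findMinimalStackSize and the rendersCorrectly callback are hypothetical. The sketch also assumes that once a given stack size works, every larger size works too.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>

// Hypothetical helper, not part of OptiX.
// Returns the smallest candidate stack size in [lo, hi] for which
// rendersCorrectly(size) is true. Candidates are lo, lo + step, lo + 2*step, ...
// with the last candidate clamped to hi.
// Assumption: if a size works, every larger size works as well.
// Returns 0 if the arguments are invalid or even hi does not work.
std::size_t findMinimalStackSize(
    std::size_t lo, std::size_t hi, std::size_t step,
    const std::function<bool(std::size_t)>& rendersCorrectly)
{
    if (step == 0 || lo > hi || !rendersCorrectly(hi)) {
        return 0;
    }

    // Number of steps needed to reach (or pass) hi starting from lo.
    const std::size_t n = (hi - lo + step - 1) / step;
    auto candidate = [&](std::size_t k) { return std::min(lo + k * step, hi); };

    // Invariant: candidate(high) is known to work.
    std::size_t low = 0;
    std::size_t high = n;
    while (low < high) {
        const std::size_t mid = low + (high - low) / 2;
        if (rendersCorrectly(candidate(mid))) {
            high = mid;      // mid works, try smaller sizes
        } else {
            low = mid + 1;   // mid fails, need a larger size
        }
    }
    return candidate(high);
}
```

In a real application the callback would call context->setStackSize( size ), launch, and report whether the result is correct, for example by checking that no stack overflow exception was raised. How to detect that is application specific and left open here.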

You see different behavior among GPUs simply because the stack size applies per GPU core, which means bigger GPUs use more memory for stacks. Coupled with different VRAM sizes, that can result in excessive VRAM usage. That’s why it’s always recommended to use the minimal necessary stack size.
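
To get a feeling for the scaling, here is a back-of-the-envelope C++ sketch. The linear model (one stack per concurrently resident thread, stack size in bytes) is an assumption for illustration, not OptiX’s actual allocation strategy. The helper estimateStackBytes and the thread count used in the note below are hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper, not part of OptiX.
// Assumed model: every thread that holds a stack at the same time gets
// stackSizeBytes of memory, so the total grows linearly with both the
// stack size and the number of such threads.
std::uint64_t estimateStackBytes(std::uint64_t stackSizeBytes,
                                 std::uint64_t concurrentThreads)
{
    return stackSizeBytes * concurrentThreads;
}
```

With a made-up count of 45056 concurrent threads, a stack size of 24000 gives 1,081,344,000 bytes under this model, while 4000 gives 180,224,000 bytes. The real numbers depend on the GPU and on OptiX internals, but the point is that the cost multiplies with the size of the GPU.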

This is all planned to be resolved more cleanly with a different API method in the future.

Thank you for the explanations! The minimum stack size setting procedure worked well!

I guess I was mostly confused by the error showing up only on my particular GPU, as I thought I had tested a good variety of GPUs, and only had issues on mine.

What was also weird (and completely unrelated to the topic) is that I somehow missed the release of OptiX 4.1, which is much more performant and more stable than 4.0.2 (which I actually used for debugging this issue).

I’ve recently stumbled onto this myself using OptiX 4.0.2. Could you provide an estimate of which API version will likely address this?