I tried to replace optixTrace with optixTraverse and optixInvoke in my ray tracing project.
The program worked with optixTrace in both Release and Debug mode. With optixTraverse and optixInvoke it only runs in Release mode.
In Debug mode, the first pixels are calculated, but then the program hangs. It looks like it does not return from optixTraverse or optixInvoke, and the GPU is running at 100%.
I use driver 572.70 and OptiX 9.0.
Hi @bvb70,
Do you have a way to share a minimal complete reproducer? How are you setting debug mode? What CUDA toolkit version are you using? Is your trace/traverse call inside a conditional block? Do you use optixReorder when calling optixTraverse?
I tried using the same driver and OptiX 9 on the optixPathTracer sample, which calls optixTraverse() by default, and it seems to execute just fine (though very slow). To set debug mode, I toggled on the CMake variable OPTIX_DEBUG_DEVICE_CODE. In the OptiX SDK, using Visual Studio’s debug build setting will only affect host code compilation and will not really affect how device code is compiled.
Does your kernel run slowly in debug with optixTrace? Normally device code compiled in debug will execute extremely slowly; this is to be expected. For example, optixPathTracer runs at less than one frame per second in debug, compared to hundreds of fps in release. Just checking if this could be a case of your debug kernel taking, say, 1 minute, rather than 1 second?
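One way to tell a very slow debug kernel from a real hang is to poll for completion on the host with a generous timeout instead of blocking indefinitely. The sketch below is plain C++ and not part of OptiX or CUDA: waitWithTimeout and WaitResult are hypothetical names, and the completion check is passed in as a callback so the helper stays independent of any GPU API.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper types for this sketch (not OptiX or CUDA API).
enum class WaitResult { Finished, TimedOut };

// Polls `isDone` until it returns true or `timeout` has elapsed.
// Returns Finished if the work completed in time, TimedOut otherwise.
WaitResult waitWithTimeout(const std::function<bool()>& isDone,
                           std::chrono::milliseconds timeout,
                           std::chrono::milliseconds pollInterval =
                               std::chrono::milliseconds(10))
{
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (!isDone()) {
        if (std::chrono::steady_clock::now() >= deadline) {
            return WaitResult::TimedOut;  // still running after the deadline
        }
        std::this_thread::sleep_for(pollInterval);
    }
    return WaitResult::Finished;
}
```

In an OptiX application the callback could be something like cudaStreamQuery(stream) != cudaErrorNotReady, called after optixLaunch on the same stream. That wiring is an assumption and is not shown here. A timeout of several minutes would be reasonable for a debug build.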
–
David.
BTW, an extra note of caution when testing this kind of thing, since I just bumped into it myself - the shader cache can make it hard to understand when things are working correctly. It’s easy to switch configs from debug to release, and have the app load an old cached debug shader when you expected a release shader, or vice-versa.
It may be worth using optixDeviceContextSetCacheEnabled(..., false) to disable your shader cache while you investigate this issue.
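As a rough illustration of that suggestion, here is a minimal host-side sketch that creates a context and switches the disk cache off right away. It assumes the OptiX SDK headers and the CUDA runtime are available; createContextForTriage and logCallback are hypothetical names, and error handling is reduced to returning nullptr.

```cpp
#include <cstdio>

#include <cuda_runtime.h>
#include <optix.h>
#include <optix_function_table_definition.h>  // include in exactly one .cpp file
#include <optix_stubs.h>

// Hypothetical log callback: prints OptiX messages to stderr.
static void logCallback(unsigned int level, const char* tag,
                        const char* message, void* /*cbdata*/)
{
    std::fprintf(stderr, "[%u][%s] %s\n", level, tag, message);
}

// Hypothetical helper: creates a context with the shader cache disabled,
// so every run recompiles the modules with the current settings.
OptixDeviceContext createContextForTriage()
{
    cudaFree(0);  // make sure the CUDA runtime has a current context
    if (optixInit() != OPTIX_SUCCESS) return nullptr;

    OptixDeviceContextOptions options = {};  // zero-init: validation stays off
    options.logCallbackFunction = &logCallback;
    options.logCallbackLevel    = 4;  // most verbose level

    CUcontext cuCtx = nullptr;  // null means "use the current CUDA context"
    OptixDeviceContext context = nullptr;
    if (optixDeviceContextCreate(cuCtx, &options, &context) != OPTIX_SUCCESS)
        return nullptr;

    // Disable the disk cache while investigating (0 = disabled).
    if (optixDeviceContextSetCacheEnabled(context, 0) != OPTIX_SUCCESS) {
        optixDeviceContextDestroy(context);
        return nullptr;
    }
    return context;
}
```

Disabling the cache makes every start-up pay the full compile cost, so this is only meant for the duration of the investigation.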
–
David.
It is difficult to share a minimal reproducer. I used the path tracer example from the SDK and parts of the optix7 course examples as templates, but so far I could not reproduce the problem in one of the examples.
I use CUDA 12.8.
The trace/traverse call is not inside a conditional block.
It is happening with and without optixReorder.
I noticed that I had commented out the line
options.validationMode = OPTIX_DEVICE_CONTEXT_VALIDATION_MODE_ALL;
because the performance was too slow. Without the VALIDATION_MODE_ALL option, the program runs with optixTrace() (at 1/25 of the release-mode frame rate).
The other debug options I use are
module_compile_options.optLevel = OPTIX_COMPILE_OPTIMIZATION_LEVEL_0;
module_compile_options.debugLevel = OPTIX_COMPILE_DEBUG_LEVEL_FULL;
The program with optixTrace() is also running with
options.validationMode = OPTIX_DEVICE_CONTEXT_VALIDATION_MODE_OFF;
but it freezes when I use
options.validationMode = OPTIX_DEVICE_CONTEXT_VALIDATION_MODE_ALL;
With optixTraverse() and optixInvoke() it always freezes in debug mode, regardless of whether options.validationMode is left unset, set to MODE_OFF, or set to MODE_ALL.
By freezing I mean that there is no response after several minutes, whereas it runs at 114 fps in release mode and at 4.1 fps in debug mode (with optixTrace).
I changed to a simpler scene and a smaller image size: here the first frames are calculated, but the program freezes when I change the camera viewing direction.
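For reference, the settings described in this post can be gathered into two small helpers so that device debug code and validation are toggled independently. This is only a sketch that assumes the OptiX headers are available; makeModuleCompileOptions and makeContextOptions are hypothetical names, and the release values are an assumption, since the post only lists the debug ones.

```cpp
#include <optix.h>

// Hypothetical helper: module compile options for debug or release device code.
OptixModuleCompileOptions makeModuleCompileOptions(bool deviceDebug)
{
    OptixModuleCompileOptions opts = {};
    if (deviceDebug) {
        // Values quoted in the post above.
        opts.optLevel   = OPTIX_COMPILE_OPTIMIZATION_LEVEL_0;
        opts.debugLevel = OPTIX_COMPILE_DEBUG_LEVEL_FULL;
    } else {
        // Assumed release settings, not taken from the post.
        opts.optLevel   = OPTIX_COMPILE_OPTIMIZATION_DEFAULT;
        opts.debugLevel = OPTIX_COMPILE_DEBUG_LEVEL_NONE;
    }
    return opts;
}

// Hypothetical helper: context options with validation as a separate switch.
OptixDeviceContextOptions makeContextOptions(bool validate)
{
    OptixDeviceContextOptions opts = {};
    opts.validationMode = validate
        ? OPTIX_DEVICE_CONTEXT_VALIDATION_MODE_ALL
        : OPTIX_DEVICE_CONTEXT_VALIDATION_MODE_OFF;
    return opts;
}
```

Keeping the two switches separate makes it easier to report exactly which combination freezes, for example device debug on with validation off.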
This sounds tricky and frustrating to debug! :P Which GPU are you using?
We’re not aware of any issues with optixTraverse() causing freezing in any situations, and we use it and test it in many different situations. If it is happening, we will absolutely take it seriously and fix it, but since we don’t see it, we need to figure out how to reproduce the issue in-house. There is of course the possibility this is some kind of misconfiguration or race condition in your application code that has misleading symptoms that make it look like optixTraverse is freezing, so we should work to rule that out completely.
Validation mode in OptiX is indeed slow, and it is a debug feature not intended to be used for release or production code. Do make sure to turn it off when not needed. The same goes for device debug code: it will be very slow and should only be used while debugging.
Here are a few triage ideas I can think of to start with:
- CUDA_LAUNCH_BLOCKING=1 (environment variable)
- CUDA toolkit 12.7 (or even 12.0)
- Driver 572.83 (or earlier than 570 if you have a Turing through Ada GPU)
- Toggle between PTX and OptiX-IR
- Disable features until things work, find out what is different in your app vs SDK samples. It could help to find out if this happens only when (for example) reflections or shadows are turned on, as opposed to primary rays. Or if this only happens in the presence of certain shaders.
By the way, how are you isolating the optixTraverse() call?
–
David.
How do I turn off the validation mode for optixTraverse()?
From my experience I would say that
options.validationMode = OPTIX_DEVICE_CONTEXT_VALIDATION_MODE_OFF;
is not working when I use optixTraverse()?!
Using OPTIX_DEVICE_CONTEXT_VALIDATION_MODE_OFF is sufficient. And validation mode is opt-in: the default is off, so if you don’t turn it on, it will be off. If your freezing is still happening with validation mode off, I guess that just means the freezing is not related to validation mode. I assume validation mode is not giving you any warning or error messages?
The triage tests I mentioned above should all be done with validation off. Let me know if any of those cause a change in behavior, or if you have a way to share a reproducer so we can debug.
–
David.