I’m having problems traversing a single GAS on Quadro RTX cards. When I build my acceleration structures, I only wrap them in an IAS if I have more than one GAS; otherwise I trace with just the GAS handle. (As a first question, I guess: does this actually improve performance, or should I just wrap my single GAS in an IAS even if there is no transform or motion blur?)
I’ve now tried this on a couple of different machines. On
GeForce GTX 1050 Ti with Max-Q Design,
Quadro P4000
it works fine and the hits are reported as expected, both with a single GAS handle and with multiple GAS wrapped in an IAS. But when I run this on
Quadro RTX 4000,
Quadro RTX 6000,
Quadro RTX 8000
cards the single GAS handle doesn’t report any hits, only when I wrap it in an IAS. I’ve tried with both OptiX 7.1 and 7.2 on a couple of different 450 drivers as well as with the new 460.89.
An easy thing to check is whether these two SDK samples are running for you: optixPathTracer and optixCutouts.
The optixPathTracer sample runs a single GAS. The optixCutouts sample runs with a GAS wrapped in an IAS.
The most likely scenario is that there is a setting or configuration option somewhere that is causing your failures. These two samples can provide a reference point for the setting you might be missing.
Using a single GAS can be a little faster than using an IAS+GAS. Whether it’s worth having a 2nd code path to support the single GAS is up to you, but it does complicate things a little, and might not be worth it if you typically have the multi-GAS case.
But when I run this on [RTX] cards the single GAS handle doesn’t report any hits only when I wrap it in an IAS.
I am not sure I understood correctly – it is the instancing case on RTX cards that is failing for you? Everything works correctly when you use a single GAS, and fails for multiple GAS + 1 IAS?
Have you checked your SBT indexing to make sure it’s not a case of a bad index? When the SBT index is wrong, null pointers in the SBT are silently skipped (since this is a valid use case) so it can look like missing geometry as the closest hit program will not be called. You can check whether intersection is actually occurring by rendering your raygen program’s clock cycles to the screen, or by using a custom intersection program instead of OptiX triangles. Another test would be to put intentionally bad data in your SBT and make sure your program crashes.
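To make the SBT check concrete, here is a minimal sketch of the hit group record lookup as described in the OptiX Programming Guide: instance offset, plus GAS index times the stride passed to optixTrace, plus the offset passed to optixTrace. The struct and function names are made up for illustration and are not part of the OptiX API. When you trace a bare GAS there is no instance, so the instance offset term is 0.

```cpp
#include <cstdint>

// Hypothetical helper type: the four terms that select a hit group record.
struct SbtLookup {
    uint32_t instanceOffset;  // OptixInstance::sbtOffset; 0 when tracing a bare GAS
    uint32_t gasIndex;        // SBT index assigned to the build input inside the GAS
    uint32_t traceStride;     // SBTstride argument of optixTrace
    uint32_t traceOffset;     // SBToffset argument of optixTrace (e.g. the ray type)
};

// Index of the hit group record that a hit resolves to.
inline uint32_t hitGroupRecordIndex(const SbtLookup& s) {
    return s.instanceOffset + s.gasIndex * s.traceStride + s.traceOffset;
}

// True if the index falls inside the hit group section of the SBT.
inline bool isValidRecord(const SbtLookup& s, uint32_t hitgroupRecordCount) {
    return hitGroupRecordIndex(s) < hitgroupRecordCount;
}
```

Running the same geometry through this with and without an instance offset shows quickly whether the single GAS path and the IAS path can land on different records.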
Another thing to check is the optixPipelineSetStackSize() call, since you need to pass the maximum traversable depth.
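As a rough sketch of what that depth should be: a bare GAS has a traversable graph depth of 1, and an IAS over GAS has a depth of 2, so a pipeline that must handle both needs the larger value. The struct and function below are hypothetical helpers, not OptiX API; only the meaning of the depth values comes from the OptiX documentation.

```cpp
#include <algorithm>

// Hypothetical description of the scene layouts one pipeline has to trace.
struct SceneLayouts {
    bool singleGas;              // optixTrace is called directly on a GAS handle
    bool singleLevelInstancing;  // optixTrace is called on an IAS whose instances are GASes
};

// Largest traversable graph depth among the layouts in use.
inline unsigned int maxTraversableGraphDepth(const SceneLayouts& layouts) {
    unsigned int depth = 0;
    if (layouts.singleGas)             depth = std::max(depth, 1u);  // GAS only
    if (layouts.singleLevelInstancing) depth = std::max(depth, 2u);  // IAS -> GAS
    return depth;
}

// The result is what would go in the last argument of optixPipelineSetStackSize():
//   optixPipelineSetStackSize(pipeline, dcFromTraversal, dcFromState,
//                             continuationSize, maxTraversableGraphDepth(layouts));
```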
Thanks for your answer, will try the samples tomorrow.
Everything works correctly when you use a single GAS, and fails for multiple GAS + 1 IAS?
The other way around: I get no hits when I use a single GAS, but when I use GAS + 1 IAS I get the expected hits. Which I guess rules out IAS-related indexing and stack size.
Running the exact same code on the other cards (non RTX) I’m getting the expected hits both in the single GAS and the GAS + 1 IAS case.
I tried setting OPTIX_TRAVERSABLE_GRAPH_FLAG_ALLOW_SINGLE_GAS and setting maxTraversableGraphDepth to 1 in the stack size call, and it works: I get the expected hits. But it’s not feasible to rebuild my pipeline every time I go from a single GAS to multiple GAS + 1 IAS and vice versa.
Anyway I’m guessing this is not intended behaviour given the word ALLOW in “OPTIX_TRAVERSABLE_GRAPH_FLAG_ALLOW_SINGLE_LEVEL_INSTANCING”?
Is there any guidance anywhere on how to run the samples without installing the OptiX/CUDA toolkit? The RTX machines are not my dev machines. I built the MinSizeRel configuration and copied over “nvrtc64_111_0.dll” and “cudart64_110.dll” together with everything in the build folder, but it doesn’t seem to be enough to get the samples to run.
Worth mentioning: I’m remoting in to these RTX machines through Windows “Remote Desktop Connection” and they don’t actually have a screen connected to them. I’m not sure whether that has any impact on OptiX, but it might affect the glad/glfw part of the samples. It has already proved troublesome with the NVIDIA control panel.
It is intended behavior that you need to specify separately which use cases you allow. The wording wasn’t meant to suggest that single GAS is a subset of single-level instancing; they need to be enabled separately. But they can both be enabled at the same time if you want to support both in the same pipeline.
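A small sketch of enabling both in one pipeline, so you don’t have to rebuild it when switching between a single GAS and IAS + GAS. To keep the snippet compilable without the SDK, the two constants are stand-ins in a `sketch` namespace that mirror the values of the OptixTraversableGraphFlags enum in the OptiX 7 headers; in real code you would include <optix.h> and use the SDK enumerators directly. The helper function is hypothetical.

```cpp
namespace sketch {

// Stand-ins mirroring OptixTraversableGraphFlags from the OptiX 7 headers.
constexpr unsigned int OPTIX_TRAVERSABLE_GRAPH_FLAG_ALLOW_SINGLE_GAS              = 1u << 0;
constexpr unsigned int OPTIX_TRAVERSABLE_GRAPH_FLAG_ALLOW_SINGLE_LEVEL_INSTANCING = 1u << 1;

// Combine the use cases the pipeline should accept.
// Note: a result of 0 corresponds to OPTIX_TRAVERSABLE_GRAPH_FLAG_ALLOW_ANY.
inline unsigned int traversableGraphFlags(bool allowSingleGas,
                                          bool allowSingleLevelInstancing) {
    unsigned int flags = 0;
    if (allowSingleGas)             flags |= OPTIX_TRAVERSABLE_GRAPH_FLAG_ALLOW_SINGLE_GAS;
    if (allowSingleLevelInstancing) flags |= OPTIX_TRAVERSABLE_GRAPH_FLAG_ALLOW_SINGLE_LEVEL_INSTANCING;
    return flags;
}

}  // namespace sketch

// With the SDK, the combined value would be assigned before creating modules and the pipeline:
//   pipelineCompileOptions.traversableGraphFlags = sketch::traversableGraphFlags(true, true);
```

With both flags set, the maximum traversable graph depth passed to optixPipelineSetStackSize() would need to cover the deeper case, i.e. 2.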
Is there any guidance anywhere on how to run the samples without installing the OptiX/CUDA toolkit?
You will only need the OptiX SDK and CUDA toolkit to compile OptiX programs. To run them on a different machine, the only requirement is that the machine has the required driver. You don’t need either the OptiX SDK or the CUDA toolkit to run an OptiX program – unless you are using nvcc or nvrtc to compile your shader programs at run time. To avoid that, you can configure your build process to pre-compile your OptiX programs, and bundle the PTX with your application before distributing it to non-dev machines.
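As a sketch of the pre-compiled route: compile the device code to PTX on the dev machine (for example with `nvcc -ptx`, pointing the include path at the OptiX SDK headers), ship the .ptx file next to the executable, and read it from disk at startup instead of invoking nvrtc. The function name and the file name in the comment are placeholders, not anything the SDK provides.

```cpp
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>

// Read a pre-compiled PTX file into memory. Throws if the file cannot be opened.
inline std::string readPtxFile(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open PTX file: " + path);
    }
    std::ostringstream contents;
    contents << in.rdbuf();
    return contents.str();
}

// The returned string is what would be handed to module creation, e.g. in OptiX 7.x:
//   std::string ptx = readPtxFile("shaders.ptx");  // placeholder file name
//   optixModuleCreateFromPTX(context, &moduleCompileOptions, &pipelineCompileOptions,
//                            ptx.c_str(), ptx.size(), log, &logSize, &module);
```

Loading PTX this way means the target machine needs only the driver, so the nvrtc DLL does not have to be copied over.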