OptiX 6 performance for (near) empty scenes


I am currently trying to optimize my OptiX 6.5 application and noticed that launches for completely empty scenes (only an empty group as top object) take a fairly long time.
I then modified optixTutorial0 to measure performance on an empty scene, and got the following numbers on my end:

Resolution: 2880 x 1600 (Vive Pro eye res)
Scene/Shader/Raygen: Basic sample tutorial 0 shader, empty scene, background color as miss.
GPU: GeForce 1080
CPU: Intel Xeon E5-1650 v3 @ 3.5 GHz, 6 cores

With this system I get an approximate launch time of 2ms, which seems fairly high to me for an empty scene.
Is this the expected performance, and is there a way to improve that?

Additionally, I need to trace twice (once for each eye). I was doing two consecutive launches before, but tried optimizing this by doing a single launch and calling rtTrace twice in the same raygen program. At approx. 3 ms, this seems to be slightly faster than two consecutive launches. Are there other optimizations I could look at to speed this up further? A 3 ms base pass is still fairly high for me.

Thanks a lot in advance,

Hi David,

Being a somewhat high-level API, OptiX 6.5 does a variety of things during your launches that might not be immediately obvious. The first launch will compile all your shaders, and subsequent launches can also compile shaders if any changes trigger recompilation. Each launch will also reconcile all scene graph changes and perform any memory allocations that are needed. Different combinations of host-to-device and device-to-host memory transfers can occur during an OptiX 6 launch, and changes to your rtVariables can trigger larger memory transfers than you might initially expect.

When measuring launch times, I recommend explicitly recording multiple identical launches and comparing the subsequent launch times to the first one. Otherwise you may be recording overheads that you didn’t know about or expect but that are required for launch, such as shader compilation. It’s very common for the first launch to seem large, and for subsequent launches to be small and closer to what you were expecting.
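As a rough illustration of that measurement approach, here is a minimal C++ sketch that times several identical launches and reports the first one separately from the rest. The names timeLaunches, summarize and LaunchStats are made up for this example and are not part of OptiX. The `launch` callable is a stand-in for whatever performs one launch in your application (for instance a wrapper around `context->launch(0, width, height)`), and the sketch assumes it only returns once the GPU work has finished; if that is not the case in your setup, add the appropriate synchronization inside the callable.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Wall-clock time of each call to `launch`, in milliseconds.
// `launch` is a hypothetical stand-in for one blocking launch of your
// application; it is assumed to return only after the work has completed.
std::vector<double> timeLaunches(const std::function<void()>& launch,
                                 std::size_t count)
{
    assert(launch && "launch callable must not be empty");
    using Clock = std::chrono::steady_clock;

    std::vector<double> ms;
    ms.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        const auto start = Clock::now();
        launch();
        const auto stop = Clock::now();
        ms.push_back(
            std::chrono::duration<double, std::milli>(stop - start).count());
    }
    return ms;
}

// First launch reported on its own, since it may include one-time work
// such as shader compilation; the remaining launches are averaged.
struct LaunchStats {
    double firstMs;
    double steadyMeanMs;
};

LaunchStats summarize(const std::vector<double>& ms)
{
    LaunchStats stats{0.0, 0.0};
    if (ms.empty()) {
        return stats;
    }
    stats.firstMs = ms.front();
    if (ms.size() > 1) {
        const double sum = std::accumulate(ms.begin() + 1, ms.end(), 0.0);
        stats.steadyMeanMs = sum / static_cast<double>(ms.size() - 1);
    }
    return stats;
}
```

You would call timeLaunches with, say, 20 launches of the unchanged scene and compare firstMs against steadyMeanMs. If the two differ a lot, the number you measured initially was probably dominated by one-time setup rather than by the trace itself.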

Is this a reasonable time for you to consider OptiX 7’s lower-level approach? One of the reasons we changed the OptiX API so dramatically is that many people were, just like you, surprised at some of the launch times, because OptiX 6 does a lot of hidden, implicit work under the hood. With OptiX 7, you have more explicit control over compilation, memory allocation, and memory transfer, so there are not only fewer surprises, but you can also generally optimize your code more easily by matching it more closely to what your application is doing. By all accounts so far, once you get over the learning curve in OptiX 7, optimizing becomes easier.



Thanks a lot for the quick answer. I figured that the lower-level API of OptiX 7 would give finer control, but sadly I cannot transition to OptiX 7 fully for other reasons.
I’ll just try and optimize it as best as possible and scale down the resolution in the worst case.

Best regards,