Is it possible to build an acceleration structure in a separate thread while rendering in real time at the same time? I want to add lots of new vertices to a scene while rendering at high framerates.
Yes, you can build an acceleration structure and render at the same time, in two different CUDA streams. The catch is that you must double-buffer your acceleration structure. You cannot modify an in-progress render launch, but you can build an acceleration structure for the next render launch while the current one is still running. This means you’ll have two copies of the acceleration structure: one for rendering, and one receiving the new vertices to be used next frame. Once your current frame finishes, you swap the two acceleration structures and use the newly built one for rendering. Then you can begin rebuilding the old acceleration structure with even more vertices. Does that make sense?
Thank you for answering @dhart !
What you explained does make sense.
How would you go about setting up two CUDA streams? Is it possible to use C++ threads instead?
You create another CUDA stream exactly like the one you’re already using for the other asynchronous OptiX calls which take a CUDA stream argument.
Means like this with the CUDA runtime API https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__STREAM.html
and like this with the CUDA driver API https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__STREAM.html
You want the non-blocking behavior for these streams.
Do not use the default stream 0. That has different synchronization behavior.
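A minimal sketch with the CUDA runtime API (error handling omitted; `renderStream` and `buildStream` are just illustrative names):

```cpp
#include <cuda_runtime.h>

int main() {
    cudaStream_t renderStream = nullptr;
    cudaStream_t buildStream  = nullptr;

    // cudaStreamNonBlocking: these streams do not implicitly
    // synchronize with the legacy default stream 0.
    cudaStreamCreateWithFlags(&renderStream, cudaStreamNonBlocking);
    cudaStreamCreateWithFlags(&buildStream,  cudaStreamNonBlocking);

    // ... optixLaunch(..., renderStream, ...) in the render loop,
    // ... optixAccelBuild(..., buildStream, ...) for the next frame's AS.

    cudaStreamDestroy(renderStream);
    cudaStreamDestroy(buildStream);
    return 0;
}
```

This needs a CUDA-capable device at runtime; in production code you would check the `cudaError_t` return value of each call.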
You can use C++ threads on top of that: call optixLaunch with the rendering CUDA stream in your (main) rendering thread, and have a separate worker thread call optixAccelBuild on the other CUDA stream in parallel, writing into the other buffer which receives the new acceleration structure.
Since all OptiX calls taking a CUDA stream argument are asynchronous, you need to apply the correct stream synchronization calls to make sure each stream has finished accessing the buffer memory you want to exchange.
Mind that this will still compete over GPU resources, so while the two separate kernels will be launched asynchronously and be handled by the GPU as quickly as possible, there is still only one GPU doing all the work. YMMV.
Swapping the acceleration structures before your next optixLaunch means setting the newly created traversable handle inside the launch parameters.
Also note that CUDA allocation calls are synchronous. That means if you’re constantly changing the size of the acceleration structure when adding or removing geometry, make sure you manage your memory allocations in a way that avoids unnecessary synchronizations.
Thank you, that clears things up for me.
I was not aware before that the acceleration structure was built on the GPU.