Dynamic mesh feature request

I’m building an application in which I procedurally generate the mesh depending on the camera's distance to the parts of the mesh. After experimenting with the Bottom-AS builders, I’ve noticed that building them on the GPU causes frames to drop, and the application becomes unresponsive depending on the size and number of mesh parts being built.

I’m wondering whether the problem is caused by my code, or whether it really takes that long to build the structures?

For reference the mesh parts that are being built are something like this:

64 pieces (All of them are supposed to have different geometries)

  • 50x50 vertices and tex coords
  • 400-500 triangles (depends on the state)
  • 250x250 texture

The lag spike lasts around 1.0 to 1.5 seconds on a 2080 Ti. I’ve also observed that it drops to 0.4 to 0.8 seconds when building only 3 pieces.

During my first pass I create the acceleration structure like this:

optix::Acceleration acc = context->createAcceleration("Trbvh");

If a part is deleted I keep the acceleration structure but I update the information when a new piece is generated and mark it dirty like this:


So my feature request would be the ability to build the acceleration structure in an independent builder, like a CPU builder. The way it would be used is like this:

  1. The program passes the triangle information to a worker and checks on the worker when convenient (i.e. between render passes).
  2. The worker uses the triangle information to build the Bottom-AS on the CPU (preferably on another thread) and terminates when done.
  3. Once the program sees that the worker is done, it takes the raw Bottom-AS (perhaps byte code or a handle to byte code) and pushes it to the GPU acceleration structure, eliminating the need to build on the GPU.
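
The three steps above could be sketched in plain C++ roughly as follows. This is only an illustration of the requested workflow, not existing OptiX functionality: `BuiltAS`, `buildBottomASOnCPU` and `ASWorker` are hypothetical names, and the "build" here just produces an opaque blob so the polling pattern can be shown.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <future>
#include <utility>
#include <vector>

// Hypothetical types: these are placeholders for the requested feature,
// not part of the OptiX API.
struct Triangle { float v0[3], v1[3], v2[3]; };
struct BuiltAS  { std::vector<std::uint8_t> bytes; };

// Stand-in for the expensive CPU build; it only sizes an opaque blob.
BuiltAS buildBottomASOnCPU(std::vector<Triangle> tris) {
    BuiltAS as;
    as.bytes.resize(tris.size() * sizeof(Triangle));
    return as;
}

class ASWorker {
public:
    // Step 1: hand the triangle data to a worker thread (step 2 runs there).
    void submit(std::vector<Triangle> tris) {
        assert(!result_.valid() && "previous build has not been collected yet");
        result_ = std::async(std::launch::async, buildBottomASOnCPU, std::move(tris));
    }

    // Step 3: non-blocking check, meant to be called between render passes.
    // Returns true and moves the result into `out` once the build has finished.
    bool tryCollect(BuiltAS& out) {
        if (!result_.valid()) return false;  // nothing submitted
        if (result_.wait_for(std::chrono::seconds(0)) != std::future_status::ready)
            return false;                    // still building
        out = result_.get();                 // leaves the future invalid again
        return true;
    }

private:
    std::future<BuiltAS> result_;
};
```

The render loop would call `tryCollect` once per frame and only upload the result when it returns true, so the frame never waits on the build.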


Please add the following usage report code to your application and check the output.

I don’t expect that just building the acceleration structures for a few hundred primitives takes that long.

Most likely the problem is that you’re generating additional variables which will require recompiles of the device programs.

If you can create all variables and buffers up front at the necessary (maximum) size, and afterwards only exchange the geometry data inside the buffers and set the new primitive count (no need to re-create the buffer!), then you should be left with just the acceleration structure build time.
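
A minimal sketch of that pattern, written in plain C++ so it stands on its own. `MeshPartSlot` is a hypothetical stand-in, not an OptiX type, and it assumes unindexed triangles (three vertices each) for brevity. In an OptiX 6 application the equivalent steps would presumably be mapping the existing buffer, writing the new data, setting the primitive count on the geometry and marking the acceleration structure dirty.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Float3 { float x, y, z; };

// Hypothetical stand-in for one mesh part with storage allocated once.
class MeshPartSlot {
public:
    // Allocate for the maximum triangle count this part can ever have.
    explicit MeshPartSlot(std::size_t maxTriangles)
        : vertices_(maxTriangles * 3), primitiveCount_(0), dirty_(false) {}

    // Overwrite the geometry in place; the storage is never re-created.
    void update(const std::vector<Float3>& triangleVertices) {
        assert(triangleVertices.size() % 3 == 0);
        assert(triangleVertices.size() <= vertices_.size());
        std::copy(triangleVertices.begin(), triangleVertices.end(), vertices_.begin());
        primitiveCount_ = triangleVertices.size() / 3;  // new primitive count
        dirty_ = true;                                  // AS needs a rebuild
    }

    std::size_t   primitiveCount() const { return primitiveCount_; }
    std::size_t   capacity()       const { return vertices_.size() / 3; }
    bool          needsRebuild()   const { return dirty_; }
    const Float3* data()           const { return vertices_.data(); }

private:
    std::vector<Float3> vertices_;
    std::size_t         primitiveCount_;
    bool                dirty_;
};
```

With 400 to 500 triangles per part, each slot would be created once for 500 triangles and only `update` would be called when a part is regenerated.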

If this is on an RTX 2080Ti, are you using OptiX 6.0.0 and GeometryTriangles nodes?

Please read this thread as well: