Optix Dynamic Geometry Unstable Rendering FPS

Dear all,

I’m trying to write an OptiX closest-hit program with dynamic geometry, loading new meshes for each frame. I noticed that the rendering frame rate is not stable during geometry updates, ranging from 1000 FPS down to 200 FPS.

The updateAccel code is quite similar to the one provided in the OptiX SDK.

void SampleRenderer::updateAccel()
  {
    const size_t numMeshes = model->meshes.size();

    for (size_t meshID = 0; meshID < numMeshes; meshID++)
    {
      if (model->needUpdate[meshID]) {
        TriangleMesh& mesh = *model->meshes[meshID];
        vertexBuffer[meshID].upload((const vec3f*)mesh.vertex.data(), mesh.vertex.size());

        //std::cout << "Upload mesh: " << mesh.vertex[0] << std::endl;
        model->needUpdate[meshID] = false;        
      }
    }

    OptixAccelBuildOptions accelOptions = {};
    accelOptions.buildFlags = OPTIX_BUILD_FLAG_ALLOW_COMPACTION | OPTIX_BUILD_FLAG_ALLOW_UPDATE | OPTIX_BUILD_FLAG_ALLOW_RANDOM_VERTEX_ACCESS;
    accelOptions.operation = OPTIX_BUILD_OPERATION_UPDATE;

    OPTIX_CHECK(optixAccelBuild(optixContext,
      stream,
      &accelOptions,
      triangleInput.data(),
      (int)numMeshes,
      tempBuffer.d_pointer(),
      tempBuffer.sizeInBytes,
      outputBuffer.d_pointer(),
      outputBuffer.sizeInBytes,
      &launchParams.traversable,
      nullptr,
      0
    ));

    CUDA_SYNC_CHECK();
  }

Any ideas on how to identify what’s wrong and solve the problem?

Thanks,
Ree

Hi @walnut-Ree, welcome!

The most likely cause here is a degraded acceleration structure. This can happen when using the UPDATE operation when things move too far from their original position (meaning the position used for the last BUILD operation). The OptiX Programming Guide mentions a few different ways the accel structure can degrade during UPDATEs: https://raytracing-docs.nvidia.com/optix7/guide/index.html#acceleration_structures#dynamic-updates

How are you timing your render phase? Specifically is it synchronous or asynchronous, and are you using CUDA events or a host-side timer? These framerates are still high enough, even when it slows down, that it could be worth double-checking that your timing code is correctly implicating the render phase and not something else.

BTW we’ve discussed a few different strategies for mixing BUILD and UPDATE operations in a GTC spring talk from 2021, here’s a link: OptiX Advanced Topics | NVIDIA On-Demand

One strategy is to re-BUILD your accel structure after every few UPDATEs. Another strategy is to pick the pose of your mesh very carefully when you BUILD so that it is less prone to degrading when you UPDATE. This can sometimes be achieved by doing the BUILD operation in your mesh’s most extended pose or shape, but this depends a lot on how coherent the motion of your mesh is over time. Other strategies might be to double-buffer your mesh builds, or to limit the amount of motion to a range that suffers less.
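As a rough illustration of the first strategy (a full re-BUILD after every few UPDATEs), here is a minimal sketch of the per-frame decision. The names `AccelOp` and `chooseAccelOp` and the idea of a fixed `rebuildInterval` are assumptions made for this example, not part of the OptiX API; the enum only stands in for the real build operation so the snippet compiles without the OptiX headers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the OptiX build operation (BUILD vs. UPDATE),
// so this sketch does not depend on the OptiX headers.
enum class AccelOp { Build, Update };

// Policy sketch: do a full build on frame 0 and on every
// `rebuildInterval`-th frame after that; refit (update) on all other frames.
inline AccelOp chooseAccelOp(std::uint64_t frameIndex, std::uint64_t rebuildInterval)
{
    assert(rebuildInterval > 0 && "rebuildInterval must be positive");
    return (frameIndex % rebuildInterval == 0) ? AccelOp::Build : AccelOp::Update;
}
```

In updateAccel() you would then set accelOptions.operation to OPTIX_BUILD_OPERATION_BUILD or OPTIX_BUILD_OPERATION_UPDATE based on the result. Keep in mind that a BUILD can need a larger temp buffer than an UPDATE (optixAccelComputeMemoryUsage reports both tempSizeInBytes and tempUpdateSizeInBytes), and that a good interval depends on how fast your mesh deforms, so it is something to tune by measurement.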

I hope that helps!


David.

Hi David,

Thanks for your patience. For timing, I follow the optixDynamicGeometry example in the OptiX SDK and use the chrono library to measure CPU time after synchronization. The render phase mainly consists of two operations: optixLaunch and CUDA_SYNC_CHECK. The timing code looks like this:

while (!glfwWindowShouldClose(handle)) {
      // calc deform_time
      auto t0 = std::chrono::steady_clock::now();
      deform();                                      // modifying vertex positions.
      auto t1 = std::chrono::steady_clock::now();
      deform_time += (t1-t0);

      // calc render_time.
      t0 = std::chrono::steady_clock::now();
      render();                                      // calling optixLaunch && CUDA_SYNC_CHECK()
      t1 = std::chrono::steady_clock::now();
      render_time += (t1-t0);

      // calc display time.
      t0 = std::chrono::steady_clock::now();
      draw();                                      // copy to CPU here.
      t1 = std::chrono::steady_clock::now();
      display_time += (t1-t0);

      // calc gl_buffer_time 
      t0 = std::chrono::steady_clock::now();
      glfwPollEvents();
      glfwSwapBuffers(handle);
      t1 = std::chrono::steady_clock::now();
      gl_buffer_time += (t1 - t0);

}

Is that correct? Visually, it looks okay, maybe because the mesh is relatively simple (2763 verts and 14924 faces).

Best,
Ree

This (using a host-side timer) is an okay way to time things, but it’s more accurate and less error prone to insert CUDA stream events, like so:

    cudaEvent_t  start_event;
    cudaEvent_t  end_event;
    float        time_in_ms = 0.0f;
    cudaEventCreate( &start_event );
    cudaEventCreate( &end_event );
    cudaEventRecord( start_event, 0 /* stream */ );
    // ... <launch a kernel> ...
    cudaEventRecord( end_event, 0 /* stream */ );
    cudaEventSynchronize( end_event );   // block the host until end_event has been reached
    cudaEventElapsedTime( &time_in_ms, start_event, end_event );
    cudaEventDestroy( start_event );
    cudaEventDestroy( end_event );

Your comment mentions that the render is synchronized before marking your timer’s t1 end event. The thing to double-check is whether you also synchronize after your deform() kernel. Otherwise the t1 for deform() and the t0 belonging to render() could be set before your deform() kernel has finished running, and the remaining deform time would then be attributed to the render phase.
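If you stay with a host-side timer, one way to make the synchronization hard to forget is to wrap each phase in a small helper that always synchronizes before reading the end time. This is only a sketch: `timePhaseMs` is a hypothetical helper name, and the `sync` callback is an assumption standing in for whatever synchronization your app uses (for example a lambda calling cudaStreamSynchronize on your stream).

```cpp
#include <chrono>
#include <functional>

using Clock = std::chrono::steady_clock;

// Runs `work`, then `sync`, and returns the elapsed wall-clock time in
// milliseconds. Because `sync` runs before the end timestamp is taken,
// asynchronous work started by `work` is counted in this phase rather
// than leaking into the next one.
inline double timePhaseMs(const std::function<void()>& work,
                          const std::function<void()>& sync)
{
    const auto t0 = Clock::now();
    work();   // e.g. launch the deform kernel (may return immediately)
    sync();   // e.g. wait for the stream to finish (assumed to block)
    const auto t1 = Clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}
```

You would call this once per phase, e.g. timePhaseMs([&]{ deform(); }, syncFn) and timePhaseMs([&]{ render(); }, syncFn), and accumulate the results into separate counters. The extra synchronization can cost some throughput, so you may want it only while profiling.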


David.