Handling Large(?) Indexed Triangular Mesh, Best Practice

Greetings,

I am dealing with a problem in which I have to find the closest hits of 100M rays against a static scene consisting of 80M indexed triangles (I will not do any shading or launch new rays from the hits). The rays are coherent, share the same starting point, and typically see no more than 150x60 quadrilaterals. My ray source will translate and rotate over time. My initial plan, before using OptiX, was to generate BVHs offline once, one per block of 50x30 quads, and load the 3x3 roots around my ray source.

My questions are about how to handle a large indexed triangle mesh:

(1) Should I leave everything (80M triangles) to RTgeometrytriangles and not bother with loading parts of the scene based on the ray source position? (This will run on a Titan V, and my estimates indicate a memory requirement of around 1.5GB for the BVH.)

(2) If I store and reload OptiX-generated acceleration structures, will they get flagged as “dirty”? I am OK with loading 3 x 3 acceleration structures that are not marked dirty (each covering 50x30 quads) and building the acceleration structure over those 9 on the fly.

(3) To change the position and attitude of the ray source, or to use a loading scheme like the one in (2), do I need to destroy and re-create the context for each change?

Sorry if I am asking trivial things. I am learning OptiX from scratch and have only read up to 3.11 of the guide so far.

Best Regards,

1.) Normally yes, but the acceleration structures need much more memory than your estimate.

My old and very coarse rule of thumb is 1 GB per 10 MTriangles after the build, most likely less in OptiX 6.0.0. That means I expect the 80 MTriangles to fail as a single Geometry on a Titan V with 12 GB, because the acceleration structure builder needs even more temporary memory during the build.
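As a quick sanity check of that rule of thumb, here is a minimal C++ sketch of the arithmetic. The function names are hypothetical, the constant is only the coarse figure quoted above, and the temporary memory the builder needs during the build is deliberately not modelled because no number for it is given here.

```cpp
#include <cstdint>

// Coarse rule of thumb from the post: about 1 GB of acceleration structure
// per 10 million triangles after the build (likely less in OptiX 6.0.0).
constexpr double kGbPerTenMTriangles = 1.0;

// Estimated size in GB of the finished acceleration structure.
inline double estimate_accel_gb(std::uint64_t triangle_count) {
    return kGbPerTenMTriangles * static_cast<double>(triangle_count) / 10.0e6;
}

// Hypothetical helper: true if the finished structure alone would fit.
// Build-time scratch memory is NOT included, so a "true" here is optimistic.
inline bool fits_after_build(std::uint64_t triangle_count, double device_gb) {
    return estimate_accel_gb(triangle_count) <= device_gb;
}
```

For 80 MTriangles this gives about 8 GB, far above the 1.5 GB estimate, and it leaves little headroom on a 12 GB board once the build-time scratch memory, vertex and index buffers, and output buffers are added.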

To overcome that memory limit the recommended approach is to split such huge geometry into smaller GeometryGroups to get multiple bottom level acceleration structures and put these under a top-level Group node.
For example, if you can group them spatially in a 2x2x2 grid, you would have 10 MTriangles per GeometryGroup on average, or about 3 MTriangles on average in a 3x3x3 grid.
If the objects are opaque, that should also speed up the traversal, because the farther BVHs aren’t visited at all, regardless of the angle from which you look at the mesh. Just experiment with it.
See this thread about the same question, just with 145 MTriangles.
https://devtalk.nvidia.com/default/topic/1045277/optix/memory-usage-in-multi-gpu-system-nvlink-linux/post/5305407
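The spatial grouping suggested above could be prepared on the host along these lines. This is only a sketch under assumptions: `Float3`, `cell_of` and `split_by_grid` are hypothetical names, triangles are binned by centroid into a uniform grid over the mesh bounding box, and creating the actual GeometryGroups and the top-level Group from the resulting index buffers is not shown.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Float3 { float x, y, z; };

// Maps a coordinate to a cell index in [0, n-1] along one axis.
inline int cell_of(float v, float lo, float hi, int n) {
    if (hi <= lo) return 0;  // degenerate extent: everything goes to cell 0
    int c = static_cast<int>((v - lo) / (hi - lo) * static_cast<float>(n));
    return std::min(std::max(c, 0), n - 1);  // clamp so v == hi lands in the last cell
}

// Splits an indexed triangle list into nx*ny*nz index buffers by triangle
// centroid. Every buffer still refers to the one shared vertex array.
inline std::vector<std::vector<std::uint32_t>> split_by_grid(
    const std::vector<Float3>& vertices,
    const std::vector<std::uint32_t>& indices,  // 3 entries per triangle
    int nx, int ny, int nz)
{
    std::vector<std::vector<std::uint32_t>> cells(
        static_cast<std::size_t>(nx) * ny * nz);
    if (vertices.empty()) return cells;

    // Bounding box of the whole mesh.
    Float3 lo = vertices[0], hi = vertices[0];
    for (const Float3& v : vertices) {
        lo.x = std::min(lo.x, v.x); hi.x = std::max(hi.x, v.x);
        lo.y = std::min(lo.y, v.y); hi.y = std::max(hi.y, v.y);
        lo.z = std::min(lo.z, v.z); hi.z = std::max(hi.z, v.z);
    }

    for (std::size_t t = 0; t + 2 < indices.size(); t += 3) {
        const Float3& a = vertices[indices[t]];
        const Float3& b = vertices[indices[t + 1]];
        const Float3& c = vertices[indices[t + 2]];
        const int ix = cell_of((a.x + b.x + c.x) / 3.0f, lo.x, hi.x, nx);
        const int iy = cell_of((a.y + b.y + c.y) / 3.0f, lo.y, hi.y, ny);
        const int iz = cell_of((a.z + b.z + c.z) / 3.0f, lo.z, hi.z, nz);
        std::vector<std::uint32_t>& out =
            cells[(static_cast<std::size_t>(iz) * ny + iy) * nx + ix];
        out.push_back(indices[t]);
        out.push_back(indices[t + 1]);
        out.push_back(indices[t + 2]);
    }
    return cells;
}
```

A uniform grid only yields the average counts mentioned above when the triangles are spread evenly. For an uneven mesh it is worth checking the size of each cell and subdividing the heavy ones.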

2.) Not possible. There is no functionality to serialize acceleration structures in OptiX (anymore). That’s unnecessary because building the acceleration structure is faster than loading it from disk.

3.) There is no need to re-create the OptiX context for scene changes. You shouldn’t need that when doing 1.

If you’re starting with OptiX, please also have a look at the OptiX Introduction presentation video and open source examples on github.
All necessary links here: https://devtalk.nvidia.com/default/topic/998546/optix/optix-advanced-samples-on-github/
These examples generate their geometry at runtime, and you can easily increase the complexity of the meshes for your own tests.

When running under Windows, be careful about the amount of work you’re doing per launch. There is this infamous Timeout Detection and Recovery (TDR) mechanism which kicks in after 2 seconds spent in a kernel driver.
If you planned to shoot the 100 MRays at once, check how long it takes with smaller sizes as well. E.g. if a launch takes longer than a second, do less work more often. Though your use case should easily work on a Volta and a lot faster on the new RTX boards, which actually reach the advertised 10 GRays/s with primary rays.
Do not measure the very first launch, which builds the acceleration structures and compiles the kernel. It’s recommended to do a dummy launch with zero dimensions to trigger that separately once everything has been set up.
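The "do less work more often" advice can be sketched on the host as a simple split of the ray count into batches. `LaunchChunk` and `split_launch` are hypothetical names, and the batch size is something to tune by timing. Each chunk would correspond to one launch, and how the ray offset reaches the ray generation program (for example through a context variable) is an assumption left to the application.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// One batch of rays to be traced in a single launch.
struct LaunchChunk {
    std::uint64_t first_ray;  // index of the first ray in this batch
    std::uint64_t ray_count;  // number of rays in this batch
};

// Splits total_rays into consecutive batches of at most max_rays_per_launch.
// Returns an empty list if either argument is zero.
inline std::vector<LaunchChunk> split_launch(std::uint64_t total_rays,
                                             std::uint64_t max_rays_per_launch) {
    std::vector<LaunchChunk> chunks;
    if (max_rays_per_launch == 0) return chunks;
    for (std::uint64_t first = 0; first < total_rays; first += max_rays_per_launch) {
        chunks.push_back({first, std::min(max_rays_per_launch, total_rays - first)});
    }
    return chunks;
}
```

With 100 MRays and a hypothetical limit of 16 MRays per launch this produces seven batches, the last one holding the remaining 4 MRays. When timing them, the warm-up launch mentioned above should be kept out of the measurement.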

Thank you very much for your prompt reply, and also for the “An Introduction to NVIDIA OptiX” webinar, which explains the examples one by one.
I plan to post my findings here in case they help someone in the future.