Unending memory allocation with Trbvh on launch in OptiX 3.6.0

My scene graph consists of a top-level group with Trbvh acceleration, with several thousand heavily instanced triangle meshes under transform nodes, each with an individual Bvh/Bvh acceleration (most of the meshes have 16-bit indices, which don’t seem to play well with most of the other builder types). The total is around 6.7 million triangles.

In OptiX 3.5.1, this configuration successfully compiled, launched, and rendered. After updating from OptiX 3.5.1 to OptiX 3.6.0 and from CUDA 5.5 to 6.0, however, this setup stalls on the call to launch while continually allocating memory. At around 4.5 GB, according to Task Manager, it throws an exception for failing to allocate memory.

Windows 7, 64 bit, GTX 650, Driver 337.88.

Could you retry this with 3.6.3? We fixed many allocation and Trbvh errors, probably including this one.

You’ll also probably be better off paying the extra 36MB to use 32-bit indices so that you can use Trbvh everywhere. Alternatively, you could leave the vertex buffer name and index buffer name unspecified, so that the builder runs your bounding box program instead. You can then keep your 16-bit indices while still using Trbvh. You just won’t get splits that are as good, because the builder is splitting AABBs instead of triangles.
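For the first option, widening the indices on the host before uploading them is straightforward. This is only a minimal sketch: the function names are made up for illustration and are not part of OptiX, and it assumes each mesh keeps its own index list.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an OptiX function): copy a 16-bit index list
// into a 32-bit one, value for value. Triangle order is unchanged.
std::vector<std::uint32_t> widenIndices(const std::vector<std::uint16_t>& indices16)
{
    return std::vector<std::uint32_t>(indices16.begin(), indices16.end());
}

// Hypothetical helper: additional bytes the widened index list needs,
// i.e. two extra bytes per index.
std::size_t extraBytesAfterWidening(std::size_t indexCount)
{
    return indexCount * (sizeof(std::uint32_t) - sizeof(std::uint16_t));
}
```

The widened vector would then be copied into an unsigned 32-bit index buffer instead of the 16-bit one. Instanced meshes only pay the extra cost once per unique mesh, not once per instance.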

And the biggest thing: try to keep the node graph as small as possible. Flatten as much as you can into a single geometry group.
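Flattening here means baking each instance’s transform into its vertices and merging everything into one vertex/index buffer pair. A rough host-side sketch follows. All type and function names are hypothetical, and the 3x4 row-major matrix layout is an assumption for this example rather than anything OptiX prescribes.

```cpp
#include <array>
#include <cstdint>
#include <vector>

struct Float3 { float x, y, z; };

// One unique mesh with 16-bit indices, as in the original scene.
struct Mesh {
    std::vector<Float3> vertices;
    std::vector<std::uint16_t> indices;
};

// One placement of a mesh. xform is a row-major 3x4 affine matrix
// (assumed layout for this sketch).
struct Instance {
    const Mesh* mesh;
    std::array<float, 12> xform;
};

// Merged result: one vertex list and one 32-bit index list.
struct FlatMesh {
    std::vector<Float3> vertices;
    std::vector<std::uint32_t> indices;
};

// Apply the affine transform to a point.
Float3 transformPoint(const std::array<float, 12>& m, const Float3& p)
{
    Float3 r;
    r.x = m[0] * p.x + m[1] * p.y + m[2]  * p.z + m[3];
    r.y = m[4] * p.x + m[5] * p.y + m[6]  * p.z + m[7];
    r.z = m[8] * p.x + m[9] * p.y + m[10] * p.z + m[11];
    return r;
}

// Bake every instance into world space and append it to one mesh.
FlatMesh flatten(const std::vector<Instance>& instances)
{
    FlatMesh out;
    for (const Instance& inst : instances) {
        // Indices of this instance are offset by the vertices already emitted.
        const std::uint32_t base = static_cast<std::uint32_t>(out.vertices.size());
        for (const Float3& v : inst.mesh->vertices)
            out.vertices.push_back(transformPoint(inst.xform, v));
        for (std::uint16_t i : inst.mesh->indices)
            out.indices.push_back(base + i);
    }
    return out;
}
```

Keep in mind that flattening duplicates the geometry of every instance, so with heavy instancing it trades memory for a smaller node graph. The merged index list is 32-bit because the combined vertex count can exceed what 16 bits can address.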


I am using the OptiX 3.7.0 stable version and I have the same problem with Trbvh. My scene has a selector at the top level and two geometry groups below it. Each geometry group uses the Trbvh acceleration builder. One of them has small geometry and causes no problem. The other group contains larger geometry: for example, 5000 models with 400 triangles each. I noticed that if the number of triangles exceeds ~2.6 million, I get this exception on a GTX 980 (4 GB of memory):

Invalid context (Details: Function “_rtContextLaunch2D” caught exception: Encountered a CUDA error: [15466853] returned (4): Deinitialized, [15270150])
Sometimes the driver crashes instead of just throwing an exception.

I also tested it on a Quadro 2000 (1 GB of memory) and the maximum number of triangles decreased proportionally: it was only 650,000. GPU memory is never fully occupied during the acceleration build, only between 1/4 and 1/2 of it.
It looks like GPU memory fragmentation. Is this an inefficiency in the builder, or is this the maximum geometry size it can handle? I also tried the Sbvh builder and didn’t run into this problem, even with much larger geometry.

The Trbvh builder can use a considerable amount of GPU memory during the build process, which is why it can build in chunks.

Please read the OptiX programming guide on the Trbvh builder in chapter “3.5.2. Builders and Traversers” and the “chunk_size” parameter if you run into memory related limits when building the acceleration structure. Also please note the differences for the OptiX commercial license version.
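As a rough illustration, acceleration properties are passed as strings, so the chunk size has to be formatted first. The helper below is hypothetical, and it assumes the property takes a byte count as a decimal string. Check the programming guide section cited above for the valid range and any special values before relying on it.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not part of OptiX): turn a size in megabytes into
// the decimal byte-count string assumed here for the "chunk_size" property.
std::string chunkSizeProperty(std::uint64_t megabytes)
{
    const std::uint64_t bytes = megabytes * 1024ull * 1024ull;
    return std::to_string(bytes);
}

// Intended use with the OptiX 3.x C API (needs the OptiX headers and a valid
// RTacceleration handle, so it is shown as a comment only):
//
//   const std::string value = chunkSizeProperty(256);
//   rtAccelerationSetProperty(accel, "chunk_size", value.c_str());
```

A smaller chunk size should lower the peak memory used during the build, so it is worth trying a few values against the triangle counts that currently fail.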

The upcoming OptiX 3.8.0 release contains bug fixes for the Trbvh builder that are not in the beta yet. Still, if you experience issues with Trbvh in the OptiX 3.6.x or 3.7.0 releases, it’s worth trying the OptiX 3.8.0 beta, and the final release once it is available.

Try the OptiX 3.8 final release. We fixed several memory-related Trbvh and OptiX Prime bugs. I bet this is fixed.