Out of memory recovery


I am using OptiX to load multiple models on the GPU so I can render them alternately without rebuilding the acceleration structure every time. It works great until I try to load a model that does not fit in memory. An out-of-memory exception occurs, which is understandable, but I would like to be able to prevent this error from happening.
I have seen here https://devtalk.nvidia.com/default/topic/1045986/optix/issues-while-loading-big-textures-2-4gb-/ that it is not possible to recover from an out-of-memory exception and that the OptiX context has to be rebuilt from scratch.
Is there a way to prevent this? For example, by estimating the space the acceleration structure will take on the GPU?



Unfortunately, there is not currently a way to do this. We are working on features that will help with this in a future release. To mitigate this problem, you can try breaking your scene into smaller chunks. The out-of-memory problem usually occurs during the Acceleration build. When you build an Acceleration, a chunk of temporary memory proportional to the size of your BVH is allocated. If you have a few medium-sized BVHs instead of one large one, the high-water mark of your GPU memory usage can be significantly lower.
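To illustrate why chunking lowers the high-water mark, here is a rough sketch of a peak-memory model. It is only an illustration: the function name estimatePeakBytes and both factors are assumptions made up for this example, not values OptiX reports. It assumes chunks are built one after another, that a chunk needs buildFactor times its input size while it is being built, and that it settles at residentFactor times its size afterwards.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model, not an OptiX API: estimates the high-water mark when
// chunks are built sequentially. buildFactor and residentFactor are guesses
// that have to be tuned by experiment.
std::uint64_t estimatePeakBytes(const std::vector<std::uint64_t>& chunkBytes,
                                double buildFactor, double residentFactor)
{
    assert(residentFactor > 0.0 && buildFactor >= residentFactor);
    double resident = 0.0;  // memory held by chunks that are already built
    double peak = 0.0;      // highest total seen so far
    for (std::uint64_t bytes : chunkBytes) {
        const double size = static_cast<double>(bytes);
        // While this chunk builds, it temporarily needs buildFactor * size.
        peak = std::max(peak, resident + buildFactor * size);
        // After the build, only residentFactor * size stays allocated.
        resident += residentFactor * size;
    }
    return static_cast<std::uint64_t>(peak);
}
```

Under these assumptions, with buildFactor 3 and residentFactor 1, a single 400 MB mesh peaks at about 1200 MB, while the same data split into four 100 MB chunks peaks at about 600 MB.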



This won’t help you recover, but it might help you avoid the error. Try:

rtAccelerationSetProperty( accel, "compact", "0" );

Compaction is there to reduce memory usage, but it may increase your high watermark temporarily.

By the way, BVH compaction and the "compact" BVH property are new in OptiX 6, so this advice doesn’t apply if you’re using OptiX 5.1 or earlier.

You can find documentation of the BVH build properties in the Programming Guide here: http://raytracing-docs.nvidia.com/optix_6_0/guide_6_0/index.html#host#acceleration-structure-properties


Thanks, I will check this out.
Is it possible to estimate, even very roughly, the space the acceleration structure may use depending on the size of the scene? I ran some tests and the size of the acceleration structure seems to vary between models, but I was hoping I could check the size of the vertices and triangles of the scene before building to determine whether it will fit on the GPU, using getAvailableDeviceMemory.

Currently, I just check that the memory used by the scene on the CPU does not exceed a third of the available memory on the GPU. It works for most models, but I guess it may fail in certain cases.

I’m not sure I understand what you mean about checking the size of the vertices and triangles before building. Are you using the OptiX mesh sample code to load meshes? Normally, you will have the size of the vertices and index buffers before building, since those sizes need to be given to OptiX to create the buffers. If you’re using the OptiX mesh loader, you can get the mesh size information from the OptiXMesh class members, e.g., getNumVertices(), etc.

Are you using RTX hardware and OptiX 6, or some other combination? Are you using the geometryTriangles API for your meshes, or your own custom format & intersection program? You might experiment and see if using RT_GEOMETRY_FLAG_NO_SPLITTING with rtGeometrySetFlags() or rtGeometryTrianglesSetFlagsPerMaterial() makes any difference to your memory usage.

Your one-third rule of thumb makes sense. It is plausible that with compaction enabled (which is on by default), the temporary high watermark of an Acceleration Structure could be roughly 3x the size of the original mesh data: one copy for your data, one for the BVH, and one for compaction. After the build, usually two of those copies can be freed. With compaction disabled, the third compacted copy wouldn’t be allocated in the first place.
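As a minimal sketch of that rule of thumb, the check could look like the following. The function name likelyFits is hypothetical, and freeBytes is assumed to come from whatever free-memory query you already use (for example the getAvailableDeviceMemory call mentioned above). The default factor of 3 is just the rough estimate from this thread, not a guarantee.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of OptiX: returns true if a mesh of
// meshBytes is expected to fit, assuming the build needs about
// safetyFactor * meshBytes of free device memory at its peak.
bool likelyFits(std::uint64_t meshBytes, std::uint64_t freeBytes,
                double safetyFactor = 3.0)
{
    assert(safetyFactor >= 1.0);
    return static_cast<double>(meshBytes) * safetyFactor
           <= static_cast<double>(freeBytes);
}
```

Call it with the combined size in bytes of the vertex and index buffers you are about to hand to OptiX, and skip or defer the load when it returns false. Raising safetyFactor makes the check more conservative.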

Right now we don’t have a more reliable way in OptiX to estimate the BVH size than what you already figured out. A better way to do that is coming, as Keith mentioned. If 3x the size of your mesh isn’t always working, for now I would suggest trying 3.5x or 4x until it’s always working. If you can break your large mesh(es) into smaller pieces as Keith suggested, you’ll have a lot more temporary space to work with, and be able to fit a lot more geometry on your GPU, since you’re entirely limited by the available free space at any given time. One other idea I can offer is to sort your meshes by size, and load from largest to smallest. That might help if you have a wide variety of sizes.
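The largest-first idea can be sketched roughly as below. MeshInfo and planLoadOrder are hypothetical names for this example, and nothing here calls OptiX. The sketch assumes each build needs safetyFactor times the mesh size while it runs, and that roughly one mesh-sized copy stays resident afterwards, following the "two of three copies freed" description above. Both are guesses to tune by experiment.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record: a mesh name plus the byte size of its buffers.
struct MeshInfo {
    std::string name;
    std::uint64_t bytes;
};

// Sorts meshes from largest to smallest and keeps each one whose estimated
// build peak (safetyFactor * bytes) still fits in the remaining budget.
std::vector<MeshInfo> planLoadOrder(std::vector<MeshInfo> meshes,
                                    std::uint64_t freeBytes,
                                    double safetyFactor)
{
    assert(safetyFactor >= 1.0);
    std::sort(meshes.begin(), meshes.end(),
              [](const MeshInfo& a, const MeshInfo& b) { return a.bytes > b.bytes; });

    std::vector<MeshInfo> accepted;
    double remaining = static_cast<double>(freeBytes);
    for (const MeshInfo& mesh : meshes) {
        const double size = static_cast<double>(mesh.bytes);
        if (size * safetyFactor <= remaining) {
            accepted.push_back(mesh);
            remaining -= size;  // assumption: about 1x stays resident after the build
        }
    }
    return accepted;
}
```

Building the big meshes first means they get the largest amount of temporary space, and the small ones fill whatever is left at the end.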

Also, just to be clear, compaction will only cause a higher watermark if you have a very small number of very large meshes. If you only have one large mesh, then disabling compaction is a good idea. If you have more than a few meshes, then compaction is going to reduce your maximum memory usage because it will compact some of your meshes before building the rest. Generally speaking, if you want to save memory, compaction should be enabled. Can you give some specifics on the number and sizes of meshes you’re trying to load, along with your GPU memory size?

Yes, I meant the size in bytes of the vertex and index buffers that I am passing to the OptiX API. Right now, I am using OptiX 5.1.1 because of driver compatibility, but I will move to OptiX 6.0 when that becomes possible.
I don’t have a fixed number of meshes; I am just trying to fit as many as possible.
A single mesh's index and triangle buffers can take up to 300 MB of RAM on the host. I run into problems on a GTX 970 with 4 GB, after loading roughly 10 models of between 10 and 400 MB each.

Ah I see, thank you. Well then I’m offering too much advice. Disabling compaction and the splitting flag I mentioned don’t apply to OptiX 5.1. Stay tuned for updates to the API that might help you. In the meantime, with meshes up to 400MB in size, you might need to keep as much as 1.5GB reserved. So if you can break meshes into smaller chunks, you’ll be able to fit more. With 100MB meshes, you’d only need to reserve 300-400MB, and with 10MB meshes, you could get within 30-40MB of full. So with small meshes you’d probably be able to fit more than 1GB more mesh data than with the large 400MB meshes. Make sure to split meshes spatially, and not arbitrarily or randomly, so the BVH overlap between the chunks is minimized.
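One simple way to split spatially is a median split on triangle centroids along the longest axis, sketched below. This is only one possible approach, not something OptiX provides. The types Vec3 and Triangle and the functions centroidOnAxis and splitSpatially are hypothetical names for this example, and the sketch works on a plain triangle list rather than indexed buffers.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Vec3 = std::array<float, 3>;     // x, y, z
using Triangle = std::array<Vec3, 3>;  // three vertices

// Centroid coordinate of a triangle along one axis (0 = x, 1 = y, 2 = z).
float centroidOnAxis(const Triangle& t, int axis)
{
    return (t[0][axis] + t[1][axis] + t[2][axis]) / 3.0f;
}

// Splits the triangles into two halves at the median centroid along the
// axis where the centroids are most spread out, so each half stays compact.
std::pair<std::vector<Triangle>, std::vector<Triangle>>
splitSpatially(std::vector<Triangle> tris)
{
    assert(tris.size() >= 2);
    int bestAxis = 0;
    float bestExtent = -1.0f;
    for (int axis = 0; axis < 3; ++axis) {
        auto byAxis = [axis](const Triangle& l, const Triangle& r) {
            return centroidOnAxis(l, axis) < centroidOnAxis(r, axis);
        };
        auto mm = std::minmax_element(tris.begin(), tris.end(), byAxis);
        const float extent = centroidOnAxis(*mm.second, axis) - centroidOnAxis(*mm.first, axis);
        if (extent > bestExtent) { bestExtent = extent; bestAxis = axis; }
    }
    auto mid = tris.begin() + static_cast<std::ptrdiff_t>(tris.size() / 2);
    // Partially sort so everything before mid lies on the low side of the median.
    std::nth_element(tris.begin(), mid, tris.end(),
                     [bestAxis](const Triangle& l, const Triangle& r) {
                         return centroidOnAxis(l, bestAxis) < centroidOnAxis(r, bestAxis);
                     });
    return { std::vector<Triangle>(tris.begin(), mid),
             std::vector<Triangle>(mid, tris.end()) };
}
```

Applying the split recursively until each chunk is below a target size gives pieces whose bounding boxes overlap only near the cut planes.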

Thanks for the help. I will see what I can do while waiting for the next update!