The primary reason multiple small GASes require more memory than a single large one is that each GAS carries a little bit of overhead in the form of a header. The overhead is negligible for a large mesh GAS with thousands or millions of triangles, but it can become noticeable if you build a GAS over a very small mesh. For example, you will see virtually no practical memory savings if you combine 3 meshes of 10k triangles each into a single mesh. On the other hand, you will see relatively larger memory savings if you combine 10k GASes of 3 triangles each into a single GAS with 30k triangles.
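To make the header argument concrete, here is a rough back-of-the-envelope sketch. The byte counts (`HEADER_BYTES`, `BYTES_PER_TRIANGLE`) are made-up placeholders, not real OptiX numbers; actual sizes depend on the build flags and the GPU, so only the relative trend should be read from this.

```python
# Hypothetical sizes for illustration only; these are NOT actual OptiX figures.
HEADER_BYTES = 128        # assumed fixed per-GAS overhead
BYTES_PER_TRIANGLE = 64   # assumed per-triangle cost inside a GAS


def total_bytes(num_gas, tris_per_gas):
    """Memory for num_gas separate GASes with tris_per_gas triangles each."""
    return num_gas * (HEADER_BYTES + tris_per_gas * BYTES_PER_TRIANGLE)


def savings_fraction(num_gas, tris_per_gas):
    """Fraction of memory saved by merging everything into one GAS."""
    separate = total_bytes(num_gas, tris_per_gas)
    merged = total_bytes(1, num_gas * tris_per_gas)
    return (separate - merged) / separate


# 3 GASes of 10k triangles each: the header is a tiny share of the total.
few_large = savings_fraction(3, 10_000)

# 10k GASes of 3 triangles each: the header dominates each tiny GAS.
many_small = savings_fraction(10_000, 3)

print(f"3 x 10k triangles: {few_large:.4%} saved by merging")
print(f"10k x 3 triangles: {many_small:.4%} saved by merging")
```

With these assumed sizes, merging the three large meshes saves a tiny fraction of a percent, while merging the 10k tiny GASes saves a large share of the total.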
I hoped for a guess on the AABB structure when splitting geometry.
The overhead of traversing separate overlapping hierarchies applies to any hierarchy type; it is not related to AABBs or to any OptiX implementation details. It is purely a consequence of hierarchical tree search being a logarithmic operation.
Here’s an example of the worst case. Imagine you have a GAS with a decent amount of geometry in it, say a million triangles. Suppose you want to use it as an instance, and place multiple copies of the instance in your scene. Now consider the case when you place two of these instances in almost the same position, so they overlap almost completely. In this case, a ray that passes through these two instances will need to traverse each one separately. If instead you can merge these two overlapping instances into a single GAS, then the ray can traverse the single GAS one time.
Let’s use a hypothetical binary KD tree. For 1 million triangles, in the ideal case you might expect to traverse about 20 nodes in the tree on average for a ray that hits a triangle, because 2^20 ~= 1M. If you have 2 overlapping instances and you have to traverse them independently, then you should expect to visit 20+20 = 40 nodes. If instead you took these two overlapping instances and built a single flattened hierarchy with 2 million triangles, then you could traverse this combined hierarchy by examining about 21 nodes, because doubling the triangle count adds only one more level to the tree. So traversing the overlapped instances takes approximately twice as long as traversing the GAS built from the combined meshes.
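The arithmetic above can be checked with a few lines of Python. This is only a sketch of the idealized model (a perfectly balanced binary tree with one triangle per leaf, and a ray that descends a single root-to-leaf path); real BVH traversal visits more nodes than this, and the helper name `ideal_nodes_visited` is made up for the example.

```python
import math


def ideal_nodes_visited(num_triangles):
    """Depth of an ideal balanced binary tree with one triangle per leaf."""
    return math.ceil(math.log2(num_triangles))


TRIANGLES_PER_INSTANCE = 1_000_000

# Two overlapping instances traversed independently: the cost adds up.
per_instance = ideal_nodes_visited(TRIANGLES_PER_INSTANCE)
separate = 2 * per_instance

# One merged hierarchy over both meshes: only one extra tree level.
merged = ideal_nodes_visited(2 * TRIANGLES_PER_INSTANCE)

print(f"separate instances: {separate} nodes")
print(f"merged hierarchy:   {merged} nodes")
print(f"ratio:              {separate / merged:.2f}x")
```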
You said fewer overlaps lead to higher speed and so I thought there is also a memory downside with that.
Note that the memory overhead of many small GASes and the compute overhead of traversing overlapping GASes are completely independent and separate problems. But there is a single solution that happens to solve both problems at the same time: when you combine meshes into larger meshes, you reduce both the GAS memory overhead and the traversal overlap overhead. There’s no memory downside to merging two different meshes into a single mesh, but there is a memory downside to combining two or more instances of the same mesh into a single one. In that case you’ll use almost twice the memory, because you’ll need to keep two copies of the mesh instead of using two instance nodes. You also lose scene flexibility when combining meshes. For example, if you want to animate some of the meshes independently, then merging them naively might lead to excessive BVH build times compared to maintaining separate GASes.
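The instancing trade-off can be sketched the same way. The sizes below are again illustrative assumptions (`BYTES_PER_TRIANGLE` and `INSTANCE_BYTES` are placeholders, not measured OptiX values); the point is only that an instance record is tiny compared to a second copy of the mesh.

```python
# Hypothetical sizes for illustration only; not measured OptiX values.
BYTES_PER_TRIANGLE = 64   # assumed per-triangle cost inside a GAS
INSTANCE_BYTES = 80       # assumed size of one instance record


def instanced_bytes(tris, copies):
    """One shared GAS plus one small instance record per copy."""
    return tris * BYTES_PER_TRIANGLE + copies * INSTANCE_BYTES


def merged_bytes(tris, copies):
    """Every copy baked into a single GAS, so the triangles are duplicated."""
    return tris * copies * BYTES_PER_TRIANGLE


# Two copies of a 1M-triangle mesh, instanced versus merged.
ratio = merged_bytes(1_000_000, 2) / instanced_bytes(1_000_000, 2)
print(f"merged uses {ratio:.3f}x the memory of instancing")
```

Under these assumptions, merging two copies of the same mesh costs almost exactly twice the memory of keeping one GAS and two instance nodes.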