There are 1e8 points uniformly distributed in a 3D space that ranges from (0, 0, 0) to (uint32_max, uint32_max, uint32_max). From these 1e8 points, different primitive types (triangles, spheres, AABBs) are used to construct a compacted BVH tree, where each data point is the center of a primitive. Let a = uint32_max / 1200; the triangle edge length is 3.5a, the sphere radius is 0.86a, and the AABB width is a. I know that each triangle corresponds to three points, each sphere to one point and one radius, and each AABB to two points (minimum point, maximum point). To construct a BVH tree, the sphere requires the least amount of data (only 4 float numbers), and I speculate that it has the shortest build time and the smallest memory footprint. However, the following is the build time and BVH size, where AABB has the shortest build time and the smallest size. Why is this? Why are the relative build time and BVH size of the three primitives as they are?

The acceleration structure (AS) itself is always a bounding volume hierarchy (BVH) that uses an axis-aligned bounding box (AABB) around each individual primitive and a hierarchy over those boxes.

That means the custom primitive AABB build input is the fastest to build and the smallest, because the AABBs do not need to be calculated and no additional geometric primitive data needs to be stored inside the AS.

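Generating AABBs over spheres is faster than over triangles because the calculation is simpler and needs fewer memory accesses: a sphere’s AABB is just the center minus and plus the radius, while a triangle’s AABB needs a per-axis minimum and maximum over three vertices.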
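To make the difference concrete, here is a minimal C++ sketch of the per-primitive bounds calculation for the three input types. It is not the OptiX builder's actual code; the types `Float3` and `Aabb` and the function names are hypothetical, and the byte counts assume unindexed triangles and the per-primitive layouts described in the question.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper types for illustration only (not OptiX types).
struct Float3 { float x, y, z; };
struct Aabb   { Float3 lo, hi; };

// Sphere: reads 4 floats (center + radius), then 3 subtractions and 3 additions.
inline Aabb sphereBounds(const Float3& c, float r) {
    assert(r >= 0.0f);
    return { { c.x - r, c.y - r, c.z - r },
             { c.x + r, c.y + r, c.z + r } };
}

// Triangle: reads 9 floats, then a per-axis min and max over three vertices.
inline Aabb triangleBounds(const Float3& a, const Float3& b, const Float3& c) {
    return { { std::min({ a.x, b.x, c.x }),
               std::min({ a.y, b.y, c.y }),
               std::min({ a.z, b.z, c.z }) },
             { std::max({ a.x, b.x, c.x }),
               std::max({ a.y, b.y, c.y }),
               std::max({ a.z, b.z, c.z }) } };
}

// Custom primitive: the application already supplies the box, nothing to compute.
inline Aabb customBounds(const Aabb& box) { return box; }

// Input data read per primitive (assumption: unindexed triangles, float data).
constexpr std::size_t kSphereInputBytes   = 4 * sizeof(float); // center + radius
constexpr std::size_t kTriangleInputBytes = 9 * sizeof(float); // three vertices
constexpr std::size_t kAabbInputBytes     = 6 * sizeof(float); // min + max corner
```

Note that the smaller sphere input does not translate into a smaller AS: the builder still has to derive a box per sphere and keep the sphere data needed for intersection, whereas the custom AABB input already is the box.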

The AS builder for triangle primitives is the most sophisticated of the three since triangles are the most commonly used case. More work is done to generate a potentially better BVH.

Also, the time to build an AS and its memory requirements depend on the AS build flags, AS compaction, and the underlying GPU hardware. Please always provide your system configuration information with such questions.

I would generally not recommend building a world which has extents from (0, 0, 0) to (uint32_max, uint32_max, uint32_max) if you can avoid it. That’s a huge unit range and doesn’t help floating point accuracy when actually using that scene for raytracing.
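As a rough illustration of the accuracy concern, the following C++ sketch measures the gap between neighboring 32-bit floats at a given coordinate. The helper name `spacingAbove` is hypothetical. In the upper half of the uint32 range, adjacent floats are 256 world units apart, so any coordinate stored as a float is snapped to that grid.

```cpp
#include <cmath>
#include <limits>

// Gap between a float and the next representable float above it.
inline float spacingAbove(float v) {
    return std::nextafter(v, std::numeric_limits<float>::infinity()) - v;
}

// Example coordinates: one in the upper half of the uint32 range, one near 1.
// spacingAbove(3.0e9f) is 256 world units.
// spacingAbove(1.0f) is std::numeric_limits<float>::epsilon(), about 1.19e-7.
```

With a = uint32_max / 1200 (about 3.58e6), a primitive of width a located near 3e9 is resolved in roughly 14,000 float steps per axis.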

When setting OPTIX_BUILD_FLAG_ALLOW_COMPACTION, did you also compact the AS afterwards for your charts?
Your RTX 3090 contains RT cores, and compacting the GAS on it should result in considerable memory savings.

The AS builders contain various optimizations for the different built-in primitive types.

One optimization is that the AS builder can put one triangle into multiple AABBs to reduce the volume the AABBs take. With multiple AABBs for a single triangle that also means it’s possible that the intersection and anyhit programs can be called multiple times for the same triangle, which is undesirable for algorithms gathering data per primitive inside the anyhit program.
That’s why the OPTIX_GEOMETRY_FLAG_REQUIRE_SINGLE_ANYHIT_CALL exists.
Explained here: https://raytracing-docs.nvidia.com/optix7/guide/index.html#acceleration_structures#6101

In this case, one triangle is put into multiple AABBs and a single AABB can contain multiple triangles, which reduces the number of AABBs. I think this should be the reason why triangles have a smaller memory footprint than spheres.

If you want to compare all possible AS build times and sizes, you’d need to build each AS with every allowed build flag configuration, both with and without OPTIX_GEOMETRY_FLAG_REQUIRE_SINGLE_ANYHIT_CALL.
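A benchmark along those lines could start from a plain list of configurations to measure. The C++ sketch below only enumerates them and does not call OptiX. The `BuildConfig` struct and `enumerateConfigs` function are hypothetical, it covers only a subset of the build flags, and it treats OPTIX_BUILD_FLAG_PREFER_FAST_TRACE and OPTIX_BUILD_FLAG_PREFER_FAST_BUILD as alternatives rather than combinable flags (an assumption to check against the OptiX documentation for your SDK version).

```cpp
#include <string>
#include <vector>

// Hypothetical description of one benchmark run (not an OptiX type).
struct BuildConfig {
    std::string primitive;    // which build input type is used
    std::string preference;   // build/trace preference flag, or "none"
    bool        compaction;   // OPTIX_BUILD_FLAG_ALLOW_COMPACTION set and AS compacted
    bool        singleAnyhit; // OPTIX_GEOMETRY_FLAG_REQUIRE_SINGLE_ANYHIT_CALL set
};

// Enumerate every combination of the options above.
inline std::vector<BuildConfig> enumerateConfigs() {
    const char* primitives[]  = { "triangles", "spheres", "custom AABBs" };
    const char* preferences[] = { "none",
                                  "OPTIX_BUILD_FLAG_PREFER_FAST_TRACE",
                                  "OPTIX_BUILD_FLAG_PREFER_FAST_BUILD" };
    std::vector<BuildConfig> configs;
    for (const char* prim : primitives)
        for (const char* pref : preferences)
            for (bool compaction : { false, true })
                for (bool singleAnyhit : { false, true })
                    configs.push_back({ prim, pref, compaction, singleAnyhit });
    return configs;
}
```

Each entry would then be built and timed separately, recording both the uncompacted and, where enabled, the compacted AS size.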

Which option is best depends on the application’s use case.