GAS rectangle vs triangle

I understand rectangle is the built-in shape for RT cores, but what about a GPU without RT cores? Will there be a performance difference between a triangle build and a rectangle build with OptixAabb? To represent a rectangle with two triangles, I need six float3 vertices; for a rectangle, I only need four. The memory benefit is straightforward.

Also, I went through the OptiX example code. The only example using OptixAabb builds a sphere. I still do not quite know how to build a rectangle using OptixAabb.
The following is from the programming guide, but I still don’t know how to define rectangles for d_aabbBuffer. Can you provide an example?

OptixBuildInputCustomPrimitiveArray& buildInput = buildInputs[0].aabbArray;
buildInput.type = OPTIX_BUILD_INPUT_TYPE_CUSTOM_PRIMITIVES;
buildInput.aabbBuffers = d_aabbBuffer;
buildInput.numPrimitives = numPrimitives;

I understand rectangle is the built-in shape for RT cores, but what about a GPU without RT cores?

No! The only built-in geometric primitives in OptiX are triangles and curves (linear, quadratic and cubic B-splines).

Please read the OptiX Programming Guide again. This is stated in the second sentence of the first chapter, Overview.

Will there be a performance difference between a triangle build and a rectangle build with OptixAabb?

For the acceleration structure build itself? No, not really.
But at runtime, the performance difference between built-in triangles and custom primitives will be dramatic on RTX devices because these have hardware support for ray-triangle intersection.
For devices without RT cores, OptiX contains highly optimized intersection routines for triangles (and for curves on all devices) which are hard to beat in performance at the implemented precision.

To represent a rectangle with two triangles, I need six float3 vertices; for a rectangle, I only need four. The memory benefit is straightforward.

That is actually the smaller part of possible memory savings.

First, you don’t need to specify the triangles as an independent triangle array with three vertices per triangle. You would normally define indexed triangles, which means you keep the same four vertices and use six indices to build two triangles which share two of the vertices.
This is especially efficient in fully connected triangle meshes where the internal vertices are reused for six triangles!
This is the standard method for mesh definitions in rasterizer and raytracer APIs.
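To put rough numbers on that, here is a small host-side sketch comparing the vertex and index data of the two layouts. It assumes 12-byte float3 positions and 32-bit indices; the names (independentBytes, indexedBytes, gridSizes) and the regular grid of rectangles are made up for illustration and are not part of OptiX. It only counts the input buffers, not the acceleration structure.

```cpp
#include <cassert>
#include <cstddef>

// Assumed sizes: a float3 position is 3 floats, an index is one unsigned int.
constexpr std::size_t kBytesPerVertex = 3 * sizeof(float);
constexpr std::size_t kBytesPerIndex  = sizeof(unsigned int);

// Independent triangles: every triangle stores its own three vertices.
constexpr std::size_t independentBytes(std::size_t numTriangles)
{
  return numTriangles * 3 * kBytesPerVertex;
}

// Indexed triangles: one shared vertex pool plus three indices per triangle.
constexpr std::size_t indexedBytes(std::size_t numVertices, std::size_t numTriangles)
{
  return numVertices * kBytesPerVertex + numTriangles * 3 * kBytesPerIndex;
}

struct GridSizes
{
  std::size_t independent; // bytes for the independent triangle layout
  std::size_t indexed;     // bytes for the indexed triangle layout
};

// Hypothetical fully connected grid of quadsX * quadsY rectangles,
// each rectangle split into two triangles.
inline GridSizes gridSizes(std::size_t quadsX, std::size_t quadsY)
{
  assert(quadsX > 0 && quadsY > 0);
  const std::size_t numVertices  = (quadsX + 1) * (quadsY + 1);
  const std::size_t numTriangles = quadsX * quadsY * 2;
  return { independentBytes(numTriangles), indexedBytes(numVertices, numTriangles) };
}
```

Under these assumptions a single isolated rectangle comes out at 72 bytes in both layouts (gridSizes(1, 1)), while a connected 100 x 100 grid needs 720000 bytes as independent triangles versus 362412 bytes indexed, so the saving comes from vertex sharing in connected meshes.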

The more important memory saving between these two geometric primitives is actually that a custom rectangle primitive needs only one axis-aligned bounding box (AABB), while two triangles need two AABBs. That can make quite a difference for the acceleration structure (AS) size. If the only concern is memory usage, it’s a valid method to use custom rectangle primitives with your own intersection program, at the cost of a considerable runtime performance hit from not using hardware ray-triangle intersection.

But also note that a geometry AS can be compacted, and RTX devices in particular can compress them very efficiently.

Also, I went through the OptiX example code. The only example using OptixAabb builds a sphere. I still do not quite know how to build a rectangle using OptixAabb.

All custom geometric primitives are defined by an AABB per primitive you calculate and the intersection program you implement for them.

That means you need a function which calculates the AABB over the four vertices defining your rectangle primitive. You then pass the resulting array of AABBs, one per primitive, as an OptixBuildInputCustomPrimitiveArray to the optixAccelBuild() function, as you already found in the programming guide excerpt you quoted.

The primitive index you get inside your OptiX device program when hitting any of the AABBs is the same as the index of the AABB inside that build-input array.

The following is from the programming guide, but I still don’t know how to define rectangles for d_aabbBuffer. Can you provide an example?

The calculation of the AABB for point-based primitives is super easy. You only need to find the minimum and maximum x-, y-, and z-components of all your positions per primitive. That’s all.

The OptiX SDK example optixVolumeViewer builds AABBs for boxes. It’s always the same method.

You can implement that on the host or with a native CUDA kernel on the device if it needs to be much faster.
In the end, the d_aabbBuffer must be on the device, so if you calculated it on the host you need to copy it from host to device with cudaMemcpy() or cuMemcpyHtoD(), depending on which CUDA API you use (runtime or driver API).
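As a minimal host-side sketch of that min/max calculation, assuming each rectangle is given by its four corner positions: Float3 and Aabb below are stand-ins so the snippet compiles without the CUDA and OptiX headers (Aabb mirrors the six-float layout of OptixAabb), and rectangleAabb and buildAabbs are made-up helper names.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-ins for float3 and OptixAabb so this compiles without CUDA/OptiX headers.
struct Float3 { float x, y, z; };
struct Aabb   { float minX, minY, minZ, maxX, maxY, maxZ; };

// AABB of one rectangle: component-wise minimum and maximum over its four corners.
inline Aabb rectangleAabb(const std::array<Float3, 4>& corners)
{
  Aabb aabb = { corners[0].x, corners[0].y, corners[0].z,
                corners[0].x, corners[0].y, corners[0].z };
  for (std::size_t i = 1; i < corners.size(); ++i)
  {
    aabb.minX = std::min(aabb.minX, corners[i].x);
    aabb.minY = std::min(aabb.minY, corners[i].y);
    aabb.minZ = std::min(aabb.minZ, corners[i].z);
    aabb.maxX = std::max(aabb.maxX, corners[i].x);
    aabb.maxY = std::max(aabb.maxY, corners[i].y);
    aabb.maxZ = std::max(aabb.maxZ, corners[i].z);
  }
  assert(aabb.minX <= aabb.maxX && aabb.minY <= aabb.maxY && aabb.minZ <= aabb.maxZ);
  return aabb;
}

// One AABB per rectangle, in the same order as the input primitives.
// The resulting array is what would then be copied to the device buffer.
inline std::vector<Aabb> buildAabbs(const std::vector<std::array<Float3, 4>>& rectangles)
{
  std::vector<Aabb> aabbs;
  aabbs.reserve(rectangles.size());
  for (const auto& corners : rectangles)
  {
    aabbs.push_back(rectangleAabb(corners));
  }
  return aabbs;
}
```

Because the output order matches the input order, element i of the result belongs to rectangle i, which is the same index the device programs later see as the primitive index.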

In the old OptiX API before version 7.0.0 you needed to specify a “bounding box” program for that, and before there were built-in triangles (since OptiX 6.0.0) you needed that for triangles as well. Here’s an example in one of my old OptiX 5.1 based examples:
https://github.com/nvpro-samples/optix_advanced_samples/blob/master/src/optixIntroduction/optixIntro_03/shaders/boundingbox_triangle_indexed.cu
For indexed rectangles that would obviously be the same routine just with four vertices. Port that code to the host and adjust it to your rectangle primitive definition.

The more complicated problem is how to implement an intersection program for rectangles.
If they can be arbitrary and not even planar, that is going to become interesting. If they are special cases like a parallelogram, there are examples for that inside the older OptiX SDKs.
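For the planar special case, the math of a ray-parallelogram test can be sketched in plain C++ as below. This is not the code from the older SDKs and not an OptiX intersection program; it only illustrates the technique, assuming the primitive is given as an anchor point plus two edge vectors. Vec3, Parallelogram, Hit, and intersectParallelogram are made-up names for this sketch.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

inline Vec3 operator+(const Vec3& a, const Vec3& b) { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
inline Vec3 operator-(const Vec3& a, const Vec3& b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
inline Vec3 operator*(float s, const Vec3& a)       { return { s * a.x, s * a.y, s * a.z }; }
inline float dot(const Vec3& a, const Vec3& b)      { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline Vec3 cross(const Vec3& a, const Vec3& b)
{
  return { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
}

// Hypothetical primitive layout: corner point plus the two edges leaving it.
struct Parallelogram { Vec3 anchor, v1, v2; };

// a1 and a2 are the coordinates of the hit point along v1 and v2, both in [0, 1].
struct Hit { bool valid; float t, a1, a2; };

inline Hit intersectParallelogram(const Vec3& origin, const Vec3& direction,
                                  const Parallelogram& p, float tmin, float tmax)
{
  const Hit miss = { false, 0.0f, 0.0f, 0.0f };

  const Vec3  n  = cross(p.v1, p.v2); // unnormalized plane normal
  const float nn = dot(n, n);
  assert(nn > 0.0f);                  // edges must not be parallel or zero length

  const float denom = dot(n, direction);
  if (std::fabs(denom) < 1.0e-12f) return miss; // ray parallel to the plane

  const float t = dot(n, p.anchor - origin) / denom;
  if (!(t > tmin && t < tmax)) return miss;      // outside the ray interval

  // Express the hit point as anchor + a1 * v1 + a2 * v2.
  const Vec3  vi = (origin + t * direction) - p.anchor;
  const float a1 = dot(cross(vi, p.v2), n) / nn;
  const float a2 = dot(cross(p.v1, vi), n) / nn;
  if (a1 < 0.0f || a1 > 1.0f || a2 < 0.0f || a2 > 1.0f) return miss;

  return { true, t, a1, a2 };
}
```

In a real intersection program, the accepted hit would be reported with optixReportIntersection() instead of being returned, and a1 and a2 could be passed along as attributes, for example as texture coordinates.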

With all that said, I seriously recommend using indexed triangles to represent your rectangles, simply for performance reasons and because you don’t need to implement a ray-rectangle intersection routine.

(The OptiX 7 SDK’s optixPathTracer example defines the Cornell Box with rectangles but then uses two independent triangles (not indexed) to build the acceleration structures.
The one in the older OptiX SDKs used a parallelogram primitive for the area light, but that is defined by an anchor point and two edge vectors.)

Thanks for all the answers. It is very useful.
1. I knew the built-in geometric primitives in OptiX are triangles, that was a typo … sorry. That is why I asked about the performance vs. memory trade-off.
2. So, to conclude from your answer, using triangles will be faster with or without RT cores. But the AS size of two triangles will be larger than that of one custom primitive for a rectangle? Can I say the memory usage is double?
3. Is there an example for indexed triangles? The following is a simple rectangle I learned from the optixPathTracer example.
How can I turn it into indexed triangles?
const std::array<float3, 6> vertices =
{ {
{ -1.0f, 1.0f, 0.0f },
{ -1.0f, -1.0f, 0.0f },
{ 1.0f, 1.0f, 0.0f },
{ -1.0f, -1.0f, 0.0f },
{ 1.0f, 1.0f, 0.0f },
{ 1.0f, -1.0f, 0.0f }
} };

1. I knew the built-in geometric primitives in OptiX are triangles, that was a typo … sorry.

Focus! :-)

  2. So, to conclude from your answer, using triangles will be faster with or without RT cores. But the AS size of two triangles will be larger than that of one custom primitive for a rectangle? Can I say the memory usage is double?

I don’t think so, but I haven’t made that experiment. If you compact the GAS, there can be considerable savings, depending on the size and spatial structure of the primitives. RTX devices compact better. YMMV.

  3. Is there an example for indexed triangles? The following is a simple rectangle I learned from the optixPathTracer example.
    How can I turn it into indexed triangles?

You would simply convert the hardcoded geometry data from independent triangles (the six vertices) to the four corners of the rectangle (use counter-clockwise winding). Then generate an index buffer which references into that vertex pool. The indices should be ordered (0, 1, 2) and (2, 3, 0) to build two triangles from the rectangle.

const std::array<float3, 4> rectangle_vertices =
{ {
  { -1.0f, -1.0f, 0.0f }, // starting at bottom left and going counter-clockwise for front faces
  {  1.0f, -1.0f, 0.0f },
  {  1.0f,  1.0f, 0.0f },
  { -1.0f,  1.0f, 0.0f }
} };

const std::array<unsigned int, 6> rectangle_indices =
{
  { 0, 1, 2,  // First triangle primitive.
    2, 3, 0 } // Second triangle primitive.
};

// If you have more vertices in that pool, build the triangle indices with the according base vertex index offset.
// That rectangle lies in the xy-plane, means the geometric and shading normal on that is the positive z-axis (0.0f, 0.0f, 1.0f) for both triangles.
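The base vertex offset mentioned in the comment above can be sketched like this, assuming the vertex pool stores the four corners of each rectangle consecutively in counter-clockwise order and rectangles do not share vertices. buildRectangleIndices is a made-up helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Builds the index buffer for numRectangles rectangles whose corners are
// stored consecutively in the vertex pool (4 vertices per rectangle).
inline std::vector<unsigned int> buildRectangleIndices(std::size_t numRectangles)
{
  std::vector<unsigned int> indices;
  indices.reserve(numRectangles * 6);

  for (std::size_t r = 0; r < numRectangles; ++r)
  {
    // Index of this rectangle's first corner inside the vertex pool.
    const unsigned int base = static_cast<unsigned int>(r * 4);

    // First triangle (0, 1, 2) relative to the base vertex.
    indices.push_back(base + 0);
    indices.push_back(base + 1);
    indices.push_back(base + 2);

    // Second triangle (2, 3, 0) relative to the base vertex.
    indices.push_back(base + 2);
    indices.push_back(base + 3);
    indices.push_back(base + 0);
  }

  assert(indices.size() == numRectangles * 6);
  return indices;
}
```

For two rectangles this yields 0, 1, 2, 2, 3, 0 followed by 4, 5, 6, 6, 7, 4. With this layout, triangle primitives 2 * r and 2 * r + 1 belong to rectangle r.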

All my OptiX 7 examples are using indexed triangles https://github.com/NVIDIA/OptiX_Apps

The simplest is the runtime generation of a cube with 12 triangles, which does exactly what I described above and generates the indices at the end:
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/intro_runtime/src/Box.cpp

Just note that my VertexAttributes hold vertex position, tangent, normal, and texcoord attributes. You can ignore the ones you don’t need. Concentrate on the lines assigning attrib.vertex.
Also look at the Plane.cpp, Sphere.cpp, and Torus.cpp files for more runtime generated geometries.

Code which builds a geometry acceleration structure from that host data:
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/intro_runtime/src/Application.cpp#L1331

With compaction in one of the more advanced examples (also using a simple arena allocator):
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/nvlink_shared/src/Device.cpp#L1189