SBT theoretical questions

Hello all,

I have a couple of questions regarding the SBT in OptiX. They may be somewhat theoretical in nature, but I hope someone has an answer.

  1. Is an SBT required for every program (shader) you write? Say I have a __anyhit__ shader that will not even use an SBT, is it still required that I create a SBT for it? A sort of “dummy” SBT.
  2. How can I combine multiple SBT as part of a hit group that contain different numbers of elements? For example, if I have an SBT defined for a __closesthit__ program and another SBT for __anyhit__ whereby both are built using the same underlying struct but have different number of elements. As follows:

struct CHData {
float3 result;
int cnt;
};
struct AHData {
float3 result;
};
template
struct SbtRecord {
__align__( OPTIX_SBT_RECORD_ALIGNMENT )
char header[OPTIX_SBT_RECORD_HEADER_SIZE];
T *data;
};
typedef SbtRecord SbtCH;
typedef SbtRecord SbtAH;

SbtCH ch_sbt;
ch_sbt.data.result = { 0.0f, 0.0f, 0.0f };
ch_sbt.data.cnt = 1;

SbtAH ah_sbt;
ah_sbt.data.result = { 0.0f, 0.0f, 0.0f };

How could I combine SbtAH and SbtCH as part of hit group record when building SBT? Or is this even possible?

Thank you in advance for any help.

Make sure you do not mix up the terminology.

The Shader Binding Table (SBT) contains SBT Records which define the connection between your scene data and shader invocations for any rays.

Say I have a __anyhit__ shader that will not even use an SBT, is it still required that I create a SBT for it? A sort of “dummy” SBT.
How can I combine multiple SBT as part of a hit group that contain different numbers of elements? For example, if I have an SBT defined for a __closesthit__ program and another SBT for __anyhit__ whereby both are built using the same underlying struct but have different number of elements.

Anyhit shaders are not an isolated thing but part of a hit record entry, which consists of intersection, anyhit, and closesthit shaders. That means there cannot be differently sized SBT record structs assigned to the anyhit and closesthit shaders of the same hit record in an SBT.

(Though not all three shaders are required, e.g. a hit record assigned to built-in triangle geometry doesn’t have a user defined intersection shader. Opaque geometry without cutout opacity does not need an anyhit program. Visibility rays on such opaque objects need neither anyhit nor closesthit programs and can handle the test with a miss shader.
All unused shaders have nullptr values for the module and entryFunctionName in their respective OptixProgramGroupDesc field.)

Each of the five OptixShaderBindingTable record fields must use a single SBT record size, since that defines the stride of the SBT records in that array.
That means even if you never read the T data field of your SbtRecord struct via optixGetSbtDataPointer() on the device side, the space for it still needs to be present in the arrays of SbtRecord.
How you interpret the pointer returned by optixGetSbtDataPointer() is your decision. The data field will be the same for closesthit and anyhit though.
(You should also make sure you can never source uninitialized data from there.)

Is an SBT record data field required for every program (shader) you write?

No. You can have different SBT record structures in the five different OptixShaderBindingTable record fields.
You need to have room for SBT records in the SBT for all entries, so that the SBT index calculation formula works for each ray. And every SBT record which can actually be called must have the correct header field, because it identifies the shaders to call, even if it’s a null shader.
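For reference, the hit group index calculation mentioned above can be sketched on the host as follows. This is a minimal illustration based on the formula in the OptiX programming guide (instance offset + GAS-local index * stride from the trace call + offset from the trace call). The function and parameter names are made up for this sketch and are not OptiX API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper mirroring the documented hit group index calculation:
//   sbt-index = sbt-instance-offset
//             + (sbt-GAS-index * sbt-stride-from-trace-call)
//             + sbt-offset-from-trace-call
std::uint32_t hitGroupSbtIndex(std::uint32_t instanceSbtOffset, // OptixInstance sbtOffset
                               std::uint32_t gasSbtIndex,       // index of the build input's SBT record inside the GAS
                               std::uint32_t traceSbtStride,    // SBT stride argument of optixTrace
                               std::uint32_t traceSbtOffset)    // SBT offset argument of optixTrace
{
  return instanceSbtOffset + gasSbtIndex * traceSbtStride + traceSbtOffset;
}

// Byte address of the selected record inside the hit group record array.
std::uint64_t hitGroupRecordAddress(std::uint64_t recordBase,    // hitgroupRecordBase
                                    std::uint32_t sbtIndex,
                                    std::uint32_t strideInBytes) // hitgroupRecordStrideInBytes
{
  return recordBase + static_cast<std::uint64_t>(sbtIndex) * strideInBytes;
}

// Every index a ray can produce must stay below hitgroupRecordCount.
bool isIndexInRange(std::uint32_t sbtIndex, std::uint32_t recordCount)
{
  return sbtIndex < recordCount;
}
```

For example, an instance with sbtOffset 4, a second SBT record in its GAS (index 1), and a trace call with stride 2 and offset 1 selects record 7, so the hit group array needs at least 8 records.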

For example my applications have no data on non-hit records, because ray generation and miss shaders can source their data directly from the single launch parameters structure.
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/nvlink_shared/inc/Device.h#L183

Have a look at the very simple optixTriangle SDK example which shows something like that as well. (No data on the raygen and hit record and a color on the miss record.)

Thank you @droettger for the response.

There is a lot of information to take in - truly the SBT must be the most confusing part of OptiX.

So, if I want to store the SbtCH and SbtAH into a CUdeviceptr hitRecord (as part of a single hit group record) I can’t do that as the code snippet shows?

Sorry if this is a dumb statement/question but I am trying to learn.

Thanks again

Yes, the SBT is a flexible thing.

Oh, I didn’t realize that you actually missed the template typename and arguments in the code and data is not a pointer but the struct itself.
It should have looked like this:

template<typename T>
struct SbtRecord {
  __align__( OPTIX_SBT_RECORD_ALIGNMENT ) char header[OPTIX_SBT_RECORD_HEADER_SIZE];
  T data;
};

typedef SbtRecord<CHData> SbtCH;
typedef SbtRecord<AHData> SbtAH;

but as explained, differentiating this for closesthit and anyhit programs doesn’t make sense because they are both in the same SBT hit group records.

That means you simply combine the two: use CHData as the data, and if only the closesthit program touches the cnt field, then so be it.

struct HitData {
  float3 result; // Probably not the best name for an input value. :-)
  int cnt;
};

template<typename T>
struct SbtRecord {
  __align__( OPTIX_SBT_RECORD_ALIGNMENT ) char header[OPTIX_SBT_RECORD_HEADER_SIZE];
  T data;
};

typedef SbtRecord<HitData> SbtRecordHitData;

and then you can allocate a buffer of these on the device, getting a CUdeviceptr, copy the necessary data from host to device into the header and data fields, assign the CUdeviceptr to the OptixShaderBindingTable hitgroupRecordBase field, set hitgroupRecordStrideInBytes and hitgroupRecordCount correctly, and that’s it.
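A rough host-side sketch of those steps, written in plain C++ so it compiles without the OptiX SDK. The two constants and Float3 are stand-ins for OPTIX_SBT_RECORD_ALIGNMENT, OPTIX_SBT_RECORD_HEADER_SIZE, and CUDA's float3. HitGroupTable and buildHitGroupTable are hypothetical names used only for this illustration. In a real application the header would be filled with optixSbtRecordPackHeader() and the array uploaded to device memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Stand-ins for the values from the OptiX headers (16 and 32 bytes).
constexpr std::size_t kSbtRecordAlignment  = 16; // OPTIX_SBT_RECORD_ALIGNMENT
constexpr std::size_t kSbtRecordHeaderSize = 32; // OPTIX_SBT_RECORD_HEADER_SIZE

struct Float3 { float x, y, z; }; // stand-in for CUDA's float3

struct HitData {
  Float3 result;
  int    cnt;
};

template <typename T>
struct SbtRecord {
  alignas(kSbtRecordAlignment) char header[kSbtRecordHeaderSize];
  T data;
};

using SbtRecordHitData = SbtRecord<HitData>;

// Hypothetical container for what ends up in the hit group fields of the SBT.
struct HitGroupTable {
  std::vector<SbtRecordHitData> records; // host copy, to be uploaded to a CUdeviceptr
  std::size_t strideInBytes = 0;         // -> hitgroupRecordStrideInBytes
  std::size_t count         = 0;         // -> hitgroupRecordCount
};

HitGroupTable buildHitGroupTable(std::size_t numRecords)
{
  HitGroupTable table;
  table.records.resize(numRecords);
  for (SbtRecordHitData& record : table.records) {
    // Placeholder: a real application calls optixSbtRecordPackHeader() here.
    std::memset(record.header, 0, kSbtRecordHeaderSize);
    record.data = HitData{ { 0.0f, 0.0f, 0.0f }, 1 };
  }
  table.strideInBytes = sizeof(SbtRecordHitData);
  table.count         = numRecords;
  // The stride must be a multiple of the SBT record alignment.
  assert(table.strideInBytes % kSbtRecordAlignment == 0);
  return table;
}
```

With these stand-ins the record is 48 bytes: 32 bytes of header plus 16 bytes of HitData.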

(All the green text in the posts are links. Follow them for more information.)

Thank you @droettger for the information, much appreciated.

So if I had multiple hit “types”, say instead of just closest hit or any hit I also have an __intersection__ program, it might be best to combine them into a single type of SBT to store in a CUdeviceptr and let DEVICE side code (shader) define how to use the SBT.

Thanks again and sorry for the questions on a holiday weekend.

No holiday on my side of the Atlantic. :-)

So if I had multiple hit “types”, say instead of just closest hit or any hit I also have an __intersection__ it might be best to combine them into a single type of SBT to store in a CUdeviceptr and let DEVICE side code (shader) define how to use the SBT.

That’s the whole point. Intersection, anyhit, and closesthit programs together form one OptixProgramGroupHitgroup.

There are these five OptixShaderBindingTable record array fields, namely:
raygenRecord (single entry),
exceptionRecord (single entry),
missRecordBase (one per ray type),
hitgroupRecordBase (as many as you need to express your desired mapping of scene graph elements to hit groups times ray types),
callablesRecordBase (one per direct or continuation callable).

That means hitgroupRecordBase holds a CUdeviceptr to an array of SBT records with hitgroupRecordCount elements and a stride of hitgroupRecordStrideInBytes per element, which must be a multiple of OPTIX_SBT_RECORD_ALIGNMENT (16 bytes) and at least OPTIX_SBT_RECORD_HEADER_SIZE (32 bytes) big.

All your SBT hit records must fit into that stride, which means the biggest hit record defines the layout of the whole array. Think of it as a union of structures.
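To make the "biggest record defines the stride" rule concrete, here is a small sketch in plain C++. The two data structs are hypothetical examples, and the constants are stand-ins for the OptiX values quoted above (16 and 32 bytes).

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kSbtRecordAlignment  = 16; // stand-in for OPTIX_SBT_RECORD_ALIGNMENT
constexpr std::size_t kSbtRecordHeaderSize = 32; // stand-in for OPTIX_SBT_RECORD_HEADER_SIZE

// Hypothetical hit record data for two different hit groups.
struct TriangleHitData {
  const float*    positions;
  const unsigned* indices;
  int             materialIndex;
};

struct CurveHitData {
  const float*    controlPoints;
  const float*    widths;
  const unsigned* segmentIndices;
  float           tint[3];
};

template <typename T>
struct SbtRecord {
  alignas(kSbtRecordAlignment) char header[kSbtRecordHeaderSize];
  T data;
};

constexpr std::size_t roundUp(std::size_t value, std::size_t multiple)
{
  return (value + multiple - 1) / multiple * multiple;
}

constexpr std::size_t maxOf(std::size_t a, std::size_t b)
{
  return a > b ? a : b;
}

// The stride of the hit group array is the size of the largest record type.
constexpr std::size_t hitGroupStride()
{
  return roundUp(maxOf(sizeof(SbtRecord<TriangleHitData>),
                       sizeof(SbtRecord<CurveHitData>)),
                 kSbtRecordAlignment);
}

// Checks the two constraints on the stride stated above.
constexpr bool isValidStride(std::size_t stride)
{
  return stride >= kSbtRecordHeaderSize && stride % kSbtRecordAlignment == 0;
}

static_assert(isValidStride(hitGroupStride()), "stride violates the SBT record constraints");
```

Every record in the array is then written at a multiple of hitGroupStride(), regardless of which of the two data layouts it carries.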

Now, OptiX doesn’t care how you interpret the pointer to the data field in an SBT record returned by optixGetSbtDataPointer(), which means each hit group record can interpret it differently if you like.
Since the SBT record header always has the same size, the data field is always at the same offset in the SBT record structure.
That means if you really have different hit record structures for different OptixProgramGroupHitgroup entries (== different combinations of intersection+anyhit+closesthit programs), you can do that by carefully filling in the respective data fields and correctly interpreting the pointer returned by optixGetSbtDataPointer().
Mind that doing this requires matching anyhit and closesthit program behavior.
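The idea can be simulated on the host with raw byte records. In this sketch sbtDataPointer() is a hypothetical stand-in for what optixGetSbtDataPointer() returns on the device, namely the address just past the 32 byte header. The data structs, the fixed 64 byte stride, and the helper names are assumptions made for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

constexpr std::size_t kSbtRecordAlignment  = 16; // stand-in for OPTIX_SBT_RECORD_ALIGNMENT
constexpr std::size_t kSbtRecordHeaderSize = 32; // stand-in for OPTIX_SBT_RECORD_HEADER_SIZE
constexpr std::size_t kStride              = 64; // header plus room for the largest data struct

// Two hypothetical data layouts used by different hit groups.
struct TriangleHitData { int materialIndex; float scale; };
struct CurveHitData    { float radius; float tint[3]; int segments; };

// One untyped record slot of the hit group array.
struct RawRecord {
  alignas(kSbtRecordAlignment) unsigned char bytes[kStride];
};

// Stand-in for optixGetSbtDataPointer(): the data always starts right after the header.
const unsigned char* sbtDataPointer(const RawRecord& record)
{
  return record.bytes + kSbtRecordHeaderSize;
}

// Host side: fill the data field of a record with a given layout.
template <typename T>
void writeData(RawRecord& record, const T& data)
{
  static_assert(sizeof(T) <= kStride - kSbtRecordHeaderSize, "data does not fit into the stride");
  std::memcpy(record.bytes + kSbtRecordHeaderSize, &data, sizeof(T));
}

// "Device" side: the programs of a hit group decide how to interpret the data.
template <typename T>
T readData(const RawRecord& record)
{
  T data;
  std::memcpy(&data, sbtDataPointer(record), sizeof(T));
  return data;
}
```

The anyhit and closesthit programs of one hit group must agree on the type they read, which is the "matching behavior" mentioned above.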

Even when having different geometric primitives, I have not required different hit record structures in my applications.
E.g. let’s say when using built-in triangles and built-in curves, I store the vertex attributes and indices array pointers in the hit group record, plus some index or pointer to the material parameters. So while the closesthit programs for these work vastly differently, the hit record has exactly the same structure. Keeping it simple.

But if, for example, the material parameters should be part of the SBT hit record, then that would be similar to your case with rather different parameters (data structure) for triangle and curve (hair) shaders. Again, the bigger one defines the hit record stride.

It makes perfect sense thinking of it as a union of structs :). You define how the data variable is employed.

So even if I have different parameters for different hit (e.g. an intersection versus a closest hit) the larger one defines the record stride - I get it now.

Thank you a million times. If I am correct, I see it clearly now.

So even if I have different parameters for different hit (e.g. an intersection versus a closest hit) the larger one defines the record stride - I get it now.

Again, please drop that idea of having different hit record data structures for the intersection, anyhit, and closesthit programs.
These three belong together into a single OptixProgramGroup and consequently use the exact same hit record data structure, of which not all programs necessarily need to read all fields.

For example, the intersection program might only need to read the primitive indices and vertex positions to calculate the intersection, while the closesthit program reads these as well to calculate the final shading attributes, but normally also reads additional data in that hit record like material parameters for example, which are irrelevant for the intersection.

Only different combinations of those three program types IS+AH+CH can have different hit records, like in my example, a hit record for built-in triangle geometry (which does not have a user defined intersection program) and a different hit record for curve primitives (which have a pre-defined intersection program you need to put into the OptixProgramGroup).

That means the union (== maximum size of all SBT hit record types used in a pipeline) of these two completely different sets of IS+AH+CH programs defines the stride of the hit group records array in the hitgroupRecordBase of the OptixShaderBindingTable. That’s all.

Got it. It is all very clear to me now.

I appreciate your patience with my understanding. Thank you again for all your assistance.

No problem. The shader binding table is a central topic which must be understood or nothing will work.

Note that when I talked about reading the vertex attributes and indices data from an SBT hit record above, I was thinking of an SBT with a hit record per OptixInstance.
Without an instance acceleration structure (IAS), that is, when using only a single geometry acceleration structure (GAS) as the scene, things will be a lot less flexible and the SBT hit records can naturally be used to store the required data.

I use an IAS->GAS scene structure in my OptiX 7 examples and an SBT entry per instance because that allowed changing shaders (SBT hit record header) per instance without updating the IAS because nothing changed in the OptixInstances.
This is kind of wasteful when having few materials and many instances (e.g. millions).
I would recommend implementing that differently today to simplify the SBT handling.

There is another elegant method to index into the SBT for the material and hold the per instance data separately.
The SBT hit records only need to contain the 32 byte header information which selects the material programs defining the material behavior (bi-directional scattering distribution function, BSDF), but none of the input parameters.
Which material is used can be directly selected with the OptixInstance sbtOffset value.

All other data required to define an OptixInstance’s geometry topology (vertex attributes and indices) and any other data like material parameters can be stored in separate device memory arrays and uniquely indexed by the user defined OptixInstance instanceId field which can be read inside the device code with optixGetInstanceId() which is available in IS, AH, and CH programs.
Note that there is also optixGetInstanceIndex(), which returns the zero based index inside an IAS. That means when the scene uses a single IAS as the root (which implies OPTIX_TRAVERSABLE_GRAPH_FLAG_ALLOW_SINGLE_LEVEL_INSTANCING), that would be another unique per instance index which could be used to access custom data in device memory.

Now, changing material parameters of such an instance would only need to update the material parameter buffer of that instance.
(That’s normally done by copying data from host to device with CUDA host functions, but that could even be done in a kernel on the device since it’s just some device memory pointer. Just in case you’d ever need to animate a huge number of material parameters programmatically.)

Switching the instance’s material shader (BSDF) would need to set a different sbtOffset and update the IAS with optixAccelBuild(). Though that’s a pretty fast operation.

That would provide the smallest possible shader binding table while keeping all the remaining flexibility offered by OptixInstances.

Think of this example:
A material library has three different shaders for wood: matte, oiled, coated.
But there are many different kinds of wood types (textures) in that material library and each can be used with the matte, oiled, and coated wood shaders.
That means you would only need three hit records in the SBT for the different shaders, but can have arbitrarily many material instances with different input parameters when indexing these via the OptixInstance instanceId.
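A host-side sketch of that wood example, assuming a single ray type so that the instance sbtOffset directly selects the shader. The Instance struct only mirrors the sbtOffset and instanceId fields of OptixInstance. MaterialParams, Scene, and the helper functions are hypothetical names for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// One SBT hit record (header only) per wood shader.
enum WoodShader : std::uint32_t { WOOD_MATTE = 0, WOOD_OILED = 1, WOOD_COATED = 2 };
constexpr std::uint32_t kNumHitRecords = 3;

// Hypothetical per-instance material parameters, stored outside the SBT.
struct MaterialParams {
  std::string texture;
  float       roughness;
};

// Mirrors the two OptixInstance fields relevant here.
struct Instance {
  std::uint32_t sbtOffset;  // selects the shader (hit record)
  std::uint32_t instanceId; // indexes the separate parameter array
};

struct Scene {
  std::vector<Instance>       instances;
  std::vector<MaterialParams> params; // would live in a device buffer
};

// Adds an instance and stores its parameters at the index given by its instanceId.
void addInstance(Scene& scene, WoodShader shader, const MaterialParams& params)
{
  const std::uint32_t id = static_cast<std::uint32_t>(scene.params.size());
  scene.params.push_back(params);
  scene.instances.push_back(Instance{ shader, id });
}

// What a closesthit program would do with optixGetInstanceId() on the device.
const MaterialParams& paramsFor(const Scene& scene, const Instance& instance)
{
  return scene.params[instance.instanceId];
}

Scene buildWoodScene()
{
  Scene scene;
  addInstance(scene, WOOD_MATTE,  MaterialParams{ "oak",    0.8f });
  addInstance(scene, WOOD_OILED,  MaterialParams{ "walnut", 0.4f });
  addInstance(scene, WOOD_COATED, MaterialParams{ "oak",    0.1f });
  addInstance(scene, WOOD_OILED,  MaterialParams{ "cherry", 0.5f });
  return scene;
}
```

The SBT stays at three hit records no matter how many instances are added. Only the parameter array grows.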

The design of the SBT boils down to the available inputs for the SBT index calculation formula and what an application requires.
https://raytracing-docs.nvidia.com/optix7/guide/index.html#shader_binding_table#acceleration-structures
It’s your choice how to architect that in the application. The SBT is very flexible to allow handling of different application requirements, and we haven’t even talked about multiple SBT entries per GAS.

Thanks for the information @droettger

This smallest possible shader binding table architecture is now shown inside the rtigo10 example added to the OptiX 7 Advanced Examples:
https://forums.developer.nvidia.com/t/optix-advanced-samples-on-github/48410/8