Multiple pipelines and shared instancing

Hi,

I have multiple pipelines, each with its own SBT, and none of the shaders are common between them, but the geometry they all run on is the same. I am trying to set up a two-level IAS → GAS structure, but am struggling with how this fits in.

If I understand correctly, each OptixInstance needs the SBT offset which could be different for each of my SBTs.

Does this mean the only way is to have a separate context/structure for each pipeline? I am using Optix 7.2.

Thanks!


If I understand correctly, each OptixInstance needs the SBT offset which could be different for each of my SBTs.

The question is then, what is the layout of your SBT and how can you change it to not have this problem.

The only problematic case is when the SBTs have different numbers of hit records where the instance sbtOffset selects the shader. But there are many other possible SBT layouts.

There are three ways to solve this:

  1. Pragmatic: Build a different IAS for each SBT to have the sbtOffsets match to your resp. SBT layouts. The GAS are all shared so this isn’t too bad.
  2. If possible make the SBTs all the same size and layout by duplicating the shader headers and potentially other SBT record data to match the sbtOffsets of the biggest SBT.
  3. Change the SBT to have one entry per instance by assigning the shader header and additional record data for each instance with its unique sbtOffset. That allows to switch SBTs or freely exchange shaders and data per instance without rebuilding the IAS. (I’m doing that in my OptiX 7 examples. Links in the sticky posts.)

There would be even more ways, like folding all shaders into one pipeline and setting up the SBT with the necessary strides, so that the instances use the same offsets while the different raygen programs each use their own “ray type” offset in the optixTrace call. However, there can only be one ray generation program inside the SBT at a time, and switching the pointer or shader header means a synchronization.
There are also performance considerations to keep in mind:
https://forums.developer.nvidia.com/t/how-to-handle-multiple-ray-generators/83446
https://forums.developer.nvidia.com/t/multiple-raygen-functions-within-same-pipeline-in-optix-7/122305
I would keep different pipelines and SBTs if the shaders are completely different.

There are three ways to solve this:

  1. Pragmatic: Build a different IAS for each SBT to have the sbtOffsets match to your resp. SBT layouts. The GAS are all shared so this isn’t too bad.
  2. If possible make the SBTs all the same size and layout by duplicating the shader headers and potentially other SBT record data to match the sbtOffsets of the biggest SBT.
  3. Change the SBT to have one entry per instance by assigning the shader header and additional record data for each instance with its unique sbtOffset. That allows to switch SBTs or freely exchange shaders and data per instance without rebuilding the IAS. (I’m doing that in my OptiX 7 examples. Links in the sticky posts.)

For #1, this would also require having separate contexts for each IAS to associate with the different launches right?

For #2, is there a performance cost associated with essentially padding the SBTs to all have the same size? Basically, for one of the pipelines I have 2 ray types and the other ones have 1 ray type.

For #3, if I understand the example correctly, this would require updating the SBT before each launch?

I think keeping the pipelines and SBTs separate is cleaner since they really share nothing other than the geometry (and some material parameters).

Sorry for jumping in.
This is an absolutely interesting topic.

I have been making an OptiX wrapper library and recently thinking about multiple pipelines but with shared scene geometry.
My library aims to decouple pipeline and scene as much as possible, but I have not reached a final conclusion.
Currently I take approach #2.
So all the kernels from different pipelines need to use the same maximum number of ray types as the SBTstride argument. I don’t think this is the way it should be.

Thanks for the interesting topic :)

For #1, this would also require having separate contexts for each IAS to associate with the different launches right?

Not sure why you think that. An OptiX 7 context controls a single GPU. (The first chapter in the Programming Guide explains that.) Means there is no need to have multiple OptiX contexts on one GPU at all.
You only need multiple CUDA and OptiX contexts in an application when targeting multiple GPU devices. OptiX 7 knows nothing about multi-GPU, that is all managed with CUDA host code. My examples show how.
An IAS is just a 64-bit handle and a buffer with the AS data. You can have as many IAS in an OptiX 7 context as fit into your VRAM.
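
To make the “different IAS per SBT, shared GAS” option more concrete, here is a minimal host-side sketch in plain C++. `InstanceDesc` and `makeInstances` are hypothetical names used only for illustration; `InstanceDesc` stands in for the two `OptixInstance` fields that matter here (`traversableHandle` and `sbtOffset`). It assumes every instance owns a fixed number of consecutive hit records in the given SBT. Real code would fill `OptixInstance` structs and build each IAS with `optixAccelBuild`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-in for the OptixInstance fields relevant to this discussion (hypothetical type).
struct InstanceDesc {
    std::uint64_t traversableHandle; // GAS handle, identical in every IAS
    std::uint32_t sbtOffset;         // depends on the SBT layout of one pipeline
};

// Build the instance list for one pipeline's IAS. The GAS handles are reused,
// only the sbtOffsets are spaced according to that pipeline's SBT layout.
std::vector<InstanceDesc> makeInstances(const std::vector<std::uint64_t>& gasHandles,
                                        std::uint32_t recordsPerInstance)
{
    assert(recordsPerInstance > 0);
    std::vector<InstanceDesc> instances;
    instances.reserve(gasHandles.size());
    std::uint32_t offset = 0;
    for (std::uint64_t handle : gasHandles) {
        instances.push_back({handle, offset});
        offset += recordsPerInstance;
    }
    return instances;
}
```

Calling `makeInstances(handles, 2)` for a pipeline with two hit records per instance and `makeInstances(handles, 1)` for a pipeline with one gives two instance arrays that reference the same GAS handles and differ only in `sbtOffset`.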

For #2, is there a performance cost associated with essentially padding the SBTs to all have the same size? Basically, for one of the pipelines I have 2 ray types and the other ones have 1 ray type.

I have not benchmarked that specifically. I wouldn’t expect a performance difference due to the number of SBT entries alone. Think of SBTs as jump tables, indexing into them is not expensive.
What makes a difference is the pipeline when adding more shaders.
Please read all posts I linked to. Those explain potential differences and synchronization points which affect performance.

With a pipeline using one and a pipeline using two ray types I see no problem having two different SBTs.
Mind that sbtOffsets are handled with a stride, so having one, two, or three ray types combined in an SBT wouldn’t necessarily mean different sbtOffsets in the instance.

For #3, if I understand the example correctly, this would require updating the SBT before each launch?

No, that would be too costly. I was still thinking about having different SBTs for the two pipelines.
My point was to make the sbtOffsets inside the IAS constant. That was one of the main questions about your SBT layout. If the sbtOffsets are not changing between SBTs, then you can use the same IAS for multiple pipelines and SBTs (or build the SBT to fit.)
But if you need completely different sbtOffsets, like one pipeline has multiple hit records for different materials and the other only one (e.g. ambient occlusion would only need one shader for all) then you would have completely different SBT layouts and sbtOffset.

Still all three solutions would be possible then. There are even more possibilities if the behavior of the ray types could be folded into shaders as special cases. Means there might not be a need for the second pipeline and SBT at all, at the cost of deciding between these cases at runtime depending on some per-ray flag. There is not enough information to say what other options exist in your case.

I think keeping the pipelines and SBTs separate is cleaner since they really share nothing other than the geometry (and some material parameters).

Yes, if the instance sbtOffsets are the same, having different pipelines and SBTs should be the most efficient case. Note that optixLaunch calls are asynchronous and this would require no synchronization between launches.

Currently I take approach #2.
So all the kernels from different pipelines need to use the same maximum number of ray types as the SBTstride argument. I don’t think this is the way it should be.

If the IAS do not change, having different pipelines and SBTs would decouple things even more.
That would simplify adding other pipelines (e.g. different light transport algorithms).
You would just need to do the same creation and update mechanisms per pipeline and SBT instead.
Again, also read the links I posted. The differences in pipelines have been discussed before.

Thanks for taking the time to explain, it is helpful. I’m not sure why I thought I needed multiple contexts, but I see how I could use multiple IAS easily.

I think I’m still struggling to understand what you mean here:

No, that would be too costly. I was still thinking about having different SBTs for the two pipelines.
My point was to make the sbtOffsets inside the IAS constant. That was one of the main questions about your SBT layout. If the sbtOffsets are not changing between SBTs, then you can use the same IAS for multiple pipelines and SBTs (or build the SBT to fit.)
But if you need completely different sbtOffsets, like one pipeline has multiple hit records for different materials and the other only one (e.g. ambient occlusion would only need one shader for all) then you would have completely different SBT layouts and sbtOffset.

Still all three solutions would be possible then. There are even more possibilities if the behavior of the ray types could be folded into shaders as special cases. Means there might not be a need for the second pipeline and SBT at all, at the cost of deciding between these cases at runtime depending on some per-ray flag. There is not enough information to say what other options exist in your case.

In my particular case, I have a camera with 2 ray types (radiance and shadow) and another type of sensor that only has one ray type (which has a single closest hit shader and a miss shader). I think I’m confused how the sbtOffset can be constant for the IAS while still working for both of these SBTs.

I suppose that the shadow rays don’t technically need a shader? In that case, I guess the hit groups would actually be the same between both sensors and then the offset would be the same?

I think I’m still struggling to understand what you mean here

The “one SBT entry per instance” case, is demonstrated in my OptiX 7 examples here:
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/nvlink_shared/src/Device.cpp#L1329
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/nvlink_shared/src/Device.cpp#L1392
That generates a hit record per instance per ray type which means the sbtOffsets inside the IAS are unique per instance and that means I can change the SBT contents like shader header and material index per instance at will without rebuilding the IAS.
That’s used inside the other examples there to exchange anyhit programs when the material toggles between cutout opacity.
In hindsight I shouldn’t have implemented cutout opacity because that makes everything slower in comparison to handling only opaque materials which don’t need anyhit programs at all.
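
As a rough illustration of the “one hit record per instance per ray type” layout described above, the following plain C++ sketch computes the per-instance sbtOffset and enumerates the record slots. This is not the code from the linked example; `HitRecordSlot`, `instanceSbtOffset`, and `buildSlots` are hypothetical names, and the sketch assumes each GAS contributes exactly one SBT record, so every instance owns `numRayTypes` consecutive slots.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one hit record slot. A real SBT record would
// hold the packed shader header plus per-instance data such as a material index.
struct HitRecordSlot {
    std::uint32_t instance;
    std::uint32_t rayType;
};

// sbtOffset to store in instance i when each instance owns numRayTypes records.
constexpr std::uint32_t instanceSbtOffset(std::uint32_t instance, std::uint32_t numRayTypes)
{
    return instance * numRayTypes;
}

// Enumerate all slots in SBT order: instance-major, ray type minor.
std::vector<HitRecordSlot> buildSlots(std::uint32_t numInstances, std::uint32_t numRayTypes)
{
    assert(numRayTypes > 0);
    std::vector<HitRecordSlot> slots;
    slots.reserve(static_cast<std::size_t>(numInstances) * numRayTypes);
    for (std::uint32_t instance = 0; instance < numInstances; ++instance)
        for (std::uint32_t rayType = 0; rayType < numRayTypes; ++rayType)
            slots.push_back({instance, rayType});
    return slots;
}
```

Because every instance has its own slots, the record at `instanceSbtOffset(i, numRayTypes) + rayType` can be rewritten for instance `i` alone, which is what allows exchanging shaders or data per instance without touching the IAS.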

In my particular case, I have a camera with 2 ray types (radiance and shadow) and another type of sensor that only has one ray type (which has a single closest hit shader and a miss shader). I think I’m confused how the sbtOffset can be constant for the IAS while still working for both of these SBTs.

Then you haven’t understood how the SBT index is calculated.
Please read this chapter inside the OptiX 7 programming guide:
https://raytracing-docs.nvidia.com/optix7/guide/index.html#shader_binding_table#shader-binding-table

The crucial formula calculating the effective SBT index per ray is in chapter 7.3.

It’s actually straightforward and I can’t explain it simpler than this:

You have an SBT layout for the pipeline with two ray types, where the instance sbtOffsets define the base of the hit record and the optixTrace call selects the SBToffset for the ray type (0 for radiance, 1 for shadow ray) with SBTstride set to 2 for the number of ray types.
(Listing 7.3 and 7.4 inside the SBT chapter of the programming guide show exactly that.)
The missSbtIndex is normally also selected via the ray type, but could be independent.

Then you can build a second SBT for the pipeline with only one ray type which uses the same instance sbtOffsets and now the optixTrace call just always sets the SBToffset to 0 and the SBTstride to 1 because there is only one ray type.
The number of hit records would need to match the SBT layout of the other pipeline.
In that case the instance sbtOffsets select the hit record the exact same way in both pipelines.

Easy. Done. (That is method 2. of the initial reply.)
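
The index arithmetic behind this can be checked with a small plain C++ sketch. The formula is the one from the SBT chapter of the Programming Guide (instance sbtOffset + GAS SBT index × SBTstride + SBToffset); the function names are hypothetical, and `reachableRecords` assumes each GAS has a single SBT record (GAS SBT index 0).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Effective hit record index for one ray:
//   instance sbtOffset + GAS SBT index * SBTstride + SBToffset
// SBTstride and SBToffset are the arguments passed to optixTrace.
constexpr std::uint32_t hitRecordIndex(std::uint32_t instanceSbtOffset,
                                       std::uint32_t gasSbtIndex,
                                       std::uint32_t traceSbtStride,
                                       std::uint32_t traceSbtOffset)
{
    return instanceSbtOffset + gasSbtIndex * traceSbtStride + traceSbtOffset;
}

// Hit records that a pipeline with numRayTypes ray types can reach for one
// instance, when SBTstride == numRayTypes and SBToffset == ray type.
std::vector<std::uint32_t> reachableRecords(std::uint32_t instanceSbtOffset,
                                            std::uint32_t numRayTypes)
{
    assert(numRayTypes > 0);
    std::vector<std::uint32_t> records;
    for (std::uint32_t rayType = 0; rayType < numRayTypes; ++rayType)
        records.push_back(hitRecordIndex(instanceSbtOffset, 0, numRayTypes, rayType));
    return records;
}
```

For an instance with sbtOffset 2, the two-ray-type pipeline reaches records 2 and 3, while the one-ray-type pipeline reaches only record 2. The second SBT therefore keeps the same number of records as the first, and the odd slots are simply never indexed.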

Now, if you have a ray type which doesn’t need a hit record at all, then the ray flags inside the optixTrace call can be set to never even reach the hit programs, like this for a shadow/visibility ray:
rayFlags = OPTIX_RAY_FLAG_DISABLE_ANYHIT | OPTIX_RAY_FLAG_DISABLE_CLOSESTHIT | OPTIX_RAY_FLAG_TERMINATE_ON_FIRST_HIT;
Docs here: https://raytracing-docs.nvidia.com/optix7/api/html/group__optix__types.html#ga50e4054620bfddb9b3e282d1a53e211b

If there is no hit, the miss shader is called and can set the visibility payload to true.
That way the RTX hardware traversal is not interrupted and this is the fastest method for visibility tests for opaque geometry.
Mind that for curves and custom geometric primitives there would still need to be the hit record entries inside the SBT since these also contain the intersection shader, just that there wouldn’t be a need for anyhit and closest hit shaders in that case.
With cutout opacity materials where some parts of the geometry are not intersected (e.g. based on some alpha texture), this won’t be possible and needs to be done with anyhit programs which can ignore intersections dynamically.