How to use OptiX to trace a child instance

Hello, I tried to modify the examples of the OptiX wrapper library (OWL) so that only one child of the IAS is traced, but it failed: only the miss shader is invoked.

In samples/cmdline/s06-mixedGeometries, the host code contains:

  OWLGroup world = owlInstanceGroupCreate(context,2);

It builds an IAS and sets two children. I would like to know how I could trace only one of them in device code.

I modified the tracePath function in the device code:

OptixTraversableHandle child1 = optixGetInstanceTraversableFromIAS(, 1);

  OptixTraversableHandle child = optixGetInstanceChildFromHandle(;
  /* iterative version of recursion, up to depth 50 */
  for (int depth=0;depth<50;depth++) {
    prd.out.scatterEvent = rayDidntHitAnything;
    owl::traceRay(/*accel to trace against*/child1,
                  /*the ray to trace*/ ray,

It does not work.

Moving this over to the OptiX sub-forum.

Please read these two recent discussions about potential problems when starting optixTrace with other traversable handles than the root IAS:

What exactly is not working?

You don’t get the correct traversable handle you put into the OptixInstance on the host when calling optixGetInstanceTraversableFromIAS?
What are your OptixTraversableGraphFlags?

You don’t get any intersections? Mind that your ray is in world space coordinates inside the ray generation, closesthit and miss programs.
What the world space is, depends on which traversable handle is used inside optixTrace because that defines the transform list.

If you have specific issues with OWL, please keep discussing these inside the github repository with the original author.

Thanks a lot for your reply.
The hierarchy is GAS->identity transformation->IAS. The root node has two children: one is custom spheres, the other is boxes made of triangles.

Now the problem is that when I select one of the children of the root AS
and pass it to optixTrace,
the closest hit function is not triggered.

Is there any example that shows how to ray trace a child instance? Thank you very much!

I don’t know of an example which does this and I consider it a bad idea.

Why exactly would you need that?

The error in your code was that you used the wrong traversable handle argument in optixGetInstanceChildFromHandle:

OptixTraversableHandle child1 = optixGetInstanceTraversableFromIAS(, 1);
OptixTraversableHandle child  = optixGetInstanceChildFromHandle(; // BUG: This will return 0, and should have been child1 instead of

An optixTrace() call with traversableHandle == 0 argument will immediately call the miss program.

Again, it makes absolutely no sense inside an AS structure with only IAS->GAS for performance reasons because tracing directly against the GAS will be slower.

Here’s some example code how to get the necessary traversable handles in your case and comments about it:

    // DEBUG: only print for a single launch index.
    uint3 theLaunchIndex = optixGetLaunchIndex();
    if (theLaunchIndex.x == 256 && theLaunchIndex.y == 256)
    {
      // 1.) This is a bad idea.
      // Query which instance index we hit in some IAS. 
      // This is the index inside the bottom-most(!) IAS over the GAS.
      // Means you wouldn't actually know what IAS you're at inside a deeper hierarchy, unless you stored the instance traversable handle for some unique path to a GAS to be able to look up its children.
      // Actually the optixGetTransformListHandle() and optixGetTransformListSize() minus 1 and 2 can be used to query instances and IAS handles upwards.
      // Another issue is that you could reuse GAS under different instances and with different materials. 
      // So if this is, for example, required to do sub-surface scattering only inside the GAS, then the material assignment cannot be at instance level.
      // Also when adding scene lighting into the mix, you would need the current matrices at the GAS to be able to transform the ray back into the light geometry's world space again,
      // but only if you're in that local GAS system, so that condition also needs to be tracked inside the per ray payload.
      // In an IAS->GAS scene structure the IAS is always the topObject and the children are the GAS traversables.
      // This is a value in the range [0, m_instances.size()-1] in this example.
      const unsigned int     index    = optixGetInstanceIndex();
      OptixTraversableHandle instance = optixGetInstanceTraversableFromIAS(sysParameter.topObject, index);
      OptixTraversableHandle child    = optixGetInstanceChildFromHandle(instance);

      // 2.) What you actually want is probably just the GAS.
      // This is identical to "child" if the structure is IAS->GAS, but then this whole trace against a lower level AS doesn't make any sense for performance reasons.
      OptixTraversableHandle gas = optixGetGASTraversableHandle();

      // index is unsigned, so print it with %u; the handles are 64-bit unsigned values.
      printf("index %u: instance = %llu, child = %llu, gas = %llu\n", index, instance, child, gas);
    }

Then you would also need to store the two transform matrices in the per-ray payload so that the raygeneration program can transform the ray into the world space of the GAS. You would also have to track that you’re not tracing against the top-level traversable handle, or use a different ray type, to make sure you’re not recursing into that case again.
If this is about volume effects, you would need to be careful about scaling inside the instance matrices, because that would change the world space in which volume scattering and absorption coefficients are usually defined.

  // Put this into the PRD somehow.
  // You wouldn't need this if the instances all use the identity transform, which would mean the GAS data is actually in "world space".
  float4 objectToWorld[3];
  float4 worldToObject[3];

  // These functions getting the current transform matrix and its inverse work for any transform hierarchy.
  optix_impl::optixGetObjectToWorldTransformMatrix(objectToWorld[0], objectToWorld[1], objectToWorld[2]);
  optix_impl::optixGetWorldToObjectTransformMatrix(worldToObject[0], worldToObject[1], worldToObject[2]);
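
As a rough illustration of what the stored matrices would be used for, here is a minimal plain C++ sketch that applies a row-major 3x4 matrix (three float4 rows, as in the snippet above) to a ray origin and direction. The types Float3 and Float4 and the helpers transformPoint and transformVector are hypothetical stand-ins written for this sketch, not OptiX or OWL API.

```cpp
#include <cassert>

// Minimal stand-ins for the CUDA vector types so this compiles as plain C++.
struct Float3 { float x, y, z; };
struct Float4 { float x, y, z, w; };

// Transform a point by a row-major 3x4 matrix.
// Each row holds the linear part in xyz and the translation in w.
inline Float3 transformPoint(const Float4 m[3], const Float3& p)
{
  return { m[0].x * p.x + m[0].y * p.y + m[0].z * p.z + m[0].w,
           m[1].x * p.x + m[1].y * p.y + m[1].z * p.z + m[1].w,
           m[2].x * p.x + m[2].y * p.y + m[2].z * p.z + m[2].w };
}

// Transform a direction: same linear part, but the translation is ignored.
inline Float3 transformVector(const Float4 m[3], const Float3& v)
{
  return { m[0].x * v.x + m[0].y * v.y + m[0].z * v.z,
           m[1].x * v.x + m[1].y * v.y + m[1].z * v.z,
           m[2].x * v.x + m[2].y * v.y + m[2].z * v.z };
}
```

The ray origin would go through transformPoint and the ray direction through transformVector. With identity instance transforms both helpers return their input unchanged, which matches the comment that the matrices are not needed in that case.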


Hello droettger, thanks a lot for your reply and your code. It helped me a lot.
I would like to explain why I am trying to do this. Yes, in computer graphics it makes no sense to trace a child AS. However, I want to apply OptiX to radiation therapy simulation. In one radiation therapy scenario, called pencil beam simulation, there are about 10 to 20 thousand beams in one treatment plan. The whole target scene consists of the patient's CT voxels, but particles from one beam only influence the few voxels around the direction of that beam (roughly a cylinder). My concern is that if I build an IAS from all CT voxels, the particles will traverse many irrelevant voxels in each optixTrace, because the N in log(N) is huge. So my idea is to use cylinders to filter out the irrelevant voxels for each beam while building the scene, make each cylinder a child instance of the root, and then pass only the child node handle that matches the beam ID in the raygen shader. This might reduce the log(N).
Finally, thank you again for your help!
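
To put the log(N) argument into numbers, here is a rough C++ sketch. It assumes an idealized balanced binary tree over N primitives, which is only a model and not something OptiX guarantees for its BVHs. The helper idealTreeDepth and the voxel counts are hypothetical and chosen purely for illustration.

```cpp
#include <cassert>

// Depth of an idealized balanced binary tree over n leaves: ceil(log2(n)).
// ASSUMPTION: this is a simplified model, not how OptiX actually builds a BVH.
inline int idealTreeDepth(unsigned long long n)
{
  assert(n > 0);
  int depth = 0;
  while ((1ull << depth) < n)  // smallest depth with 2^depth >= n
    ++depth;
  return depth;
}

// Hypothetical sizes: a 512 x 512 x 256 CT volume and a 65,536 voxel subset.
inline int fullVolumeDepth() { return idealTreeDepth(512ull * 512ull * 256ull); }
inline int beamSubsetDepth() { return idealTreeDepth(65536ull); }
```

Under this model the full volume has depth 26 and the 1024 times smaller subset has depth 16, so the depth shrinks far less than the voxel count. Real traversal cost depends on much more than depth, so only a benchmark can tell how much a per-beam subset actually gains.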

Another problem with tracing against a GAS in a deeper hierarchy is that the OptixInstance sbtOffset is no longer available to the SBT index calculation formula. That cannot be compensated for with the optixTrace sbtOffset, because that is only 4 bits wide and meant to implement up to 16 different ray types, which you normally never need because you can also implement different behaviors inside the hit programs.
So you would be limited in how you design your shader binding table when doing that.
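
The SBT limitation can be made concrete with the hit record index formula from the OptiX Programming Guide. The plain C++ sketch below evaluates that formula for a hypothetical scene with two instances, one hit record each, and a single ray type. The function name sbtHitRecordIndex and all values are illustrative assumptions, not OptiX or OWL API.

```cpp
#include <cassert>

// Hit record index, following the formula in the OptiX Programming Guide:
//   sbt-index = sbt-instance-offset
//             + (sbt-geometry-acceleration-structure-index * sbt-stride-from-trace-call)
//             + sbt-offset-from-trace-call
inline unsigned int sbtHitRecordIndex(unsigned int instanceSbtOffset,
                                      unsigned int gasSbtIndex,
                                      unsigned int strideFromTrace,
                                      unsigned int offsetFromTrace)
{
  // As stated in the thread, the trace-call offset is only 4 bits wide (0..15).
  assert(offsetFromTrace < 16u);
  return instanceSbtOffset + gasSbtIndex * strideFromTrace + offsetFromTrace;
}

// Hypothetical scene: instance 1 has sbtOffset 1, one build input, one ray type.
// Tracing through the root IAS applies the instance offset.
inline unsigned int viaRootIas() { return sbtHitRecordIndex(1u, 0u, 1u, 0u); }

// Tracing directly against the GAS: no instance is visited, so its offset is 0.
inline unsigned int viaGasOnly() { return sbtHitRecordIndex(0u, 0u, 1u, 0u); }
```

In this setup the trace through the IAS selects record 1, while the direct GAS trace selects record 0, which belongs to the other instance. With more than 16 records to reach, the 4-bit trace offset alone could not make up the difference.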

Have you implemented your algorithm using only the top-level IAS traversable handle inside the optixTrace calls and benchmarked it?
Because if not, you’re doing premature optimization.

If your scene hierarchy is only IAS->GAS, that is the case for OPTIX_TRAVERSABLE_GRAPH_FLAG_ALLOW_SINGLE_LEVEL_INSTANCING, and traversal through that is fully hardware accelerated by the RT cores on RTX boards. No other scene hierarchy will reach the same number of rays/second!

The hardware BVH traversal inside the RT cores is meant to find only the AABBs of the geometric primitives the ray intersects with and the ray interval [tmin, tmax] gets smaller by each intersection. It’s really fast.

Are these 10 to 20 thousand beams also the number of rays you’re shooting per optixLaunch, or is each beam represented by many rays?
Because 20,000 rays wouldn’t even saturate current GPUs.

Yes, I have already implemented my algorithm using only the top-level IAS traversable handle, and OWL already sets the OPTIX_TRAVERSABLE_GRAPH_FLAG_ALLOW_SINGLE_LEVEL_INSTANCING flag by default.
There are about 10 to 20 thousand beams in one treatment plan, and in each beam about 1 to 20 thousand particles (photons, protons or heavy ions) are emitted by the nozzle.
In my simulation, each particle is represented by a ray.
Furthermore, I have already benchmarked the top-level IAS traversal. If I reduce CT voxels, the simulation speed will become much faster!
I will try your code and suggestions; they may bring a significant improvement in radiation therapy simulation!
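
For scale, the beam and particle counts above translate into the following number of primary rays per treatment plan. This is a small arithmetic sketch in C++ that assumes exactly one ray per particle and ignores any secondary rays; the helper totalRays is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Primary rays per treatment plan, ASSUMING one ray per particle.
inline std::uint64_t totalRays(std::uint64_t beams, std::uint64_t particlesPerBeam)
{
  return beams * particlesPerBeam;
}

// Lower and upper end of the ranges quoted in the post above.
inline std::uint64_t minRays() { return totalRays(10000u, 1000u); }
inline std::uint64_t maxRays() { return totalRays(20000u, 20000u); }
```

That gives between 10 million and 400 million rays per plan, far more than the 20,000 rays mentioned as too few to saturate a GPU.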

"If I reduce CT voxels, the simulation speed will become much faster!"

OK, so if I understand that correctly, your CT voxels are your geometric primitives and reducing their number increases the raytracing performance. Sounds reasonable.

Could you explain a little more what the geometric primitives actually are, how many of them there are and their spatial relationship?
Asking to avoid slow AS configurations like this:

Do you put each voxel under an individual OptixInstance?
If yes, could you combine them into fewer or a single GAS instead?
