Understanding optixTransformNormalFromObjectToWorldSpace

Hi,

I am using OptiX 8.0 and I am trying to figure out whether my understanding of the optixTransformNormalFromObjectToWorldSpace utility function is correct.

The guide says:

Transforms the normal using object-to-world transformation matrix resulting from the current active transformation list.

So my issue here is the concept of the “current transformation list”. I have no idea what that would be when using optixTraverse.
I am only intersecting triangle meshes, with many BLASes and one TLAS, allowing instancing.
Instance matrices are passed to the transform field of the OptixInstance objects.

It works for primary rays. However, if I use optixTraverse, then I need to deal with the concept of incoming and outgoing hit objects, which I do by calling the optixHitObject* family of functions to get the data I want once the traversal returns.

So, my understanding is that there’s no equivalent of optixTransformNormalFromObjectToWorldSpace for “outgoing” hit objects?
If that is the case, what would be the easiest way to do the same thing being sure it uses the correct instance transform, provided that I am still able to fetch the right object-space normal?

I guess it has to do with the idea of the current transform list?

Thanks

The optixTransformNormalFromObjectToWorldSpace helper function is one of the general-purpose functions that handle any transform list size OptiX supports (see the Limits chapter inside the OptiX Programming Guide on the maximum traversable depth; it’s 31 today).

These helpers build the effective transform matrices by iterating over the transform list with
optixGetTransformListSize and optixGetTransformListHandle(index) and concatenating the respective transform types in the right order, for both the transforms and their inverses. (This gets a little involved with motion transforms.)
The code doing that is inside the optix_device_impl.h and optix_device_impl_transformations.h headers.
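For the simple non-motion case, the matrix concatenation those helpers perform boils down to multiplying 3x4 affine matrices while walking the list from the root down to the GAS. Here is a hypothetical host-side sketch of just that math (Affine3x4 and concat are illustrative names, not OptiX API), using the same 3x4 row-major layout as OptixInstance::transform:

```cpp
#include <array>

// A 3x4 row-major affine transform, same layout as OptixInstance::transform
// (the implicit bottom row is 0 0 0 1).
using Affine3x4 = std::array<float, 12>;

// Concatenate two affine transforms: result = a * b (apply b first, then a),
// which is the order used when walking the transform list from IAS to GAS.
Affine3x4 concat(const Affine3x4& a, const Affine3x4& b)
{
    Affine3x4 r{};
    for (int row = 0; row < 3; ++row)
        for (int col = 0; col < 4; ++col)
        {
            float s = 0.0f;
            for (int k = 0; k < 3; ++k)
                s += a[row * 4 + k] * b[k * 4 + col];
            if (col == 3)            // translation column: add a's translation,
                s += a[row * 4 + 3]; // since b's bottom row is (0, 0, 0, 1)
            r[row * 4 + col] = s;
        }
    return r;
}
```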

So, my understanding is that there’s no equivalent of optixTransformNormalFromObjectToWorldSpace for “outgoing” hit objects?

Yes. Since these helpers do not use the hit-object variants (optixHitObjectGetTransformListSize and optixHitObjectGetTransformListHandle(index)), they do not work on the outgoing hit object.

If you’re only using an IAS->GAS structure, the transform list is exactly one entry and it’s an instance transform type! That simplifies the transformation handling a lot.
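In that single-instance-transform case, the world-to-object matrix is just the inverse of the one 3x4 instance matrix (the SDK headers also expose inverse variants of the “TransformFromHandle” device functions for this). A hypothetical host-side sketch of that affine inversion, again illustrative names rather than OptiX API:

```cpp
#include <array>

// 3x4 row-major affine transform, same layout as OptixInstance::transform.
using Affine3x4 = std::array<float, 12>;

// Invert an affine 3x4 matrix: invert the upper-left 3x3 via the adjugate
// and determinant, then invert the translation as -R_inv * t.
Affine3x4 invertAffine(const Affine3x4& m)
{
    const float a = m[0], b = m[1], c = m[2];
    const float d = m[4], e = m[5], f = m[6];
    const float g = m[8], h = m[9], i = m[10];

    const float A =  (e * i - f * h);
    const float B = -(d * i - f * g);
    const float C =  (d * h - e * g);
    const float s = 1.0f / (a * A + b * B + c * C); // 1 / determinant

    Affine3x4 r{};
    r[0] = A * s;  r[1] = -(b * i - c * h) * s;  r[2]  =  (b * f - c * e) * s;
    r[4] = B * s;  r[5] =  (a * i - c * g) * s;  r[6]  = -(a * f - c * d) * s;
    r[8] = C * s;  r[9] = -(a * h - b * g) * s;  r[10] =  (a * e - b * d) * s;

    // Inverse translation column.
    r[3]  = -(r[0] * m[3] + r[1] * m[7] + r[2]  * m[11]);
    r[7]  = -(r[4] * m[3] + r[5] * m[7] + r[6]  * m[11]);
    r[11] = -(r[8] * m[3] + r[9] * m[7] + r[10] * m[11]);
    return r;
}
```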

Getting the transform of a hit primitive then can use the hardcoded index 0 in optixGetTransformListHandle(0) like here:
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/MDL_renderer/shaders/hit.cu#L118
and you know exactly that it’s an instance transform handle type, so getting the matrix can use optixGetInstanceTransformFromHandle without checking the type. (Look at what other “TransformFromHandle” device functions exist.) Then use the respective transform functions:
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/MDL_renderer/shaders/transform.h
which are effectively the same operations as inside the point, vector, normal transform functions at the end of the OptiX SDK 8.0.0\include\internal\optix_device_impl_transformations.h header, just with different arguments.
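The normal transform itself is simply a multiplication by the transpose of the world-to-object (inverse) matrix, with the translation ignored. A minimal host-side sketch of that math (Vec3 and transformNormal are illustrative names, not OptiX API):

```cpp
// Minimal 3-component vector for the sketch.
struct Vec3 { float x, y, z; };

// Transform a normal by the transpose of the 3x4 row-major world-to-object
// matrix invM (e.g. the inverse instance transform). Normals use the
// inverse-transpose so they stay perpendicular under non-uniform scaling;
// the translation column invM[3], invM[7], invM[11] does not apply.
Vec3 transformNormal(const float invM[12], const Vec3& n)
{
    return Vec3{
        invM[0] * n.x + invM[4] * n.y + invM[8]  * n.z,
        invM[1] * n.x + invM[5] * n.y + invM[9]  * n.z,
        invM[2] * n.x + invM[6] * n.y + invM[10] * n.z
    };
}
```

(Renormalize the result afterwards if the transform contains any scaling.)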

I use these because I want to fetch the transformation matrices exactly once and reuse them in multiple transform calls, which the helper functions don’t do. (The compiler hopefully optimizes that away.)

However, if I use optixTraverse, then I need to deal with the concept of incoming and outgoing hit objects.

Could you please explain what you’re implementing by using optixTraverse and in what program domains you’re calling it?

After optixTraverse returns, there exists an “outgoing” hit/miss record. So if you’re using that to implement Shader Execution Reordering (SER) by calling optixTraverse, optixReorder, and optixInvoke inside the ray generation program, there is no need to call any of the optixHitObject functions at all.
Looks like this: https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/MDL_renderer/shaders/raygeneration.cu#L183
or like the OptiX SDK optixPathTracer example, which also shows how to use optixTraverse for shadow rays.

If you’re replacing the hit records with the optixMakeHitObject functions, you would have retrieved data from some hit object with the optixHitObjectGet* functions, or from the current transform list, before filling it into your new hit object.

For the transforms you would need to provide the transform list size and an array of the transform handles via the transforms and numTransforms arguments of optixMakeHitObject.
https://raytracing-docs.nvidia.com/optix8/guide/index.html#shader_execution_reordering#hit-objects

You get the transform list size and handles from the incoming hit object with optixHitObjectGetTransformListSize and optixHitObjectGetTransformListHandle(index).

In the case of an IAS->GAS graph, that would be numTransforms = 1 and only one transform handle (for the OptixInstance matrix) in that array.

So if you want to use optixTraverse and hit objects manually, you would need to implement the transformation routines on top of that, either following my simple transform.h version for the IAS->GAS case, or porting the more complete helper functions inside optix_device_impl_transformations.h to use the hit-object functions to retrieve that data.

Again, I’m not sure what you’re trying to implement and in what program domain.
Do you have a version of the algorithm which works with optixTrace calls?
Some code about what you’re planning would be helpful.

Hi, and thank you for your answer.

Could you please explain what you’re implementing by using optixTraverse and in what program domains you’re calling it?

It’s an iterative ray/path tracer.

So, from a raygen program, I call optixTrace so that a miss or a hit program is called.
From within the hit program, there’s something like:

do {
    // shade based on current_hit
    // decide where the next ray goes
    current_hit = traceRay(new_ray, SECONDARY_RAY); // that's where optixTraverse is called
    // current_hit must now contain relevant data extracted from the SBT entry
    // of the **outgoing hit object**, including **normals** and **barycentrics**.

    // [...check for miss, increase ray depth]
} while (max_depth is not reached);

I hope I clarified.
However, I solved it as you suggested.
In the function I call to fetch the transformed normal, I now specify whether it comes from a primary ray (incoming hit) or a secondary ray (outgoing hit object).

In the first case, I fetch the object-space normals, interpolate using optixGetTriangleBarycentrics, and then use optixTransformNormalFromObjectToWorldSpace.

In the second case, once I have the SBT pointer, I “manually” calculate the barycentrics (as I understood that optixGetTriangleBarycentrics also refers only to the incoming hit), fetch the object-space normal, interpolate it with the barycentrics, then call optixHitObjectGetInstanceIndex() so that I can get the traversable handle via optixHitObjectGetTransformListHandle,
and finally call optix_impl::optixTransformNormal(inverse_transform, interpolated_normal);
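For reference, the barycentric interpolation step is plain math and independent of OptiX. A small sketch (Vec3 and interpolateNormal are illustrative names), assuming the usual (1 - u - v, u, v) weighting that the optixGetTriangleBarycentrics convention implies:

```cpp
// Minimal 3-component vector for the sketch.
struct Vec3 { float x, y, z; };

// Interpolate per-vertex object-space normals n0, n1, n2 at barycentrics
// (u, v): result = (1 - u - v) * n0 + u * n1 + v * n2.
Vec3 interpolateNormal(const Vec3& n0, const Vec3& n1, const Vec3& n2,
                       float u, float v)
{
    const float w = 1.0f - u - v;
    return Vec3{
        w * n0.x + u * n1.x + v * n2.x,
        w * n0.y + u * n1.y + v * n2.y,
        w * n0.z + u * n1.z + v * n2.z
    };
}
```

(The interpolated normal is generally not unit length and should be renormalized after the object-to-world transform.)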

Visually, and by “debugging” the calculated normals and barycentrics, it looks correct now.

Thanks

Hmm, that optixTraverse usage is an interesting approach but what are the benefits you see?
Is that faster than a standard iterative unidirectional path tracer running the path loop inside the raygen program?
(Like in the link where I use shader execution reordering.)

The only benefit you get is that it’s much easier to understand, and it maps more directly to what you can read in books or other learning resources (obviously, in my very modest opinion/experience).

Performance-wise*, I have never tried to run the loop inside the raygen.
However, for higher performance (10%–30% faster depending on the scene) within the same application, I also implemented the same path tracer calling optixLaunch once per ray bounce.
That way, there’s only one optixTrace call in the raygen and no need to deal with those outgoing objects.
Obviously, in the closest-hit program this requires explicitly setting rays as inactive and/or storing the next ray for the following traversal, which makes things slightly harder to understand (from the POV of someone else reading the code).
But like I said, by doing so it is faster (possibly due to better handling of warp divergence).
I will try to see whether moving the loop inside the raygen improves things further.

What I am building must support both ways, and even mix them, e.g., doing MIS by shooting rays towards lights using optixTraverse while using optixTrace for material-based sampling and path tracing.

*I am using a Titan V, so SER amounts to a no-op and there’s no dedicated HW for ray traversal; sooner or later I hope to try an RTX GPU to see what happens to that performance gap.