If OptiX reports errors like that "invalid value", please set a log callback function and enable the validation mode like this:
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/MDL_renderer/src/Device.cpp#L296
That will normally result in a better explanation of which value is incorrect.
I always use __forceinline__ __device__ for all my OptiX device code functions which are not program entry points or callable programs. That way the compiler will do the right thing. __inline__ is not enough.
In your code, getRayDirection() should be declared that way.
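A small sketch of what that can look like. I don't know the actual signature of your getRayDirection(), so the pinhole version below is only a hypothetical stand-in, and RT_FUNCTION is just an illustrative macro name. The host fallback is there so the helper also compiles outside of nvcc.

```cpp
#include <cmath>

// RT_FUNCTION is only an illustrative macro name, pick whatever fits your code base.
#if defined(__CUDACC__)
#define RT_FUNCTION __forceinline__ __device__
#else
#define RT_FUNCTION inline // Host fallback so the helper also compiles outside of nvcc.
#endif

struct Vec3
{
  float x, y, z;
};

// Hypothetical stand-in for getRayDirection(): pinhole camera direction
// normalize(u * U + v * V + W) for screen coordinates u, v in [-1, 1].
RT_FUNCTION Vec3 getRayDirection(const float u, const float v, const Vec3 U, const Vec3 V, const Vec3 W)
{
  const Vec3 d = { u * U.x + v * V.x + W.x,
                   u * U.y + v * V.y + W.y,
                   u * U.z + v * V.z + W.z };
  const float invLength = 1.0f / std::sqrt(d.x * d.x + d.y * d.y + d.z * d.z);
  return { d.x * invLength, d.y * invLength, d.z * invLength };
}
```

Program entry points (extern "C" __global__ functions named __raygen__..., __closesthit__... and so on) and callable programs keep their own declarations, the macro is only meant for the helper functions they call.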
However, I don’t really understand what I should do given the information you gave me regarding pipelineCompileOptions.traversableGraphFlags.
OptiX supports more render graph hierarchies than DXR and Vulkan RT.
It allows tracing rays against
- a single geometry acceleration structure (GAS), as used in many OptiX SDK examples. (None of my examples do that.)
- a two-level structure with one instance AS (IAS) over many GAS, which is the fastest option on RTX boards and the only structure DXR and Vulkan RT support,
- a multi-level structure with more than one IAS above the bottom-level GAS. (This is useful if you need to instance whole sub-models, but it’s limited in depth.)
For each of these three cases, OptiX provides a matching traversableGraphFlags value.
In the above order these are:
OPTIX_TRAVERSABLE_GRAPH_FLAG_ALLOW_SINGLE_GAS,
OPTIX_TRAVERSABLE_GRAPH_FLAG_ALLOW_SINGLE_LEVEL_INSTANCING,
OPTIX_TRAVERSABLE_GRAPH_FLAG_ALLOW_ANY.
For performance, it’s recommended to use the IAS->GAS render graph structure (traversableGraphFlags = OPTIX_TRAVERSABLE_GRAPH_FLAG_ALLOW_SINGLE_LEVEL_INSTANCING), even if you only have a single GAS.
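On the host side that boils down to one assignment. This is only a fragment and needs the OptiX SDK headers; the function name is made up for illustration, and all other fields are left zero-initialized here and must be filled in to match your device code.

```cpp
#include <optix.h>

// Illustrative helper name, not part of the OptiX API.
OptixPipelineCompileOptions makePipelineCompileOptions()
{
  OptixPipelineCompileOptions pipelineCompileOptions = {};

  // One IAS over one or more GAS.
  pipelineCompileOptions.traversableGraphFlags = OPTIX_TRAVERSABLE_GRAPH_FLAG_ALLOW_SINGLE_LEVEL_INSTANCING;

  // The remaining fields (numPayloadValues, numAttributeValues, exceptionFlags,
  // pipelineLaunchParamsVariableName, usesPrimitiveTypeFlags, ...) depend on your application.
  return pipelineCompileOptions;
}
```

The same OptixPipelineCompileOptions have to be used for the module compilation and the pipeline creation.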
Simply put an IAS above it which contains a single OptixInstance with an identity transform and the GAS traversable handle as child, then use the IAS traversable handle as the argument in the optixTrace calls.
That simple graph does not need any transforms from object space to world space inside the device code, because with the identity transform object space == world space. It is also the fastest render graph on RTX boards, because BVH traversal through it is fully hardware accelerated, and with built-in triangle primitives the ray-triangle intersection is as well.
When adding more GAS in the future, you already have the proper render graph layout and can easily add more OptixInstances to that top-level IAS.
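To illustrate the identity transform: OptixInstance::transform is a float[12] holding a 3x4 row-major object-to-world matrix. The sketch below uses the same layout in plain C++ (Transform3x4, Point and the function names are my own illustrative names) and shows that a point is unchanged by it. The OptixInstance setup itself is only given as comments because it needs the SDK headers; gasHandle is a placeholder for your GAS traversable handle.

```cpp
#include <array>

// 3x4 row-major object-to-world matrix, same layout as OptixInstance::transform.
using Transform3x4 = std::array<float, 12>;

inline Transform3x4 identityTransform()
{
  return { 1.0f, 0.0f, 0.0f, 0.0f,
           0.0f, 1.0f, 0.0f, 0.0f,
           0.0f, 0.0f, 1.0f, 0.0f };
}

struct Point
{
  float x, y, z;
};

// Applies the affine transform to a point (the fourth column is the translation).
inline Point transformPoint(const Transform3x4& m, const Point p)
{
  return { m[0] * p.x + m[1] * p.y + m[ 2] * p.z + m[ 3],
           m[4] * p.x + m[5] * p.y + m[ 6] * p.z + m[ 7],
           m[8] * p.x + m[9] * p.y + m[10] * p.z + m[11] };
}

// With the OptiX headers the single instance would be filled roughly like this:
//   OptixInstance instance = {};
//   const Transform3x4 identity = identityTransform();
//   memcpy(instance.transform, identity.data(), sizeof(float) * 12);
//   instance.instanceId        = 0;
//   instance.visibilityMask    = 255;
//   instance.sbtOffset         = 0;
//   instance.flags             = OPTIX_INSTANCE_FLAG_NONE;
//   instance.traversableHandle = gasHandle; // The GAS is the child.
// Then build the IAS from that instance and pass the IAS handle to optixTrace.
```

Additional GAS later only mean more OptixInstance entries in the same array, each with its own transform and traversableHandle.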
Example code which shows that is inside the intro examples here:
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/intro_runtime/src/Application.cpp#L1596
Later examples use a very simple host-side scene graph with arbitrary depth which is traversed to flatten it to an IAS->GAS render graph.
For the fastest example inside that repository, please look at rtigo12.
All my examples implement a pinhole camera (the “lens shaders” are implemented as direct callable programs)
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/rtigo12/shaders/raygeneration.cu#L336
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/rtigo12/shaders/lens_shader.cu#L40
and a runtime generated plane geometry with selectable tessellation:
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/rtigo12/src/Plane.cpp#L37
Inside the intro examples these are built directly and assigned to an OptixInstance:
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/intro_runtime/src/Application.cpp#L1600
Inside the later examples these are put into the host-side scene graph and automatically instanced when the same geometry was built before:
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/rtigo12/src/Application.cpp#L1842
(If you use code blocks for all posted code (the </> icon in the toolbar, preformatted text, Ctrl+E), that will preserve the formatting.)