In general it is beneficial for BVH traversal performance to have convergent ray loads.
The crucial question is how much traversal time you actually save with sorted rays; that saving is the maximum time you can spend on sorting just to break even.
That time became a lot shorter with the RT cores on Turing, because the hardware BVH traversal is very fast. Divergence when shading different surface hits is also a factor.
Still, divergent rays will access more memory. On the other hand, manually sorting rays adds extra work and memory traffic between launches.
This means all of it is GPU- and scene-dependent and would need to be measured individually.
As a start, you could simply partition rays into eight buckets by the ray direction octant, defined by the signs of the direction components.
The single ray programming model in OptiX is meant to allow internal scheduling to be changed.
What you describe is effectively the wavefront tracing used by the OptiX Prime API, which was discontinued with OptiX 7. (There is an optixRaycasting SDK example instead.)
At a maximum of 10 GRays/s on high-end Turing GPUs, reading and writing the ray queries and hit records alone would run into memory bandwidth limits (you could only transfer around 64 bytes per ray at that rate), and that is before doing any shading work.
It should be more efficient to keep the RT cores busy across multiple ray segments using the built-in continuation mechanisms provided by the OptiX program domains.