In my project I have some “thin glass” objects (e.g. a window’s glass). In this case I don’t want to cast another (refraction) ray; instead I want to traverse the intersections manually in order to compute the filtered color.
I want to use the “Any hit radiance” callback and store the intermediate intersections somewhere. Within the function “closest_hit_radiance” I examine the material; if it is “thin glass” or has an IOR of ~1.0, I’d like to sort the intersections in order to locate the nearest opaque object, and then compute the glass-filtered color.
I hope to limit the call stack size (i.e. not cast another ray unless strictly necessary). This is already a problem for me: reflections, refractions, … => many rays.
Here is the question: what is the “best way” to store the intermediate intersections (stack size, memory footprint, speed, …), and what is the fastest way to sort them?
At first you say you’re using any hit. Then you say you’re using closest hit. Are they for the same material? And if you want the nearest opaque object, a closest hit program will give you exactly that. There’s no need to “sort the intersections in order” if you use a closest hit.
If you want to limit the call stack size and relaunch rays, you can do so by launching them from the original entry point kernel. That is, instead of launching another ray from the closest_hit program, store the intersection point and direction in the ray payload. When control returns from the closest_hit program to the original entry point kernel, use the information in the payload to set up and launch a new ray. This keeps stack use to a minimum. I think this is what you’re trying to do?
I’m not sure whether there is a real “maximum”; I have never tested it. Personally I try to keep the stack as small as possible. If it’s too big it will likely degrade performance.
Storing the information of a single intersection doesn’t take much memory. It can be overwritten by subsequent intersections after relaunching. Launching iteratively rather than recursively (as I mentioned in my previous post) will save A LOT of stack memory.
I just want to chime in to second over0219’s suggestion of storing intersection points in the ray payload, with iterative ray casting in the original calling kernel. This is what the OptiX team generally recommends to customers.
The material object question answers itself when you look at which object types are valid in
rtVariableSetObject(RTvariable v, RTobject object): “The concrete type of object can be one of RTbuffer, RTtexturesampler, RTgroup, RTprogram, RTselector, RTgeometrygroup, or RTtransform.”
That means you do not have access to the RTmaterial object inside the programs at all.
Your idea of storing data for every hit in the anyhit program has a flaw: you wouldn’t know which of the hits is the one that should skip the call to rtIgnoreIntersection so that it actually reaches the closest hit program.
Try the iterative approach, as in a path tracer, which handles that ordering automatically.
The limit is your graphics board memory. The stack eats into it, along with all the other data that needs to be stored across recursions.
Yes, Yes, (and Yes).
The smaller the stack, the more memory you have for other things. The stack is allocated per thread, so the total gets big on powerful boards. Smaller stack sizes might also run faster. Prefer iterative algorithms over recursive ones to reduce the required stack size.