What would be the best approach: continue ray only if it hits an object.

So I was wondering what the best/fastest approach here would be.

I shoot a ray, and it should continue only if it hits one of a given set of objects. If it hits one of those first, it should continue and then report back what else it hit (if anything).

I actually only care about hit/miss.
Basically, out of the rays I shoot, I only care about the ones that first go through any of a group of objects and then hit nothing else after.

I found that using any_hit, closest_hit and miss can vary a lot in terms of performance.

Hey @tjaenichen,

This depends on which GPUs you’re targeting and which version of OptiX you’re using. It also depends on what you mean by hitting a first group and then continuing; I don’t quite understand the details yet.

For OptiX version & GPU what matters is whether you’re using OptiX 6, whether you’re targeting any RTX hardware, and whether you’re using the GeometryTriangles API.

For your scene setup, is your first-hit group a static set of objects? Like, do all rays need to test against the same set of objects first before continuing?

For anything before OptiX 6, our advice for best performance with shadow rays has been to use any_hit and immediately terminate the ray in the any_hit program. You might use an any-hit program that lets hits on your first-hit group pass through but records them, and that terminates the ray on any other hit if the first-hit group was already hit. That would allow you to do this with a single ray cast rather than a hit followed by re-casting a second ray.
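The single-cast any-hit idea can be modelled on the CPU. What follows is a minimal C++ sketch, not OptiX device code: `Kind`, `Action`, `Payload`, `anyHit`, and `passesGroupThenClear` are hypothetical names made up for illustration. It assumes the criterion from the question (the ray crossed the first-hit group and hit nothing else) and that any-hit calls may arrive in any order along the ray. Under those assumptions it terminates on any hit outside the group, which is a slight variation on the advice above. In an OptiX 6 any-hit program, `Continue` would roughly correspond to calling rtIgnoreIntersection() and `Terminate` to rtTerminateRay().

```cpp
#include <vector>

// Hypothetical labels for this sketch; these are not OptiX types.
enum class Kind { Group, Other };          // Group = the "first-hit group"
enum class Action { Continue, Terminate }; // what the any-hit stand-in asks traversal to do

struct Payload {
    bool sawGroup = false;  // the ray crossed at least one first-hit-group object
    bool blocked  = false;  // the ray hit something outside that group
};

// Stand-in for an any-hit program: called once per candidate intersection.
Action anyHit(Payload& p, Kind k) {
    if (k == Kind::Group) {
        p.sawGroup = true;          // record the hit, let the ray pass through
        return Action::Continue;
    }
    p.blocked = true;               // any other hit disqualifies the ray
    return Action::Terminate;       // no need to look at further intersections
}

// Stand-in for traversal: feeds candidate hits (in arbitrary order) to anyHit.
bool passesGroupThenClear(const std::vector<Kind>& candidateHits) {
    Payload p;
    for (Kind k : candidateHits) {
        if (anyHit(p, k) == Action::Terminate) break;
    }
    return p.sawGroup && !p.blocked;
}
```

Because the result only depends on whether a group hit and a non-group hit exist, the order in which the candidate hits are visited does not change the answer in this model.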

For OptiX 6, the best-practices advice has changed, mostly for use with RTX hardware & GeometryTriangles. The fastest shadow rays are now done using closest_hit along with the rtTrace ray flag RT_RAY_FLAG_TERMINATE_ON_FIRST_HIT. Also in that case, you should disable any-hit completely by using one of the *_DISABLE_ANYHIT flags either on your instances or as a flag to rtTrace(). Any-hit programs are enabled by default, so you have to opt out if you don’t need them and don’t want to incur any cost at all. The more you can stick to the RT cores during traversal, the better, and the main way to do that is to set things up so that no programs are executed until after the hit is decided.

That OptiX 6 advice doesn’t apply so much to custom geometry that has an intersection program. In that case, the intersection program always has to run during traversal. If that’s what you have, a fourth option might be to put your hit/miss logic in the intersection code instead of any-hit…

For your first-hit group test, if it’s a single group then you might be able to use the visibility mask features of OptiX 6 to break your scene into the first-hit group and the second-hit group. Doing that will be faster than using CUDA code to test which set the hit is in.
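To make the mask idea concrete, here is a small CPU-side C++ model, offered as one possible reading of it rather than as OptiX code. It assumes the usual mask rule (a ray tests a group only when the ray's mask and the group's mask share at least one bit) and an 8-bit mask. The bit assignments `MASK_GROUP` and `MASK_OTHER`, the `Object` struct, and both functions are hypothetical. The query is answered with two hit/miss casts, one per mask, so no per-hit code has to decide which set an object belongs to.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical bit assignment for this sketch.
constexpr std::uint8_t MASK_GROUP = 0x01;  // the first-hit group
constexpr std::uint8_t MASK_OTHER = 0x02;  // everything else in the scene

// An object that the ray's line actually intersects, tagged with its group mask.
struct Object {
    std::uint8_t mask;
};

// Toy hit/miss trace: an object is only considered when the masks overlap.
bool traceHits(const std::vector<Object>& alongRay, std::uint8_t rayMask) {
    for (const Object& o : alongRay) {
        if ((rayMask & o.mask) != 0) return true;  // first visible hit is enough
    }
    return false;
}

// Ray qualifies if it hits the first-hit group and nothing outside it.
bool groupThenClear(const std::vector<Object>& alongRay) {
    return traceHits(alongRay, MASK_GROUP) && !traceHits(alongRay, MASK_OTHER);
}
```

In this model the second cast is skipped whenever the first one misses, so rays that never touch the first-hit group cost a single trace.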

Earlier this year at GTC I talked about these things and how to think about the relationship between RT cores and CUDA. There might be things in here that could help you, if you haven’t already listened: https://developer.nvidia.com/gtc/2019/video/S9768


David.

Hi David,

“…you’re using OptiX 6, whether you’re targeting any RTX hardware, and whether you’re using the GeometryTriangles API.”

Yes to all of that! All the scenery is static as well.

Just so you understand, I am not actually rendering anything. This is really about visibility of objects: can this thing see that, and in this particular case, can the ray shoot through this object, and what does it collide with after? (Sorry, not sure how precise I can be, NDA and all that.)

This was also a general “how to think in OptiX” question, so those are some great insights! I also thought about the problem some more and figured out a way to simplify the whole problem. But this type of question will come up again. (absorb, recast)

How costly are closest_hit vs. miss then? Technically I wouldn’t even need miss if I only set the payload when it actually hits something. Also I only provided any_hit when actually needed (which so far has been rare) but I try the flag where I can.

And no, I haven’t heard that talk yet. I’ll check it out, thanks a lot!

Cheers

Thorsten

Yes, understood about not rendering. Visibility-only queries are, generally speaking, a higher-performance workflow than shading & rendering, so you should be able to get some pretty good numbers! And definitely don’t worry or apologize for being vague; don’t describe anything here on the forum that flirts with your NDA in any way.

The cost of closest_hit and miss should, roughly speaking, be very similar. The main perf hits in those programs are how much memory is accessed, how many registers are used, what the payload size is, etc. If you can compute visibility without using an any-hit, and just set a single flag in your payload in either closest_hit or miss, that would probably be the ideal approach. My advice is just to prefer one of either closest-hit or miss over any-hit.
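The single-flag payload can be illustrated with a toy C++ model (again a CPU sketch with hypothetical names, not OptiX device code). It assumes the payload starts out as "miss", so only the closest-hit stand-in has to write to it and the miss stand-in can stay empty.

```cpp
#include <vector>

// Single-flag payload; the default value already encodes "miss".
struct Payload {
    bool hit = false;
};

// Stand-ins for the two programs. Only one of them needs to touch the payload.
void closestHit(Payload& p) { p.hit = true; }
void miss(Payload&) { /* nothing to do: the payload already says "miss" */ }

// Toy trace: 'intersections' holds the hit distances found along the ray.
Payload trace(const std::vector<float>& intersections) {
    Payload p;
    if (intersections.empty()) {
        miss(p);
    } else {
        closestHit(p);
    }
    return p;
}
```

Initialising the flag before the trace is what makes one of the two programs optional; which one to keep depends on which default is more convenient for the caller.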


David.

Excellent, thanks for clearing that up. Intuitively I’d have thought that closest would be slower than any. Again, thanks a lot for the insight!