More efficient to set tmax<0 or reduce query size in OptiX Prime?

I use OptiX Prime to shoot a 512x512 set of rays. Some of those will be hits and others misses. For every ray that hit, I then want to launch a secondary ray from its hit point in some new direction. The easy way from a coding perspective would be to reuse the same query and:

  1. For each ray hit, update the ray origin with the new hit point
  2. For each ray miss, set the ray tmin to 0 and tmax to a negative value

The resulting query could end up containing a lot of dummy rays (tmax < 0) that are guaranteed to miss. Will this cost significantly more than, say, using stream compaction to build a smaller query made up only of the rays that hit?
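For reference, the two steps above could be sketched on the host roughly as follows. This is plain C++ rather than device code. The Ray and Hit structs are assumed to mirror the RTP_BUFFER_FORMAT_RAY_ORIGIN_TMIN_DIRECTION_TMAX and RTP_BUFFER_FORMAT_HIT_T_TRIID_U_V layouts, and a negative t is assumed to mean a miss. The function name, the single shared newDir and the kEpsilon offset are made up for illustration and are not part of OptiX Prime.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed to match RTP_BUFFER_FORMAT_RAY_ORIGIN_TMIN_DIRECTION_TMAX.
struct Ray { float origin[3]; float tmin; float dir[3]; float tmax; };
// Assumed to match RTP_BUFFER_FORMAT_HIT_T_TRIID_U_V; t < 0 is taken to mean a miss.
struct Hit { float t; int triId; float u; float v; };

// Hypothetical helper: rewrites the primary rays in place into secondary rays.
// The query keeps its original size; rays that missed are only disabled.
void makeSecondaryRaysMasked(std::vector<Ray>& rays,
                             const std::vector<Hit>& hits,
                             const float newDir[3],
                             float tmax)
{
    assert(rays.size() == hits.size());
    const float kEpsilon = 1e-4f;  // illustrative offset to avoid self-intersection

    for (std::size_t i = 0; i < rays.size(); ++i) {
        Ray& r = rays[i];
        if (hits[i].t < 0.0f) {
            // Step 2: miss, so make the interval empty (tmax < tmin).
            r.tmin = 0.0f;
            r.tmax = -1.0f;
            continue;
        }
        // Step 1: hit, so move the origin to origin + t * dir, then redirect.
        for (int k = 0; k < 3; ++k) {
            r.origin[k] += hits[i].t * r.dir[k];
            r.dir[k] = newDir[k];
        }
        r.tmin = kEpsilon;
        r.tmax = tmax;
    }
}
```

In a real setup the same per-ray logic would run in a CUDA kernel over the device buffers, and the new direction would normally differ per ray.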

Thanks.

The smaller query is probably worth it if the time for stream compaction is neglected. Needs measuring, though. Also depends on how many misses there are (not worth it for a single miss), and whether misses are coherent in a big block, like an object against an empty background.

Thanks for the reply. In the meantime I did a bit of testing, and it looks like removing the missed rays from the list has a big performance cost. A typical case involved roughly 4 million ray launches (subdivided into 512x512 OptiX Prime queries) with around 1-1.5 million resulting hits. I then measured the impact of culling the ray list using thrust::remove_if(). Overall throughput went down by more than 30%, and that didn’t even include the secondary ray launch. Including the secondary ray launch on top of that reduced overall throughput by roughly 50%. By contrast, my original scheme of setting tmax<0 for missed rays reduced overall throughput by 25%. So it seems that shrinking the list of rays is not beneficial from a performance perspective. Maybe Thrust stream compaction is too inefficient? It had to use a zip iterator so that the list of rays and the list of hits would be compacted in tandem.
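To make the "in tandem" part concrete, here is a minimal host-side sketch of the compaction logic. It is not the Thrust code that was timed. It assumes the same ray and hit layouts as the OptiX Prime formats with t < 0 marking a miss, and compactHits and sourceIndex are hypothetical names. It also records where each surviving ray came from, since a compacted query no longer lines up with the 512x512 grid.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed to match RTP_BUFFER_FORMAT_RAY_ORIGIN_TMIN_DIRECTION_TMAX.
struct Ray { float origin[3]; float tmin; float dir[3]; float tmax; };
// Assumed to match RTP_BUFFER_FORMAT_HIT_T_TRIID_U_V; t < 0 is taken to mean a miss.
struct Hit { float t; int triId; float u; float v; };

// Hypothetical helper: stable compaction of rays and hits together.
// Entries whose hit is a miss are dropped from both lists.
// Returns, for each surviving entry, its index in the original query.
std::vector<std::size_t> compactHits(std::vector<Ray>& rays, std::vector<Hit>& hits)
{
    assert(rays.size() == hits.size());
    std::vector<std::size_t> sourceIndex;
    std::size_t out = 0;

    for (std::size_t i = 0; i < hits.size(); ++i) {
        if (hits[i].t < 0.0f) {
            continue;  // miss: not carried over
        }
        rays[out] = rays[i];  // the ray and its hit move together
        hits[out] = hits[i];
        sourceIndex.push_back(i);
        ++out;
    }
    rays.resize(out);
    hits.resize(out);
    return sourceIndex;
}
```

The returned indices are what would let secondary results be scattered back to the right pixels. That extra bookkeeping is part of the cost of the compacted approach, on top of the compaction pass itself.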

It sounds like you’ve shown that the thrust::remove_if() operation is pretty slow on its own compared to tracing rays. But what about the secondary launch in isolation? Just for clarity, can you provide just the times for the secondary launch with thrust::remove_if vs the secondary launch after setting tmax<0?

This is something I am very interested in as well. I am iterating over a fixed number of reflections. Each time through the loop some rays might escape and need to be ignored. So either I need to resize the ray buffer for each reflection iteration, or as the OP suggested, set the tmin/tmax to disable the ray.
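A rough sketch of the fixed-size variant of such a loop, under some loud assumptions: TraceFn is a hypothetical stand-in for running an OptiX Prime query, the struct layouts are assumed to match the OptiX Prime ray and hit formats with t < 0 meaning a miss, and the actual reflection direction is left out because it depends on the surface normal.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Assumed to match RTP_BUFFER_FORMAT_RAY_ORIGIN_TMIN_DIRECTION_TMAX.
struct Ray { float origin[3]; float tmin; float dir[3]; float tmax; };
// Assumed to match RTP_BUFFER_FORMAT_HIT_T_TRIID_U_V; t < 0 is taken to mean a miss.
struct Hit { float t; int triId; float u; float v; };

// Hypothetical stand-in for a query: fills hits[i] for rays[i].
using TraceFn = std::function<void(const std::vector<Ray>&, std::vector<Hit>&)>;

// Runs up to maxBounces iterations without ever resizing the ray buffer.
// Escaped rays are disabled with tmax < tmin and stay disabled.
// Returns how many rays are still alive after the last iteration.
std::size_t bounceLoopMasked(std::vector<Ray>& rays, int maxBounces, const TraceFn& trace)
{
    std::vector<Hit> hits(rays.size());
    std::size_t alive = rays.size();

    for (int bounce = 0; bounce < maxBounces && alive > 0; ++bounce) {
        trace(rays, hits);
        assert(hits.size() == rays.size());

        alive = 0;
        for (std::size_t i = 0; i < rays.size(); ++i) {
            Ray& r = rays[i];
            const bool disabled = r.tmax < r.tmin;
            if (disabled || hits[i].t < 0.0f) {
                r.tmin = 0.0f;   // escaped (or already dead): keep it disabled
                r.tmax = -1.0f;
                continue;
            }
            // Still alive: restart the ray from its hit point.
            // Computing the reflected direction is omitted here.
            for (int k = 0; k < 3; ++k) {
                r.origin[k] += hits[i].t * r.dir[k];
            }
            ++alive;
        }
    }
    return alive;
}
```

The buffer size and the ray-to-pixel mapping stay fixed for the whole loop, at the price of still submitting the disabled rays on every iteration.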

It would be great if OptiX Prime could recognise such a tmin/tmax setting and skip the ray completely.