Hi there, hopefully I can help.
OptiX is a great choice for pure ray-casting, and I’d say you would leverage the main advantage of OptiX, which is GPU hardware acceleration of traversal and triangle intersection. That said, if Embree gives you what you need, it’s worth asking whether it’s adequate for the job and whether OptiX gives you anything you want beyond that. The main thing OptiX can give you is speed, but that speed comes with a couple of tradeoffs: the GPU engineering is a little more involved, and there is a cost to getting data to and from the GPU.
If you’re using OptiX 7, which we recommend, you can look at the optixRayCasting sample as an example of how to do OptiX Prime-like ray-casting in the new API. A nice benefit of this sample is that it leverages RT cores, while OptiX Prime does not.
To find all the intersections along a ray, you can either use an any-hit program or “re-launch” the ray every time you find the next closest hit. I’m not sure which will be preferable in your case, so it’s probably worth testing both if performance is a high priority. It can be faster to re-launch the ray iteratively until you miss than it is to use an any-hit program. This is because any-hit programs and ray traversal run on two different sub-units of the GPU, so there’s a little bit of overhead to calling any-hit programs in the middle of traversal. On the other hand, when you want every intersection, re-launching only replaces each any-hit call with a closest-hit call, so it might be a wash in terms of performance.
One big difference between any-hit and ray re-launching is that any-hit calls will give you an unsorted list of hits, while re-launching will give you all the hits in depth order. Think about whether this is important to you, since having to sort can add some expense.
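To make the ordering difference concrete, here is a small CPU-side C++ sketch, not OptiX device code. `recordHit` is a hypothetical stand-in for what an any-hit program would do (append each intersection in whatever order traversal reports it), and `sortByDepth` is the extra step you only pay for if depth order matters. The `Hit` struct and both function names are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Illustrative hit record: distance along the ray and a primitive index.
struct Hit {
    float t;
    int   primId;
};

// Stand-in for an any-hit program: called once per intersection,
// in traversal order, which is not guaranteed to be depth order.
inline void recordHit(std::vector<Hit>& hits, float t, int primId) {
    assert(t >= 0.0f);  // hits behind the ray origin are not expected
    hits.push_back({t, primId});
}

// Extra pass needed only when the consumer requires front-to-back order.
inline void sortByDepth(std::vector<Hit>& hits) {
    std::sort(hits.begin(), hits.end(),
              [](const Hit& a, const Hit& b) { return a.t < b.t; });
}
```

In an actual any-hit program you would also call `optixIgnoreIntersection()` after recording the hit, so that traversal keeps going instead of accepting that hit and shortening the ray.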
If you are going to send your list of hits to the CPU for processing, that will probably end up dominating your run time. The general recommendation is to find ways to do the reduction of hits as early as possible, and to keep and process as much data on the GPU as possible, to avoid the overheads of data transfers.
Putting pointers in your payload works just fine and there’s nothing tricky about using a pointer. The main issues are how to index the buffer that you’re pointing to, and how to make sure you’re minimizing bandwidth. The recommended approach is to use small payloads and avoid pointers if possible, just to keep global memory bandwidth to a minimum, but having and using a pointer in your payload is fairly trivial to do, no different than any other C or C++ pointers.
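Payload values in OptiX are 32-bit, so a 64-bit pointer takes up two of them. Here is a minimal sketch of the usual split-and-rejoin pattern, written as plain C++ so it can be tried on the CPU. The `HitList` struct, its fixed capacity of 8, and the helper names are assumptions for illustration, not something taken from your code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-ray record that the payload pointer refers to.
struct HitList {
    int   count;
    float t[8];
};

// Split a pointer into two 32-bit halves, one per payload value.
inline void packPointer(void* ptr, uint32_t& hi, uint32_t& lo) {
    const uint64_t bits = reinterpret_cast<uintptr_t>(ptr);
    hi = static_cast<uint32_t>(bits >> 32);
    lo = static_cast<uint32_t>(bits & 0xffffffffu);
}

// Rebuild the pointer from the two halves.
inline void* unpackPointer(uint32_t hi, uint32_t lo) {
    const uint64_t bits = (static_cast<uint64_t>(hi) << 32) | lo;
    return reinterpret_cast<void*>(static_cast<uintptr_t>(bits));
}

// What a hit program would do with the recovered pointer.
// Hits beyond the fixed capacity are silently dropped in this sketch.
inline void appendHit(HitList* list, float t) {
    assert(list != nullptr);
    if (list->count < 8) {
        list->t[list->count++] = t;
    }
}
```

In device code, the two halves would be passed as two payload arguments to `optixTrace` and read back in the hit program with `optixGetPayload_0()` and `optixGetPayload_1()`. Every write through the pointer goes to global memory, which is the bandwidth cost mentioned above.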
You can pretty easily take one of our SDK samples and tweak it to produce all the ray-mesh intersections. For example, start with optixCutouts and look at the loop in the ray-gen program. If you skip updating the ray direction, but move the new ray origin to the hit point right after calling optixTrace, that will iterate through all hit points along the ray. This would be how to do the ‘re-launching’ strategy I talked about. Otherwise, for an any-hit strategy, all you need is an any-hit program that stores its results.
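The shape of that re-launching loop can be sketched on the CPU like this. It is only an illustration under simplifying assumptions: the scene is reduced to a list of surface positions along one fixed ray, and `traceClosest` is a hypothetical stand-in for `optixTrace` plus a closest-hit program. The names, the `tmin` value, and the `maxHits` cap are mine, not from the SDK sample.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for optixTrace + closest-hit: returns the distance from `origin`
// to the nearest surface farther than `tmin`, or a negative value on a miss.
// `surfaces` holds surface positions along a single fixed ray direction.
inline float traceClosest(const std::vector<float>& surfaces,
                          float origin, float tmin) {
    assert(tmin >= 0.0f);
    float best = -1.0f;
    for (float s : surfaces) {
        const float dist = s - origin;
        if (dist > tmin && (best < 0.0f || dist < best)) {
            best = dist;
        }
    }
    return best;
}

// Re-launch loop: keep the direction, move the origin to each hit point,
// and trace again until the ray misses or the cap is reached.
inline std::vector<float> collectAllHits(const std::vector<float>& surfaces,
                                         std::size_t maxHits) {
    const float tmin = 1e-4f;  // small offset so the same surface is not hit again
    std::vector<float> hits;
    float origin = 0.0f;
    while (hits.size() < maxHits) {
        const float dist = traceClosest(surfaces, origin, tmin);
        if (dist < 0.0f) {
            break;  // miss: no more surfaces along the ray
        }
        origin += dist;  // new origin is the hit point
        hits.push_back(origin);
    }
    return hits;
}
```

The hits come out in depth order with no sort. One thing to watch for with this approach is the `tmin` offset: surfaces closer together than that offset will be reported as a single hit.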
Does that help clarify things a little?