Ray Mesh Intersections


Looking for some advice.

I am interested in an application that simply returns all intersections of a given ray with a number of meshes. I am just starting out with OptiX and trying to wrap my head around it.

Is it the right tool for the job here? It seems like I would be using a very low-level, small part of OptiX (versus switching to something written in CUDA by hand), and perhaps I would not be leveraging the advantages that OptiX brings. I can get the results I need using Embree, and this seems somewhat similar to OptiX Prime, which now appears to be deprecated.

Assuming OptiX is an appropriate tool, how would I go about finding all intersections? My reading around suggests using the any-hit program and a pointer in the ray payload to store intersection results (the t value and the ID of the mesh), since the required intersection data would in general exceed the allowance of 8 registers for the payload.

I’ve not managed to find any examples of using pointers in the payload, even though the docs suggest it is possible.

So my questions are:

  1. Is OptiX the right solution for the given problem?
  2. Is there an example somewhere of storing data in an array through a pointer in the ray payload?
  3. Even better… is there an example somewhere which already solves my problem of finding all ray-mesh intersections?



Hi there, hopefully I can help.

OptiX is a great choice for pure ray-casting, and I’d say you would be leveraging the main advantage of OptiX, which is the GPU hardware acceleration of traversal and triangle intersection. That said, if Embree gives you what you need, it’s worth considering whether it’s adequate for the job and whether OptiX gives you anything you want beyond that. The main thing OptiX can give you is speed, but that speed comes with a few tradeoffs: the GPU engineering is a little more involved, and there is a cost to getting data to and from the GPU.

If you’re using OptiX 7, which we recommend, you can look at the optixRayCasting sample as an example of how to do OptiX Prime-like ray-casting in the new API. The nice benefit of this sample is that it leverages RT cores, while OptiX Prime does not.

To find all the intersections along a ray, you can either use an any-hit program or you can “re-launch” the ray every time you find the next closest hit. I’m not sure which will be preferable in your case; it’s probably worth testing both if performance is a high priority. It can be faster to re-launch the ray iteratively until you miss than it is to use an any-hit program. This is because any-hit programs and ray traversal run on two different sub-units of the GPU, so there’s a little bit of overhead to calling any-hit programs in the middle of traversal. On the other hand, when you want all intersections iteratively, you’re only replacing any-hit calls with closest-hit calls, so it might be a wash in terms of performance.

One big difference between any-hit and ray re-launching is that any-hit calls will give you an unsorted list of hits, while re-launching will give you all the hits in depth order. Think about whether this is important to you, since having to sort can add some expense.
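To make the any-hit side of that concrete, here is a minimal host-side C++ sketch of the bookkeeping, not actual OptiX device code. `HitBuffer`, `recordAnyHit` and `sortByDepth` are hypothetical names for illustration. In a real any-hit program the record step would typically be followed by `optixIgnoreIntersection()` so that traversal keeps going and reports the remaining hits; the sort would then run once per ray after the trace returns.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// One intersection: distance along the ray and the ID of the mesh that was hit.
struct Hit {
    float    t;
    uint32_t meshId;
};

// Hypothetical fixed-capacity per-ray buffer that an any-hit program would fill.
struct HitBuffer {
    std::vector<Hit> records;      // preallocated storage
    uint32_t         count   = 0;  // number of hits stored so far
    uint32_t         dropped = 0;  // hits that did not fit in the buffer
};

HitBuffer makeHitBuffer(uint32_t capacity) {
    HitBuffer buf;
    buf.records.resize(capacity);
    return buf;
}

// Stand-in for the body of an any-hit program: store the hit if there is room.
// Hits arrive in traversal order, which is generally not depth order.
void recordAnyHit(HitBuffer& buf, float t, uint32_t meshId) {
    if (buf.count < buf.records.size()) {
        buf.records[buf.count++] = Hit{t, meshId};
    } else {
        ++buf.dropped;  // assumption: overflow is counted rather than handled
    }
}

// Sort the stored hits front to back, breaking ties by mesh ID.
void sortByDepth(HitBuffer& buf) {
    std::sort(buf.records.begin(), buf.records.begin() + buf.count,
              [](const Hit& a, const Hit& b) {
                  if (a.t != b.t) return a.t < b.t;
                  return a.meshId < b.meshId;
              });
}
```

The buffer has to be sized for the worst case per ray (or overflow has to be handled somehow), which is one of the costs the re-launch approach avoids.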

If you are going to send your list of hits to the CPU for processing, that will probably end up dominating your run time. The general recommendation is to find ways to do the reduction of hits as early as possible, and to keep and process as much data on the GPU as possible, to avoid the overheads of data transfers.

Putting pointers in your payload works just fine and there’s nothing tricky about using a pointer. The main issues are how to index the buffer that you’re pointing to, and how to make sure you’re minimizing bandwidth. The recommended approach is to use small payloads and avoid pointers if possible, just to keep global memory bandwidth to a minimum, but having and using a pointer in your payload is fairly trivial to do, no different than any other C or C++ pointers.
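As a rough illustration of the pointer mechanics: payload values are 32 bits each, so a 64-bit pointer is usually split across two of them. The sketch below is plain C++ that runs on the host; `HitRecord`, `HitList`, `packPointer`, `unpackPointer` and `appendHit` are illustrative names, not part of the OptiX API. In device code the two halves would be passed as payload arguments to `optixTrace` and read back in the hit program with `optixGetPayload_0()` and `optixGetPayload_1()`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-hit record: distance along the ray and a mesh ID.
struct HitRecord {
    float    t;
    uint32_t meshId;
};

// Hypothetical per-ray hit list that the payload pointer refers to.
struct HitList {
    HitRecord* records;   // storage reserved for this ray
    uint32_t   capacity;  // number of records that fit in 'records'
    uint32_t   count;     // number of records written so far
};

// Split a 64-bit pointer into two 32-bit payload values (high half first).
inline void packPointer(const void* ptr, uint32_t& p0, uint32_t& p1) {
    const uint64_t bits = static_cast<uint64_t>(reinterpret_cast<uintptr_t>(ptr));
    p0 = static_cast<uint32_t>(bits >> 32);
    p1 = static_cast<uint32_t>(bits & 0xFFFFFFFFu);
}

// Rebuild the pointer from the two payload values.
inline void* unpackPointer(uint32_t p0, uint32_t p1) {
    const uint64_t bits = (static_cast<uint64_t>(p0) << 32) | p1;
    return reinterpret_cast<void*>(static_cast<uintptr_t>(bits));
}

// What a hit program might do after unpacking: append one record if it fits.
inline bool appendHit(HitList* list, float t, uint32_t meshId) {
    if (list->count >= list->capacity) return false;
    list->records[list->count++] = HitRecord{t, meshId};
    return true;
}
```

Indexing is the part that needs thought: in this sketch each ray is assumed to own its own `HitList`, so no atomics are needed when appending.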

You can pretty easily take one of our SDK samples and tweak it to produce all the ray-mesh intersections. For example, start with optixPathTracer or optixCutouts and look at the loop in the ray-gen program. If you skip updating the ray direction, but move the new ray origin to the hit point right after calling optixTrace, that will iterate through all hit points along the ray. This would be how to do the ‘re-launching’ strategy I talked about. Otherwise, for an any-hit strategy, all you need is an any-hit program that stores its results.
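The shape of that re-launch loop can be sketched on the CPU as follows. This is only a stand-in under stated assumptions: the "meshes" are spheres, `traceClosest` is a hypothetical brute-force replacement for `optixTrace` plus a closest-hit program, and the small `eps` offset plays the role of the usual self-intersection avoidance. None of these names come from the SDK samples.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };
inline Vec3  operator+(Vec3 a, Vec3 b)  { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3  operator-(Vec3 a, Vec3 b)  { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3  operator*(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
inline float dot(Vec3 a, Vec3 b)        { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Sphere { Vec3 center; float radius; uint32_t meshId; };  // stand-in "mesh"
struct Hit    { float t; uint32_t meshId; };

// Hypothetical stand-in for optixTrace + closest-hit: finds the nearest
// intersection with t in (tmin, tmax). 'dir' is assumed to be normalized.
bool traceClosest(const std::vector<Sphere>& scene, Vec3 origin, Vec3 dir,
                  float tmin, float tmax, Hit& out) {
    bool found = false;
    out.t = tmax;
    for (const Sphere& s : scene) {
        const Vec3  oc   = origin - s.center;
        const float b    = dot(oc, dir);
        const float c    = dot(oc, oc) - s.radius * s.radius;
        const float disc = b * b - c;
        if (disc < 0.0f) continue;  // ray misses this sphere
        const float root     = std::sqrt(disc);
        const float roots[2] = {-b - root, -b + root};  // entry and exit
        for (float t : roots) {
            if (t > tmin && t < out.t) {
                out.t = t;
                out.meshId = s.meshId;
                found = true;
            }
        }
    }
    return found;
}

// Re-launch loop: keep the direction, move the origin to each hit point,
// and accumulate the distance so every hit is reported relative to the start.
std::vector<Hit> allHitsByRelaunch(const std::vector<Sphere>& scene, Vec3 origin,
                                   Vec3 dir, float tmax, std::size_t maxHits) {
    std::vector<Hit> hits;
    const float eps = 1e-4f;  // assumption: offset that skips the surface just hit
    float travelled = 0.0f;
    Hit h;
    while (hits.size() < maxHits &&
           traceClosest(scene, origin, dir, eps, tmax - travelled, h)) {
        travelled += h.t;
        hits.push_back({travelled, h.meshId});
        origin = origin + dir * h.t;  // continue from the hit point
    }
    return hits;
}
```

The hits come out already in depth order, so no sort is needed afterwards. `maxHits` is just a safety cap for the sketch.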

Does that help clarify things a little?




Your reply is almost as fast as the raytracing. Thanks! The ordering of the ray intersections is actually necessary (I was planning on just sorting them after casting), so re-launching sounds like the best option. I have been looking at the samples you mention, and I am going to have to put some decent time in to actually understand the code.

I had better do that before asking more questions.

Briefly though, what is the general mechanism for getting the “ID” of the mesh being intersected? Is there a unique identifier for a given mesh accessible from the programs at all? I see the primitive index of the triangle can be returned, but then presumably there would have to be an extra step of figuring out which mesh it belongs to by its location in the list of triangles.

Great reply, thanks!


Getting the IDs can be simple or slightly involved, depending on how your scene is set up. The easiest way to get a mesh ID in OptiX 7 is to put each mesh in a separate GAS, and then call either optixGetInstanceIndex() or optixGetInstanceId(). The difference is subtle, but in short, optixGetInstanceIndex() gives you the implicit index, whereas optixGetInstanceId() gives you an explicit ID that you provide in the instance. If you aren’t using instances, then to recover the mesh ID you could do something like store it in a buffer that you can look up by primitive ID.
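For the no-instances case, here is a small host-side sketch of what such a lookup could look like, assuming all meshes are concatenated into one triangle list so that the primitive index (what `optixGetPrimitiveIndex()` returns in a hit program) runs across mesh boundaries. The function names are made up for illustration. Two variants are shown: a per-primitive table, and a more compact table of per-mesh start offsets searched with `std::upper_bound`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Variant 1: one entry per triangle; entry i is the ID of the mesh owning triangle i.
std::vector<uint32_t> buildPrimitiveToMesh(const std::vector<uint32_t>& trianglesPerMesh) {
    std::vector<uint32_t> table;
    for (uint32_t mesh = 0; mesh < trianglesPerMesh.size(); ++mesh) {
        table.insert(table.end(), static_cast<std::size_t>(trianglesPerMesh[mesh]), mesh);
    }
    return table;
}

// Variant 2: one entry per mesh; entry m is the index of the first triangle of mesh m.
std::vector<uint32_t> buildMeshOffsets(const std::vector<uint32_t>& trianglesPerMesh) {
    std::vector<uint32_t> offsets;
    uint32_t next = 0;
    for (uint32_t count : trianglesPerMesh) {
        offsets.push_back(next);
        next += count;
    }
    return offsets;
}

// Find the last mesh whose first triangle index is <= primitiveIndex.
// Assumes 'offsets' is non-empty and primitiveIndex is a valid triangle index.
uint32_t meshFromPrimitive(const std::vector<uint32_t>& offsets, uint32_t primitiveIndex) {
    const auto it = std::upper_bound(offsets.begin(), offsets.end(), primitiveIndex);
    return static_cast<uint32_t>(it - offsets.begin()) - 1;
}
```

The per-primitive table costs one memory read per hit but more memory; the offset table is tiny but needs a short binary search. Either buffer would be uploaded once and read from the hit program.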




Previous forum discussions about getting all hits along a ray are listed below. Even if some talk about the older OptiX API before OptiX 7, the main algorithms are the same.

Note that if your models contain coplanar faces from different meshes, some additional care is needed so the iterative closest-hit approach does not miss one of them. The fast way of avoiding self-intersection, offsetting the ray after each surface hit, doesn’t work for coplanar faces: the offset skips past the second face and you would miss it.
The more robust but slower approach would be to use some per-mesh ID, like the user-defined instanceId field, to ignore identical hits, but that requires an any-hit program again, which makes things slower overall.
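One possible reading of that robust approach, sketched on the CPU with made-up names: restart each trace at the previous hit distance without an offset, and let a filter (the role the any-hit program would play) reject hits at that same distance whose mesh ID has already been reported. The candidate list stands in for whatever traversal would find along the ray, and exact equality of `t` for coplanar faces is an assumption of this sketch that may not hold with real floating-point intersection results.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

struct Hit { float t; uint32_t meshId; };

// Stand-in for traversal with an any-hit filter: returns the closest candidate
// with t >= tmin, ignoring candidates at exactly tmin whose mesh was already reported.
bool traceClosestFiltered(const std::vector<Hit>& candidates, float tmin,
                          const std::vector<uint32_t>& seenAtTmin, Hit& out) {
    bool found = false;
    for (const Hit& c : candidates) {
        if (c.t < tmin) continue;  // behind the current start distance
        const bool alreadyReported =
            c.t == tmin &&
            std::find(seenAtTmin.begin(), seenAtTmin.end(), c.meshId) != seenAtTmin.end();
        if (alreadyReported) continue;  // the any-hit program would ignore this hit
        if (!found || c.t < out.t) {
            out = c;
            found = true;
        }
    }
    return found;
}

// Iterate over all hits in depth order without an epsilon offset, so that
// coplanar faces belonging to different meshes are each reported once.
std::vector<Hit> allHitsRobust(const std::vector<Hit>& candidates) {
    std::vector<Hit> hits;
    std::vector<uint32_t> seen;  // mesh IDs already reported at distance tmin
    float tmin = 0.0f;
    Hit h;
    while (traceClosestFiltered(candidates, tmin, seen, h)) {
        if (h.t > tmin) seen.clear();  // moved on to a new distance
        seen.push_back(h.meshId);
        tmin = h.t;
        hits.push_back(h);
    }
    return hits;
}
```

Note that this also collapses two hits on the same mesh at the same distance (for example on a shared edge) into one, which may or may not be what you want.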

To put that into perspective, we’re talking about >10 GRays/sec on current RTX boards when only doing closest hit interactions on opaque materials. You need a lot of rays running in parallel to reach that.