Slow ray-ray Intersection calculation

I am trying to calculate the points of intersection between around 4000 rays. By intersection, I mean that two rays count as intersecting if the closest distance between them is smaller than a set value.

I tried doing this by first constructing the rays as 8-sided “pillars” (ideally I would like cylinders, but these pillars are good enough for my purposes). So my acceleration structure consists of 4000 instances of these pillars. Then I launched the same 4000 rays to see where the rays and the “pillars” intersect.

The results are fine, but it is slow. According to Nsight Systems, my optixLaunch takes 15 milliseconds on an RTX 2060. I already set the build options to prefer fast trace. Are there any ways to speed this up, or is this the wrong approach for my problem?

If all you require is the minimal distance between skew lines, I don’t see a need to do that with a ray tracer.
That’s an O(N^2) algorithm you can also implement in a native CUDA kernel, which in your case would require ~8 million tests (half of the full test matrix).
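To make the brute-force idea concrete, here is a minimal host-side C++ sketch of the pairwise test, assuming the rays are treated as infinite lines. The names `Line`, `lineDistance`, and `countClosePairs` are hypothetical and not taken from any library. It uses the standard skew-line distance |(p2 - p1) · (d1 × d2)| / |d1 × d2|, with a point-to-line fallback for near-parallel directions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 { double x, y, z; };

static Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
static double length(Vec3 a) { return std::sqrt(dot(a, a)); }

// Hypothetical type: an infinite line through `origin` along `dir`.
// Assumption: `dir` is non-zero; it does not have to be normalized.
struct Line { Vec3 origin; Vec3 dir; };

// Shortest distance between two infinite lines.
double lineDistance(const Line& a, const Line& b) {
    const Vec3 w = sub(b.origin, a.origin);
    const Vec3 n = cross(a.dir, b.dir);  // common perpendicular direction
    const double nLen = length(n);
    const double scale = length(a.dir) * length(b.dir);
    if (nLen <= 1e-12 * scale) {
        // (Near-)parallel lines: distance from b.origin to line a.
        return length(cross(w, a.dir)) / length(a.dir);
    }
    return std::fabs(dot(w, n)) / nLen;
}

// Tests every unordered pair (i, j) with i < j, i.e. N * (N - 1) / 2 tests.
std::size_t countClosePairs(const std::vector<Line>& lines, double threshold) {
    assert(threshold >= 0.0);
    std::size_t count = 0;
    for (std::size_t i = 0; i < lines.size(); ++i) {
        for (std::size_t j = i + 1; j < lines.size(); ++j) {
            if (lineDistance(lines[i], lines[j]) < threshold) {
                ++count;
            }
        }
    }
    return count;
}
```

For N = 4000 the double loop runs 4000 * 3999 / 2 = 7,998,000 tests, which is the ~8 million mentioned above. In a CUDA kernel each (i, j) pair would typically map to one thread instead of one loop iteration.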

Here’s a link to the previous question about this, which has a link to the formulas:

Thanks. I was hoping that presenting the problem in a form the RT cores can solve would speed things up. Guess not.

The main problem for your example is that 4000 rays is far from saturating a modern GPU.

I’m assuming that these rays have finite length or you wouldn’t be able to build geometry around them.
If you did that with long, skinny triangles, that will result in huge axis-aligned bounding box (AABB) volumes per pillar if the rays are oriented diagonally.
That means there will be a lot of overlap between AABBs, so each ray ends up being intersection-tested against a lot of pillars even though they might be far away.
That’s similar to the O(N^2) algorithm then, because the BVH isn’t actually partitioning the pillars enough to help.
In that case you could try building each pillar from multiple triangle segments along the ray, using more segments the more diagonally the ray is oriented, to minimize the per-pillar AABB volume.
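The effect of splitting on the bounding boxes can be estimated with a small C++ sketch. It assumes each pillar piece is bounded by the AABB of its axis segment padded by the pillar radius on every side, which is a conservative simplification; `segmentAabbVolume` and `totalAabbVolume` are hypothetical helper names.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

// Volume of the AABB around the segment p..q, padded by `radius` on all
// sides (assumption: a conservative bound for a thin pillar of that radius).
double segmentAabbVolume(Vec3 p, Vec3 q, double radius) {
    const double dx = std::fabs(q.x - p.x) + 2.0 * radius;
    const double dy = std::fabs(q.y - p.y) + 2.0 * radius;
    const double dz = std::fabs(q.z - p.z) + 2.0 * radius;
    return dx * dy * dz;
}

// Sum of the AABB volumes when the segment p..q is cut into `pieces`
// equal parts, each with its own bounding box.
double totalAabbVolume(Vec3 p, Vec3 q, double radius, int pieces) {
    assert(pieces > 0);
    double total = 0.0;
    for (int i = 0; i < pieces; ++i) {
        const double t0 = static_cast<double>(i) / pieces;
        const double t1 = static_cast<double>(i + 1) / pieces;
        const Vec3 a{p.x + t0 * (q.x - p.x), p.y + t0 * (q.y - p.y), p.z + t0 * (q.z - p.z)};
        const Vec3 b{p.x + t1 * (q.x - p.x), p.y + t1 * (q.y - p.y), p.z + t1 * (q.z - p.z)};
        total += segmentAabbVolume(a, b, radius);
    }
    return total;
}
```

For a diagonal pillar from (0, 0, 0) to (10, 10, 10) with radius 0.5, one piece gives an 11 x 11 x 11 box (volume 1331), while ten pieces give ten 2 x 2 x 2 boxes (total volume 80). This only measures bounding-box volume; it says nothing about the extra primitives each ray then has to be tested against.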

Still, I would try a native CUDA kernel anyway, given the rather simple calculations for the skew line distance and, if you also need it, the closest point along the ray.
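If the closest point along each ray is needed as well, one possible sketch in C++ is shown below. It assumes half-infinite rays (parameter >= 0) rather than infinite lines, so it also checks the two boundary cases where the closest point sits at a ray origin. `Ray`, `ClosestResult`, and `closestBetweenRays` are hypothetical names, not part of OptiX or CUDA.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

static Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Hypothetical type: points on the ray are origin + t * dir with t >= 0.
struct Ray { Vec3 origin; Vec3 dir; };

// s and t are the ray parameters of the closest points on rays a and b.
struct ClosestResult { double s; double t; double distance; };

static double distanceAt(const Ray& a, double s, const Ray& b, double t) {
    const Vec3 pa{a.origin.x + s * a.dir.x, a.origin.y + s * a.dir.y, a.origin.z + s * a.dir.z};
    const Vec3 pb{b.origin.x + t * b.dir.x, b.origin.y + t * b.dir.y, b.origin.z + t * b.dir.z};
    const Vec3 d = sub(pa, pb);
    return std::sqrt(dot(d, d));
}

ClosestResult closestBetweenRays(const Ray& a, const Ray& b) {
    const Vec3 w = sub(a.origin, b.origin);
    const double A = dot(a.dir, a.dir);
    const double B = dot(a.dir, b.dir);
    const double C = dot(b.dir, b.dir);
    const double D = dot(a.dir, w);
    const double E = dot(b.dir, w);
    assert(A > 0.0 && C > 0.0);  // assumption: non-zero directions

    // Boundary case 1: closest point on ray b to the origin of ray a (s = 0).
    const double tEdge = std::max(0.0, E / C);
    ClosestResult best{0.0, tEdge, distanceAt(a, 0.0, b, tEdge)};

    // Boundary case 2: closest point on ray a to the origin of ray b (t = 0).
    const double sEdge = std::max(0.0, -D / A);
    const double dEdge = distanceAt(a, sEdge, b, 0.0);
    if (dEdge < best.distance) best = {sEdge, 0.0, dEdge};

    // Interior case: the skew-line solution, valid only if both parameters
    // are non-negative and the rays are not (near-)parallel.
    const double denom = A * C - B * B;
    if (denom > 1e-12 * A * C) {
        const double s = (B * E - C * D) / denom;
        const double t = (A * E - B * D) / denom;
        if (s >= 0.0 && t >= 0.0) {
            const double d = distanceAt(a, s, b, t);
            if (d < best.distance) best = {s, t, d};
        }
    }
    return best;
}
```

The point on ray a is then `a.origin + s * a.dir`. For rays of finite length, both parameters would additionally need an upper clamp, which this sketch leaves out.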

Yeah, I just tried pillars with many segments to reduce the AABBs of the triangles, and it doesn’t seem to help. Increasing the number of segments eventually led to an increase in time taken, so back to CUDA kernels for me.

Yeah, when running rays through their own pillars, splitting into more segments will definitely increase the intersection tests per ray as well. Catch-22.
