Let me put it clearly: your idea of ray tracing triangles in 2D using any 3D ray casting SDK is not going to work.
That is like hitting a 3D triangle edge-on, which is ill-defined because of finite floating-point accuracy and everything David explained.
Whatever the outcome is, it is implementation-dependent and not a suitable solution.
The approach you suggested looks very interesting but it doesn’t seem cost-effective - 6 representative triangles for one triangle.
If you want to use the ray tracing hardware for finding intersection points between the 2D triangle outlines and a 2D ray, extruding the 2D edges into 3D instead is a feasible solution when using 3D ray casting SDKs.
Representing each edge with two triangles will result in the fastest ray-primitive intersection performance on RTX boards because the ray-triangle intersection will be handled by the hardware RT cores.
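To make the extrusion concrete, here is a minimal host-side sketch of how each 2D edge could be turned into a quad made of two triangles. The names `Edge2D`, `ExtrudedMesh`, `extrudeEdges` and the `halfHeight` parameter are made up for illustration and are not part of any SDK. It assumes the 2D scene lies in the z = 0 plane and that the rays are shot inside that plane, so the quads are placed to straddle it.

```cpp
#include <cstdint>
#include <vector>

struct Float2 { float x, y; };
struct Float3 { float x, y, z; };
struct Edge2D { Float2 a, b; };              // one 2D outline edge (hypothetical type)
struct Triangle { uint32_t i0, i1, i2; };    // indices into the vertex array

struct ExtrudedMesh {
    std::vector<Float3>   vertices;          // 4 vertices per edge
    std::vector<Triangle> triangles;         // 2 triangles per edge
};

// Extrude every 2D edge along z into a quad spanning [-halfHeight, +halfHeight].
// Rays travelling in the z = 0 plane then hit the quad's interior, not an edge.
ExtrudedMesh extrudeEdges(const std::vector<Edge2D>& edges, float halfHeight = 1.0f)
{
    ExtrudedMesh mesh;
    mesh.vertices.reserve(edges.size() * 4);
    mesh.triangles.reserve(edges.size() * 2);

    for (const Edge2D& e : edges) {
        const uint32_t base = static_cast<uint32_t>(mesh.vertices.size());

        mesh.vertices.push_back({e.a.x, e.a.y, -halfHeight});  // base + 0
        mesh.vertices.push_back({e.b.x, e.b.y, -halfHeight});  // base + 1
        mesh.vertices.push_back({e.b.x, e.b.y, +halfHeight});  // base + 2
        mesh.vertices.push_back({e.a.x, e.a.y, +halfHeight});  // base + 3

        mesh.triangles.push_back({base + 0, base + 1, base + 2});
        mesh.triangles.push_back({base + 0, base + 2, base + 3});
    }
    return mesh;
}
```

With this layout each edge costs four `Float3` vertices and six 32-bit indices before the acceleration structure is built, which is the number to multiply by your edge count when estimating memory. The resulting arrays would then be uploaded as an ordinary indexed triangle mesh.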
If you’re concerned about memory consumption, how many 2D edges do you need to support and how many rays do you need to shoot into the scene to find intersections?
It would also be possible to implement a custom intersection program that takes your 2D line data and extrudes it into a quad. That would only require two instead of four vertices per edge, but the axis-aligned bounding box around each primitive would still need the same information, and ray-custom-primitive intersections run on the streaming multiprocessors, which is slower than the hardware ray-triangle intersections running on the RT cores. Since you know that the intersection is 2D only, you could optimize that custom intersection program a little, though.
And I’m also not sure if this approach aligns with our needs yet.
What exactly are your requirements then?
If you insist on handling this in 2D, you could also simply not use any ray casting SDK, calculate the 2D ray-line intersections yourself, and completely parallelize that with native CUDA kernels.
Determining a 2D ray-line intersection is simple linear algebra, and with some additional spatial structure on top, like a quadtree or a 2D BVH, you could reduce the number of required ray-line intersection tests.
Done right, that should be the most memory-efficient solution and would still be accelerated by the GPU streaming multiprocessors (and should be faster than the above-mentioned custom primitives method).
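For reference, the linear algebra behind a single 2D ray-segment test can be sketched as below. This is plain C++ with invented names (`intersectRaySegment`, `cross2`, the `epsilon` tolerance); the same function body should be usable inside a CUDA kernel if it is marked `__host__ __device__`, but that part and any quadtree or BVH traversal around it are left out. The ray is `o + t * d` with `t >= 0`, the segment is `a + u * (b - a)` with `u` in `[0, 1]`.

```cpp
#include <cmath>

struct Float2 { float x, y; };

// 2D cross product (z component of the 3D cross product).
inline float cross2(Float2 p, Float2 q) { return p.x * q.y - p.y * q.x; }

// Intersect the ray o + t * d (t >= 0) with the segment from a to b.
// Returns true and writes the ray parameter to tHit when they intersect.
// Parallel and collinear cases are reported as a miss in this sketch.
bool intersectRaySegment(Float2 o, Float2 d, Float2 a, Float2 b,
                         float& tHit, float epsilon = 1e-8f)
{
    const Float2 e  = {b.x - a.x, b.y - a.y};   // segment direction
    const Float2 ao = {a.x - o.x, a.y - o.y};   // ray origin to segment start

    const float denom = cross2(d, e);
    if (std::fabs(denom) < epsilon) {
        return false;                           // ray and segment are parallel
    }

    const float t = cross2(ao, e) / denom;      // distance along the ray
    const float u = cross2(ao, d) / denom;      // position along the segment

    if (t < 0.0f || u < 0.0f || u > 1.0f) {
        return false;                           // behind the origin or off the segment
    }

    tHit = t;
    return true;
}
```

A brute-force kernel would run this test for every ray against every edge and keep the smallest `t`; the spatial structure only serves to cut down how many edges each ray has to test.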
This is not the first time this exact topic has been discussed on this forum, and extruding edges into two triangles perpendicular to the 2D plane is simply the first option when using 3D ray casting SDKs for that.
https://forums.developer.nvidia.com/t/how-does-optix-code-compilation-work/218678/17
https://forums.developer.nvidia.com/t/using-optix7-in-2d-space/182505