We are currently using Polygons for our geometries, hence we use a custom intersection test.
Comparing the results with our double-precision CPU engine, we found that single precision leads to quite a few misses. False positives are also possible, but I haven’t been able to evaluate those yet.
Depending on certain checks within the intersection test, a callback function using double values should be used instead.
In the long term, we want to switch to triangles to make use of the hardware-accelerated triangle intersection test. Since there is no way to modify that test, would an approach like this even be possible?
That callback would effectively be the custom intersection routine itself in OptiX 7.
The AABB you need to give to OptiX for a custom geometric primitive is defined with floats, but it can be generated from double precision vertex data by rounding the bounding box outward to float precision, so that it is guaranteed to cover the geometric primitive. The BVH traversal over these AABBs will happen on the hardware RT cores on RTX Turing boards.
For each AABB that is hit, OptiX will then call back into your custom geometric primitive intersection routine, which you can implement with double precision.
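As a rough host-side sketch of that outward rounding (not taken from the OptiX SDK): the types `Vec3d` and `FloatAabb` and the helper names below are hypothetical, `FloatAabb` simply mirrors the six-float min/max layout of `OptixAabb`, and the code assumes all coordinates lie within the finite float range.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>

// Hypothetical helper types for illustration, not part of the OptiX API.
struct Vec3d { double x, y, z; };
struct FloatAabb { float minX, minY, minZ, maxX, maxY, maxZ; };

// Largest float that is <= d (assumes d is within the finite float range).
inline float roundDown(double d)
{
    float f = static_cast<float>(d);  // typically rounds to nearest
    if (static_cast<double>(f) > d)   // rounded up, so step one float down
        f = std::nextafter(f, -std::numeric_limits<float>::infinity());
    return f;
}

// Smallest float that is >= d (same range assumption).
inline float roundUp(double d)
{
    float f = static_cast<float>(d);
    if (static_cast<double>(f) < d)   // rounded down, so step one float up
        f = std::nextafter(f, std::numeric_limits<float>::infinity());
    return f;
}

// Float AABB that conservatively encloses `count` (>= 1) double precision vertices.
inline FloatAabb conservativeAabb(const Vec3d* v, std::size_t count)
{
    Vec3d lo = v[0];
    Vec3d hi = v[0];
    for (std::size_t i = 1; i < count; ++i)
    {
        lo.x = std::min(lo.x, v[i].x); hi.x = std::max(hi.x, v[i].x);
        lo.y = std::min(lo.y, v[i].y); hi.y = std::max(hi.y, v[i].y);
        lo.z = std::min(lo.z, v[i].z); hi.z = std::max(hi.z, v[i].z);
    }
    return { roundDown(lo.x), roundDown(lo.y), roundDown(lo.z),
             roundUp(hi.x),   roundUp(hi.y),   roundUp(hi.z) };
}
```

The minimum corner is only ever moved down and the maximum corner only ever moved up, so the float box can be at most one float step larger than the exact double box on each side.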
The double precision intersection distance could be stored as an additional attribute, split across two of the eight 32-bit attribute registers in OptiX 7.
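A minimal sketch of that split, written as plain C++ so it can be checked on the host. The function names `splitDouble` and `joinDouble` are made up for this example. In device code the same thing could presumably be done with the CUDA intrinsics `__double2hiint`, `__double2loint` and `__hiloint2double`, passing the two halves as attribute arguments to `optixReportIntersection` and reading them back with two of the `optixGetAttribute_N()` functions.

```cpp
#include <cstdint>
#include <cstring>

// Split the 64-bit pattern of a double into two 32-bit words (hypothetical helper).
inline void splitDouble(double value, std::uint32_t& hi, std::uint32_t& lo)
{
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof(bits));  // bit-exact copy, no conversion
    hi = static_cast<std::uint32_t>(bits >> 32);
    lo = static_cast<std::uint32_t>(bits & 0xFFFFFFFFu);
}

// Reassemble the double from the two 32-bit words (hypothetical helper).
inline double joinDouble(std::uint32_t hi, std::uint32_t lo)
{
    const std::uint64_t bits = (static_cast<std::uint64_t>(hi) << 32) | lo;
    double value;
    std::memcpy(&value, &bits, sizeof(value));
    return value;
}
```

Because only the bit pattern is moved around, the round trip is lossless. The float `tmax` that OptiX itself tracks would still be a rounded version of the distance.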
Access to the original object space coordinates can be implemented via the user data that follows the SBT record header, so that you have the double precision position and other vertex attributes available.
If instance transforms are involved, those are also stored as float matrices in OptiX, so you would need double precision equivalents stored with your SBT records as well, or you could model everything in world space.
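To illustrate what such a double precision equivalent might look like, here is a small sketch that applies a 3x4 row-major matrix held in doubles, using the same element order as the float transform in `OptixInstance`. The type `Point3d` and the function `transformPoint` are hypothetical names, and how the matrix reaches your program (for example through the SBT record data) is left open.

```cpp
// Hypothetical point type for illustration.
struct Point3d { double x, y, z; };

// Apply a row-major 3x4 affine transform stored as 12 doubles:
// each row is [ r0 r1 r2 t ], i.e. rotation/scale followed by translation.
inline Point3d transformPoint(const double m[12], const Point3d& p)
{
    return {
        m[0] * p.x + m[1] * p.y + m[2]  * p.z + m[3],
        m[4] * p.x + m[5] * p.y + m[6]  * p.z + m[7],
        m[8] * p.x + m[9] * p.y + m[10] * p.z + m[11]
    };
}
```

An intersection program would typically need the inverse (world-to-object) matrix to bring the ray into object space, so in practice both directions would have to be stored in double precision.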
The issue with that approach is that double precision performance can vary greatly among different GPUs, with Volta being the fastest and Turing being slower in comparison. Turing is tuned for 32-bit floating point performance.
You can check that with this CUDA Driver API call on your GPU:
cuDeviceGetAttribute(&singleToDoublePrecisionPerfRatio, CU_DEVICE_ATTRIBUTE_SINGLE_TO_DOUBLE_PRECISION_PERF_RATIO, ordinal);
Please note that the OptiX internal triangle intersection implementation in software and hardware has taken care to do the internal math and floating point rounding such that ray-triangle intersection is “watertight”. This means you won’t have spurious misses (false negatives) in between two triangles that share an edge, or at a vertex that touches two or more triangles. This is not equivalent to using doubles, but a correct watertight implementation in float can be superior to a naive intersection done using doubles, if the problem you’re solving is incorrect misses in a surface that does not have holes.

I’m not suggesting anything is wrong with your intersector, since I don’t know anything about it, but I personally have written a triangle intersector using doubles that is not as good as the RTX float intersector. If you weren’t specifically rounding for watertightness in your float intersector, then you might find that RTX triangle intersection resolves some of the same issues your double precision intersector solves, and I suspect you will also see a decent speedup on pre-Turing hardware and an even bigger one on Turing hardware.