Is there any way to use hardware-accelerated ray-triangle intersection directly in CUDA, without going through OptiX? This would be analogous to how tensor cores can be used directly in CUDA for small matrix multiplications; see the "Programmatic Access to Tensor Cores in CUDA 9.0" section of https://developer.nvidia.com/blog/programming-tensor-cores-cuda-9/.
It seems like this should at least be possible in an unsupported way using inline PTX (maybe via some reverse engineering of OptiX binaries).
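To be concrete about the operation in question, here is a minimal software sketch of a single ray-triangle test using the standard Möller-Trumbore method. It is plain C++ (it also compiles as CUDA host code), the names `Vec3` and `ray_triangle` are my own for illustration, and nothing here is taken from OptiX or reflects how the RT cores are actually implemented. It only shows the kind of per-ray, per-triangle call I would like to hand to the hardware.

```cpp
#include <cmath>

// Hypothetical minimal vector type, used only for this illustration.
struct Vec3 { float x, y, z; };

static Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Moller-Trumbore ray-triangle test.
// Returns true and writes the hit distance to *t if the ray
// orig + t * dir (t >= 0) hits the triangle (v0, v1, v2).
bool ray_triangle(Vec3 orig, Vec3 dir, Vec3 v0, Vec3 v1, Vec3 v2, float* t) {
    const float eps = 1e-7f;
    Vec3 e1 = sub(v1, v0);
    Vec3 e2 = sub(v2, v0);
    Vec3 p = cross(dir, e2);
    float det = dot(e1, p);
    if (std::fabs(det) < eps) return false;  // ray is parallel to the triangle plane
    float inv = 1.0f / det;
    Vec3 s = sub(orig, v0);
    float u = dot(s, p) * inv;               // first barycentric coordinate
    if (u < 0.0f || u > 1.0f) return false;
    Vec3 q = cross(s, e1);
    float v = dot(dir, q) * inv;             // second barycentric coordinate
    if (v < 0.0f || u + v > 1.0f) return false;
    float dist = dot(e2, q) * inv;
    if (dist < 0.0f) return false;           // triangle is behind the ray origin
    *t = dist;
    return true;
}
```

What I am asking for is a way to issue the equivalent of this call from an ordinary CUDA kernel and have it run on the RT cores.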
More generally, are any subcomponents of OptiX usable without the full pipeline? For example, can OptiX be used only to build an acceleration structure that is then traversed separately? Or can the RT cores be used from device code to traverse a custom BVH that was not generated by OptiX?
This is a similar idea to the question here, but that question is very old (2010).