How to adapt optixTrace considering previous intersections

The ray-triangle intersection tests are really fast on RTX boards, and the BVH traversal also runs in hardware.
If your volume representation consists of tetrahedrons, I would simply try using that data directly.
The bounding volume hierarchy of the acceleration structure built on top of that data already takes care of finding the relevant primitives.
If you’re able to find the entry and exit points of a tetrahedron, the interpolation of its per-vertex values for marching doesn’t require any additional rays.
You might not even need to do that inside OptiX device code. If you use OptiX for finding all these tetrahedron entry and exit points along with the tetrahedron primitive index, everything else could also be calculated inside native CUDA kernels, which offer some more flexibility in how to use data, e.g. shared memory, which cannot be used inside OptiX device code.
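
To illustrate the interpolation step mentioned above, here is a minimal sketch in plain C++ (it could be ported to a CUDA kernel more or less as is). It assumes you already have the entry and exit points of one tetrahedron plus its four vertex positions and per-vertex scalar values. The names Vec3, barycentric, interpolate and integrateAlongSegment are made up for this example and are not part of OptiX or CUDA.

```cpp
#include <array>
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

static Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Six times the signed volume of the tetrahedron (a, b, c, d).
static double signedVolume6(Vec3 a, Vec3 b, Vec3 c, Vec3 d) {
    return dot(sub(b, a), cross(sub(c, a), sub(d, a)));
}

// Barycentric weights of point p with respect to the tetrahedron v[0..3].
// Each weight is the volume ratio obtained by replacing one vertex with p.
static std::array<double, 4> barycentric(const std::array<Vec3, 4>& v, Vec3 p) {
    const double vol = signedVolume6(v[0], v[1], v[2], v[3]);
    assert(vol != 0.0); // a degenerate (flat) tetrahedron has no valid weights
    return {signedVolume6(p, v[1], v[2], v[3]) / vol,
            signedVolume6(v[0], p, v[2], v[3]) / vol,
            signedVolume6(v[0], v[1], p, v[3]) / vol,
            signedVolume6(v[0], v[1], v[2], p) / vol};
}

// Linearly interpolates the per-vertex values at point p.
static double interpolate(const std::array<Vec3, 4>& v,
                          const std::array<double, 4>& values, Vec3 p) {
    const std::array<double, 4> w = barycentric(v, p);
    return w[0] * values[0] + w[1] * values[1] + w[2] * values[2] + w[3] * values[3];
}

// Marches from entry to exit with `steps` midpoint samples and returns the
// integral of the interpolated value over the segment. No rays are involved.
static double integrateAlongSegment(const std::array<Vec3, 4>& v,
                                    const std::array<double, 4>& values,
                                    Vec3 entry, Vec3 exitPoint, int steps) {
    assert(steps > 0);
    const Vec3 d = sub(exitPoint, entry);
    const double length = std::sqrt(dot(d, d));
    double sum = 0.0;
    for (int i = 0; i < steps; ++i) {
        const double t = (i + 0.5) / steps; // midpoint of the i-th sub-segment
        const Vec3 p = {entry.x + t * d.x, entry.y + t * d.y, entry.z + t * d.z};
        sum += interpolate(v, values, p);
    }
    return sum * length / steps;
}
```

Since the interpolated value is linear inside a tetrahedron, a plain integral like this only really needs the values at the entry and exit points. The marching loop is there for cases where you apply something non-linear per sample, like a transfer function.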

Using an epsilon offset is problematic with coplanar surfaces.
Read this thread, especially my last post in there: https://forums.developer.nvidia.com/t/radiation-physics-problems/42281/13
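
To make the coplanar problem concrete, here is a toy model in plain C++, not OptiX code. It assumes the faces along one ray are given only by their hit distances, and closestHitAfter is a hypothetical stand-in for a closest-hit query which reports the nearest face beyond tmin. Restarting the ray at the previous hit distance plus an epsilon skips the second of two faces lying at the same distance.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-in for a closest-hit query (hypothetical helper, not an OptiX call):
// returns the index of the nearest face with distance > tmin, or -1 if none.
static int closestHitAfter(const std::vector<double>& faceT, double tmin) {
    int best = -1;
    for (std::size_t i = 0; i < faceT.size(); ++i) {
        if (faceT[i] > tmin && (best < 0 || faceT[i] < faceT[best])) {
            best = static_cast<int>(i);
        }
    }
    return best;
}

// Walks along the ray by restarting just behind each hit (previous t + eps)
// and returns the indices of the faces that were actually reported.
static std::vector<int> walkWithEpsilon(const std::vector<double>& faceT, double eps) {
    std::vector<int> visited;
    double tmin = 0.0;
    for (;;) {
        const int hit = closestHitAfter(faceT, tmin);
        if (hit < 0) break;
        visited.push_back(hit);
        tmin = faceT[hit] + eps; // any face at the same distance is now skipped
    }
    return visited;
}
```

With face distances {1.0, 2.0, 2.0, 3.0} and an epsilon of 1e-4, the walk reports only three of the four faces. One of the two coplanar faces at 2.0 is never seen, and no choice of epsilon fixes that.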

Maybe also check these posts which discussed tetrahedrons in the past:
https://forums.developer.nvidia.com/search?q=tetrahedron%20%23visualization%3Aoptix