You cannot run existing deep learning inferencing networks inside an OptiX launch because OptiX controls the work scheduling.
“… applications cannot use shared memory, synchronizations, barriers, or other SM-thread-specific programming constructs in their programs supplied to OptiX.”
But any well-optimized CUDA inferencing kernel presumably makes heavy use of exactly those constructs.
That means you would either need to implement the inferencing inside an OptiX device program or do the inferencing outside of OptiX.
A possible option would be a wavefront rendering approach, for example:
1.) Generate a number of primary rays into CUDA buffers,
2.) launch that ray wavefront using OptiX,
3.) evaluate what you hit inside OptiX,
4.) store all data per ray required by the inferencing into CUDA buffers,
5.) run your deep learning inferencing on these CUDA buffers outside of OptiX,
6.) generate continuation rays depending on the outcome,
7.) goto 2.
In OptiX 7 all data transport happens via native CUDA buffers.
Inside the OptiX SDK 7.0.0 you can find an example named optixRaycasting which shows how to do steps 1.-4.
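To make the control flow of steps 1.-7. concrete, here is a minimal host-side sketch in plain C++. It runs on the CPU only and does not call OptiX or CUDA. All names in it (Ray, Hit, traceWavefront, runInference, generateContinuationRays, renderWavefront) are hypothetical stand-ins, and the "path of pixel p survives p % 4 bounces" rule is a toy assumption used only to give the paths different lengths. In a real application the vectors would be CUDA device buffers, traceWavefront would be an OptiX launch, and runInference would be your inferencing kernels.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-ray state. In a real renderer this lives in CUDA device buffers.
struct Ray {
    int pixel;  // pixel this path contributes to
    int depth;  // number of bounces so far
};

// Hypothetical per-ray record written in steps 3 and 4.
struct Hit {
    bool  valid;    // false when the ray missed or the path terminated
    float feature;  // stand-in for the per-ray inputs of the network
};

// Stand-in for steps 2-4 (the OptiX launch).
// Toy rule: the path of pixel p survives p % 4 bounces.
std::vector<Hit> traceWavefront(const std::vector<Ray>& rays) {
    std::vector<Hit> hits(rays.size());
    for (std::size_t i = 0; i < rays.size(); ++i) {
        hits[i].valid   = rays[i].depth < rays[i].pixel % 4;
        hits[i].feature = static_cast<float>(rays[i].depth);
    }
    return hits;
}

// Stand-in for step 5 (inferencing outside of OptiX).
std::vector<float> runInference(const std::vector<Hit>& hits) {
    std::vector<float> out(hits.size());
    for (std::size_t i = 0; i < hits.size(); ++i) {
        out[i] = hits[i].valid ? 0.5f : 0.0f;
    }
    return out;
}

// Step 6: build the next wavefront. Only surviving paths are copied,
// so the next launch contains no dead rays (a simple compaction).
std::vector<Ray> generateContinuationRays(const std::vector<Ray>& rays,
                                          const std::vector<Hit>& hits,
                                          const std::vector<float>& out) {
    assert(rays.size() == hits.size() && hits.size() == out.size());
    std::vector<Ray> next;
    for (std::size_t i = 0; i < rays.size(); ++i) {
        if (hits[i].valid && out[i] > 0.0f) {
            next.push_back(Ray{rays[i].pixel, rays[i].depth + 1});
        }
    }
    return next;
}

// Runs the loop and returns the wavefront size of every iteration.
std::vector<std::size_t> renderWavefront(int numPixels, int maxDepth) {
    std::vector<Ray> rays;
    for (int p = 0; p < numPixels; ++p) {
        rays.push_back(Ray{p, 0});  // step 1: primary rays
    }
    std::vector<std::size_t> sizes;
    for (int d = 0; d < maxDepth && !rays.empty(); ++d) {
        sizes.push_back(rays.size());
        const std::vector<Hit>   hits = traceWavefront(rays);                      // steps 2-4
        const std::vector<float> out  = runInference(hits);                        // step 5
        rays = generateContinuationRays(rays, hits, out);                          // step 6
    }                                                                              // step 7
    return sizes;
}
```

With 8 pixels and the toy rule above, renderWavefront(8, 10) returns the sizes 8, 6, 4, 2. The shrinking wavefront means later iterations pay the same per-launch cost for less and less work.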
Issues with this approach:
- Performance! How long do you expect that inferencing step to take?
- Memory accesses. These should be avoided as much as possible for optimal ray tracing performance. The RTX hardware can handle up to 10 GRays/s, but at that rate you can only read or write about 64 bytes/ray before memory bandwidth becomes the limit.
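- Occupancy. Not all ray paths have the same length, so later wavefronts contain fewer active rays. You would need to handle that, for example by compacting the remaining rays.
- Launch overhead. Every iteration needs at least one OptiX launch plus the launches of the inferencing kernels.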
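The bytes/ray budget from the memory accesses item is simple arithmetic: memory bandwidth divided by ray rate. Below is a small sketch of that calculation. The function names are hypothetical, and the 640 GB/s bandwidth is an illustrative assumption that matches the 10 GRays/s and 64 bytes/ray numbers, not a specification of any particular board.

```cpp
#include <cassert>

// Bytes each ray may read or write before memory bandwidth limits throughput.
double bytesPerRayBudget(double bandwidthBytesPerSec, double raysPerSec) {
    assert(raysPerSec > 0.0);
    return bandwidthBytesPerSec / raysPerSec;
}

// Upper bound on the ray rate when every ray moves bytesPerRay through memory.
double maxRaysPerSec(double bandwidthBytesPerSec, double bytesPerRay) {
    assert(bytesPerRay > 0.0);
    return bandwidthBytesPerSec / bytesPerRay;
}
```

For example, bytesPerRayBudget(640e9, 10e9) gives 64 bytes/ray, and maxRaysPerSec(640e9, 256.0) gives 2.5 GRays/s. So if the per-ray data for the inferencing grows to 256 bytes, memory traffic alone caps the rate at a quarter of the peak, before counting any time spent in the inferencing itself.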