Using Tensor Cores in OptiX

Hi, I am interested in using Tensor Cores in OptiX (mostly via wmma instructions). Is that possible?


I, too, would be interested in this. Ray tracing of shapes followed by a neural computation within the same megakernel would be a really exciting feature. That said, my guess is that the current answer is ‘no’, since wmma couples adjacent SIMD lanes in a way that isn’t allowed by the OptiX execution model.

That’s exactly right – the OptiX single-thread programming model prevents use of the CUDA intrinsics that communicate or synchronize across threads in a warp or block.

The primary reason is that OptiX is allowed to move work between lanes during execution, so you can no longer rely on your OptiX launch index to imply anything about which threads are your neighbors or what order they execute in.

From the OptiX Programming Guide:

“For efficiency and coherence, the NVIDIA OptiX 7 runtime—unlike CUDA kernels—allows the execution of one task, such as a single ray, to be moved at any point in time to a different lane, warp or streaming multiprocessor (SM). (See section “Kernel Focus” in the CUDA Toolkit Documentation.) Consequently, applications cannot use shared memory, synchronization, barriers, or other SM-thread-specific programming constructs in their programs supplied to OptiX.”

That said, you can put a dependent CUDA launch immediately after an OptiX launch; CUDA kernels and OptiX launches can be freely mixed on the same stream. This means you can run Tensor Core ops on the output of an OptiX launch. You just can't run Tensor ops inside your OptiX shader programs, i.e., inside the same ray-tracing "megakernel," as you put it.
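A minimal sketch of that pattern follows. The `Params` struct, `myWmmaKernel`, `traceThenMma`, and the device buffer pointers are hypothetical names for illustration; it assumes an already-built OptiX 7 pipeline and SBT, and it needs the OptiX SDK headers plus an sm_70+ GPU to compile and run:

```cuda
#include <mma.h>
#include <optix.h>
#include <optix_stubs.h>

using namespace nvcuda;

// Hypothetical launch-params struct shared with the OptiX programs.
struct Params { CUdeviceptr outputA; CUdeviceptr outputB; };

// Follow-up CUDA kernel: consumes half-precision buffers that the OptiX
// launch wrote and runs one 16x16x16 Tensor Core multiply per warp.
// The wmma loads/stores couple lanes across the warp, which is fine in a
// plain CUDA kernel but disallowed inside OptiX shader programs.
__global__ void myWmmaKernel(const half* a, const half* b, float* c)
{
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> fa;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::row_major> fb;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> fc;

    wmma::fill_fragment(fc, 0.0f);
    wmma::load_matrix_sync(fa, a, 16);
    wmma::load_matrix_sync(fb, b, 16);
    wmma::mma_sync(fc, fa, fb, fc);
    wmma::store_matrix_sync(c, fc, 16, wmma::mem_row_major);
}

void traceThenMma(OptixPipeline pipeline, const OptixShaderBindingTable& sbt,
                  CUdeviceptr d_params, CUstream stream,
                  const half* d_a, const half* d_b, float* d_c)
{
    // 1) Ray tracing: the OptiX programs fill d_a/d_b via Params.
    optixLaunch(pipeline, stream, d_params, sizeof(Params), &sbt,
                /*width=*/1920, /*height=*/1080, /*depth=*/1);

    // 2) Dependent CUDA launch on the same stream: stream ordering
    //    guarantees it runs only after the OptiX launch finishes.
    myWmmaKernel<<<1, 32, 0, stream>>>(d_a, d_b, d_c);
}
```

The key point is the shared `stream`: because both launches are enqueued on it, the CUDA kernel is serialized after the ray-tracing work with no explicit synchronization needed on the host.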
