Performance difference between using OptiX and CUDA for non-raytracing work

Hi, if my application has multiple passes, is it much better to switch to pure CUDA when I run a pass that doesn’t require raytracing, or is it okay to just call optixLaunch again?


Welcome @elleringtonp!

It is perfectly fine to call optixLaunch with a raygen program that just does some CUDA computation and never calls optixTrace(). There are some minor differences between an OptiX launch and a CUDA launch that you might or might not care about. If you don’t care about these, then feel free to use whichever form is easier for you; neither will be much better than the other.
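As a rough illustration of such a raygen program, here is a minimal device-side sketch that only scales a buffer and never traces a ray. The Params struct, the params variable, and the program name __raygen__scale are hypothetical names chosen for this example. The fragment is device code only: it still has to be compiled to PTX or OptiX IR and attached to a pipeline and SBT on the host, which is not shown.

```cuda
#include <optix.h>

// Hypothetical launch parameter block; the struct and its fields are
// assumptions for this sketch, not something from the original question.
struct Params
{
    float*       data;   // device buffer, modified in place
    unsigned int count;  // number of valid elements in data
    float        scale;  // multiplier applied to every element
};

extern "C" {
// The variable name must match the one given in
// OptixPipelineCompileOptions::pipelineLaunchParamsVariableName.
__constant__ Params params;
}

// Raygen program used as a plain compute kernel: it never calls optixTrace().
extern "C" __global__ void __raygen__scale()
{
    const uint3 idx = optixGetLaunchIndex();  // one launch index per thread

    // Guard in case the launch width is larger than the buffer.
    if (idx.x < params.count)
        params.data[idx.x] *= params.scale;
}
```

On the host this would be started with a one-dimensional launch, for example optixLaunch(pipeline, stream, d_params, sizeof(Params), &sbt, count, 1, 1), where pipeline, stream, d_params, and sbt are assumed to have been set up already.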

OptiX launches automatically copy your launch params to constant memory. If you use launch params, this is a convenience. In CUDA you might need to find another way to transfer your launch params, especially if you have a lot of them. If you don’t need any launch params, then the only downside to using OptiX is that the launch params copy still happens and adds a tiny overhead of a few microseconds. This will likely only matter if you are trying to do many thousands of launches per second, and probably won’t matter at all (or even be easily detectable) if you have only one launch or a small handful of launches.
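For comparison, here is a minimal sketch of one way to hand a parameter block to a plain CUDA kernel through a __constant__ variable, which roughly mirrors what an OptiX launch does for you. The names Params, scaleKernel, and launchScale are hypothetical and exist only for this example. A small set of parameters could also simply be passed as ordinary kernel arguments.

```cuda
#include <cuda_runtime.h>

// Hypothetical parameter block (an assumption for this sketch).
struct Params
{
    float*       data;   // device buffer, modified in place
    unsigned int count;  // number of valid elements in data
    float        scale;  // multiplier applied to every element
};

// Parameter block in constant memory, filled from the host before each launch.
__constant__ Params params;

__global__ void scaleKernel()
{
    const unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < params.count)
        params.data[i] *= params.scale;
}

// Copies the parameters to constant memory, then launches the kernel.
// d_data must point to at least count floats in device memory.
cudaError_t launchScale(float* d_data, unsigned int count, float scale)
{
    if (count == 0)
        return cudaSuccess;  // nothing to do, and a grid size of 0 is invalid

    const Params h_params = { d_data, count, scale };

    // Blocking copy from the host struct into the __constant__ variable.
    cudaError_t err = cudaMemcpyToSymbol(params, &h_params, sizeof(Params));
    if (err != cudaSuccess)
        return err;

    const unsigned int block = 256;
    const unsigned int grid  = (count + block - 1) / block;
    scaleKernel<<<grid, block>>>();

    return cudaGetLastError();  // reports launch configuration errors
}
```

The explicit copy in launchScale is the step that OptiX performs on your behalf on every optixLaunch.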

The other thing to be aware of is that CUDA launches allow use of shared memory and warp/block intrinsics, whereas OptiX launches require a single-threaded programming model. So if you want to do any of the kinds of fancy thread synchronization that CUDA allows, then using a CUDA launch would be preferable to using an OptiX launch. If you’re not using any such thread synchronization features, then there’s really almost no difference between the two launch types.
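To show the kind of thing meant here, below is a minimal sketch of a block-level sum that relies on shared memory and __syncthreads(), which is the sort of pass that belongs in a plain CUDA launch. The names kBlockSize, blockSum, and sumOnDevice are hypothetical, the block size is assumed to be a power of two, and error checking is left out for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

constexpr unsigned int kBlockSize = 256;  // assumed to be a power of two

// Each block sums its slice of the input in shared memory and writes one
// partial sum. Must be launched with exactly kBlockSize threads per block.
__global__ void blockSum(const float* in, float* blockSums, unsigned int count)
{
    __shared__ float tile[kBlockSize];

    const unsigned int tid = threadIdx.x;
    const unsigned int i   = blockIdx.x * blockDim.x + tid;

    tile[tid] = (i < count) ? in[i] : 0.0f;  // pad the last block with zeros
    __syncthreads();

    // Tree reduction: halve the number of active threads on each step.
    for (unsigned int stride = blockDim.x / 2; stride > 0; stride >>= 1)
    {
        if (tid < stride)
            tile[tid] += tile[tid + stride];
        __syncthreads();  // all threads must see the step's results
    }

    if (tid == 0)
        blockSums[blockIdx.x] = tile[0];
}

// Host helper: uploads the input, runs blockSum, and adds up the per-block
// partial sums on the CPU.
float sumOnDevice(const std::vector<float>& h_in)
{
    const unsigned int count = static_cast<unsigned int>(h_in.size());
    if (count == 0)
        return 0.0f;

    const unsigned int grid = (count + kBlockSize - 1) / kBlockSize;

    float* d_in   = nullptr;
    float* d_sums = nullptr;
    cudaMalloc(&d_in, count * sizeof(float));
    cudaMalloc(&d_sums, grid * sizeof(float));
    cudaMemcpy(d_in, h_in.data(), count * sizeof(float), cudaMemcpyHostToDevice);

    blockSum<<<grid, kBlockSize>>>(d_in, d_sums, count);

    std::vector<float> h_sums(grid);
    cudaMemcpy(h_sums.data(), d_sums, grid * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_sums);

    float total = 0.0f;
    for (float s : h_sums)
        total += s;
    return total;
}
```

A pass like the scaling example, where every thread works independently, has no such dependency and fits either launch type equally well.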

Some other minor considerations:

  • A CUDA launch doesn’t require an SBT, while OptiX does
  • Inlining of functions may differ between CUDA and OptiX
  • OptiX caches your compiled launch code a little differently than CUDA
  • OptiX offers features like specialization and callable programs you can use in your (raygen) kernel
  • OptiX 7.4 now offers more fine-grained parallel compilation than CUDA, if compile time is a concern

If that’s detail overload and you don’t know if you should care about these things, I recommend ignoring this and choosing whichever is most convenient right now - it’s not likely to matter and it’s not hard to switch later.


Thank you for the detailed info, David. I will keep using OptiX for now then, and maybe switch to CUDA later. I’m not compiling my C++ code with nvcc, so it makes calling CUDA a little trickier than normal.


