Bad OptiX ray-shooting performance.


I have tested the path tracer sample in the SDK and I get 27 fps on an NVIDIA GeForce GTX 1080 (8 GB GDDR5X). I believe that is very slow (considering also that it renders 4 samples per pixel). I have seen some GLSL shaders that can render the Cornell box nearly 10 times faster (take a look at this for example: [url][/url]). I have tried several acceleration structures (including “NoAccel”) and the result is the same. A question: is OptiX very slow when you deal with few primitives? Or is there some way to optimize the ray shooting? Can you report your FPS running the path tracer sample here, just to make sure it isn’t a problem with my system? (Intel Core i7-8700 4.5 GHz, 16 GB RAM 4 GHz, NVIDIA GeForce GTX 1080 8 GB 10 GHz GDDR5X, Windows 10 Pro 64-bit fully updated, Visual Studio 2015, latest video card drivers, CUDA and OptiX SDK installed.) Thanks in advance.

I’d say you’re comparing a hardcoded example with a different workload against a general-purpose ray tracing SDK example written for demonstration purposes with an insignificant scene size. I would expect that comparison to change quickly with more real-world use cases.

Maybe also give the OptiX Advanced Samples a spin. The ones I wrote should be in the hundreds of frames per second at defaults on your system. Disable vsync in the 3D settings of the NVIDIA Display Control Panel!

I understand that a generic ray-shooting SDK has overhead, especially in a scene made of very few primitives, but when I use “no acceleration”, shouldn’t the SDK sequentially call all the intersection routines (just like in the GLSL example)? I expect a bit of overhead here too, but not 10 times! Anyway, thanks for the link, I’ll give it a look for sure (when I get back home).
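For reference, here is roughly how I’m selecting the builder; this is only a sketch along the lines of the pre-7 OptiX C++ wrapper API (the rest of the scene setup follows the SDK sample):

```cpp
// Sketch: selecting the "NoAccel" builder in the pre-7 OptiX C++ wrapper API.
// With "NoAccel", rtTrace tests every primitive in the group linearly instead
// of traversing a BVH -- fine for a handful of parallelograms.
#include <optixu/optixpp_namespace.h>

optix::Context       context = optix::Context::create();
optix::GeometryGroup group   = context->createGeometryGroup();

// "Trbvh" would be the usual high-performance builder; "NoAccel" disables the BVH.
optix::Acceleration  accel   = context->createAcceleration("NoAccel");
group->setAcceleration(accel);
```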

Again, I don’t consider this a valid comparison.
If you ported that GLSL implementation to OptiX as-is, you wouldn’t even shoot a single ray in OptiX, because all of that code would live inside the ray generation domain! It would basically be just a 2D CUDA kernel.
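To illustrate the point, a literal port would look something like this sketch (pre-7 OptiX device code; `shadertoy_style_trace` stands in for the inlined GLSL logic and is purely hypothetical):

```cpp
// Sketch: porting the whole GLSL shader as-is means everything ends up in the
// ray generation program and rtTrace is never called, so OptiX effectively
// runs it as a plain 2D CUDA kernel.
#include <optix.h>
#include <optixu/optixu_math_namespace.h>

rtBuffer<float4, 2> output_buffer;
rtDeclareVariable(uint2, launch_index, rtLaunchIndex, );

__device__ float4 shadertoy_style_trace(uint2 idx)
{
    // ... inlined box/sphere intersection loops and shading,
    // exactly as written in the GLSL shader ...
    return make_float4(0.0f);
}

RT_PROGRAM void ray_generation()
{
    // No rtTrace() anywhere: no BVH traversal, no intersection or
    // closest-hit programs are ever invoked.
    output_buffer[launch_index] = shadertoy_style_trace(launch_index);
}
```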

Instead, the optixPathtracer sample has an arbitrary camera projection and a more complex scene setup (using parallelograms, not box primitives), does multiple launches with four samples per path each, has no limit on the path length other than Russian roulette or reaching the miss shader or a light, supports arbitrarily many lights, renders to float4, etc.

It’s hard to compare the performance of two different ray tracers, especially when it’s clear that they aren’t actually implementing the same thing.

OK, in my first attempt I ported the GLSL shader to OptiX, replacing the intersection routines with calls to rtTrace. I was surprised by how much slower it was. I don’t think that (in this case) the bottleneck is the path tracing algorithm: in the OptiX SDK sample, limiting the ray path length to 3 bounces, as in the shader, the FPS only grows from 27 to 35, so Russian roulette is not the primary cause; nor is it the fact that we render to float4. However, just out of curiosity, I will try to implement the intersection routines exactly as in the original GLSL shader to see if I obtain similar performance.
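For the record, the intersection side of my port looks roughly like this sketch (pre-7 OptiX device API; the `sphere` variable layout and material index 0 are my own choices, not from the sample):

```cpp
// Sketch: an analytic sphere intersection program in the style of the GLSL
// routine, hooked into rtTrace via the intersection program domain.
#include <optix.h>
#include <optixu/optixu_math_namespace.h>

rtDeclareVariable(optix::Ray, ray, rtCurrentRay, );
rtDeclareVariable(float4, sphere, , );  // xyz = center, w = radius

RT_PROGRAM void intersect(int /*primIdx*/)
{
    const float3 center = make_float3(sphere);
    const float3 O      = ray.origin - center;
    const float  b      = optix::dot(O, ray.direction);
    const float  c      = optix::dot(O, O) - sphere.w * sphere.w;
    const float  disc   = b * b - c;
    if (disc > 0.0f) {
        // Report the nearer root; OptiX clips it against the ray's [tmin, tmax].
        const float t = -b - sqrtf(disc);
        if (rtPotentialIntersection(t))
            rtReportIntersection(0);  // material index 0
    }
}
```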

I am facing the same issue: I am getting lower speed in OptiX.

agpxnet: What was the result? Did you implement the intersection routines as in the original GLSL shader?

Hi @fatourechi,

Your speed is lower than what? Are you also comparing a ShaderToy sample or GLSL shader to OptiX?

We’re happy to help get the fastest speed possible, and for newcomers to OptiX there are usually some non-obvious things you need to do in order to get maximum performance. In order for us to help you, we’ll need some more information about what you’re doing currently.

I like ShaderToy and I’ve used it a lot, but something very important to understand, which Detlef explained above, is that you cannot compare a ray-tracer-in-a-shader to OptiX; they are very different things. There are definitely some simple demos that you can make run faster in a single shader, but I guarantee that OptiX+RTX is much faster than anything you can do in a ShaderToy shader once you have arbitrary models with millions of primitives in your scene. (Except for some very specific demo-scene tricks like a grid of repeated primitives on an infinite plane.)

Since this thread is almost a year old and we’ve released a major version of OptiX since then, it is worth starting a new thread if you’d like some technical support making your OptiX code fast. Please consider doing that, and if you do, include some information on your setup (like which version of OptiX, driver, GPU, CPU, and OS you’re using) as well as a description of your rendering algorithm and performance measurements so we can figure out if it’s going slower than expected and if so help improve the speed.


Hi David,

Thanks for your response. I have created a new topic regarding my issue: I experience lower performance in my OptiX app compared to the same code written in CUDA. Any help is appreciated.