Why do OptiX samples run incredibly slowly?

My OptiX samples run really slowly… 0.3 fps.


I haven't modified the code apart from using the --no-gl-interop argument, because the samples could not run without it. I have the latest version of the NVIDIA drivers on a Windows 10 64-bit machine.

Are you running a debug build, or do you have validation mode enabled? OptiX always runs very slowly when device debugging is enabled. Or maybe you're using a second GPU that is not connected to your display, which usually requires disabling GL interop IIRC, and something has gone wrong with the PCIe transfer speeds?
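If you're on OptiX 7.2 or newer, validation mode is controlled when the device context is created. A minimal sketch of keeping it off for performance runs (assuming the usual OptiX 7 host API; the helper name is mine and error handling is omitted):

```cpp
// Minimal sketch, assuming OptiX 7.2+ and that optixInit() has already been
// called. Validation mode adds heavy per-launch checks, so keep it OFF when
// measuring performance; switch to _ALL only while debugging.
#include <optix.h>
#include <optix_stubs.h>

OptixDeviceContext createContextForTiming(CUcontext cuCtx)
{
    OptixDeviceContextOptions options = {};
    options.validationMode = OPTIX_DEVICE_CONTEXT_VALIDATION_MODE_OFF;
    // options.validationMode = OPTIX_DEVICE_CONTEXT_VALIDATION_MODE_ALL; // debug runs only

    OptixDeviceContext context = nullptr;
    optixDeviceContextCreate(cuCtx, &options, &context); // error check omitted for brevity
    return context;
}
```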


Thanks for your response. Indeed, I am running in debug mode, but the other thing you mention just reminded me that my NVIDIA card is new, but my motherboard is… pretty old! The PCIe speed must be slow; I'll check it out.
I will also run the samples in release mode to see the difference. As for the second video card, it is an onboard one. Do you think that this might be the reason I am getting the gl-interop problem? Is there something I could do about it?

I believe GL interop only works when the OptiX GPU and the display GPU are the same device: the CUDA and OpenGL contexts have to live on the same GPU. So it's totally normal to have to use --no-gl-interop when you have two GPUs.
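You can also check this programmatically. A minimal sketch using the CUDA runtime's cudaGLGetDevices (assumptions: an OpenGL context is already current, e.g. one created by GLFW, and the helper name is mine):

```cpp
// Minimal sketch: ask the CUDA runtime which CUDA devices drive the current
// OpenGL context. If the OptiX device is not among them, GL interop cannot work.
#ifdef _WIN32
#include <windows.h>
#endif
#include <cuda_runtime.h>
#include <cuda_gl_interop.h>

bool glContextIsOnCudaDevice(int optixCudaDevice)
{
    unsigned int count = 0;
    int devices[8] = {};
    if (cudaGLGetDevices(&count, devices, 8, cudaGLDeviceListAll) != cudaSuccess)
        return false; // e.g. the GL context runs on a non-NVIDIA implementation

    for (unsigned int i = 0; i < count; ++i)
        if (devices[i] == optixCudaDevice)
            return true; // same device: GL interop is possible
    return false;
}
```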

I'm guessing that if the debug build was the main issue, you don't need to worry too much about your PCIe setup, but FWIW, there are some utilities to help you diagnose PCIe bandwidth issues. Check out nvidia-smi topo -h. There is also a CUDA sample, bandwidthTest, that measures and reports bandwidth across the different links in your system.
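If you'd rather measure it directly, here is a minimal sketch in the spirit of that bandwidth sample (error checking omitted; pinned transfers on a healthy PCIe 3.0 x16 link typically report on the order of 10 GB/s or more):

```cpp
// Minimal sketch: time a 256 MiB pinned host-to-device copy to spot a
// degraded PCIe link. Error checking omitted for brevity.
#include <cuda_runtime.h>
#include <cstdio>

int main()
{
    const size_t bytes = 256ull << 20; // 256 MiB
    void* hostBuf = nullptr;
    void* devBuf  = nullptr;
    cudaMallocHost(&hostBuf, bytes);   // pinned memory for maximum transfer speed
    cudaMalloc(&devBuf, bytes);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    cudaMemcpy(devBuf, hostBuf, bytes, cudaMemcpyHostToDevice);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    printf("Host->Device: %.2f GB/s\n", (bytes / 1.0e9) / (ms / 1000.0));

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(devBuf);
    cudaFreeHost(hostBuf);
    return 0;
}
```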


If you remove the NVCC option that builds debug device code (-G) from the OptiX SDK example build process, the OptiX kernels will run as fast as in release mode, while you can still debug your host code the same way in debug targets.
This post describes how: https://forums.developer.nvidia.com/t/a-problem-when-i-want-to-createmodule/276228/2
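In CMake terms the idea looks roughly like this (a hypothetical excerpt; the exact variable names and locations in the SDK's CMake files may differ, see the linked post for the real ones):

```cmake
# Hypothetical sketch: keep device code fast even in Debug targets.
# -G builds debug device code and slows OptiX kernels down dramatically;
# -lineinfo keeps source correlation for profiling at release speed.
if(CMAKE_BUILD_TYPE STREQUAL "Debug")
  # list(APPEND CUDA_NVCC_FLAGS "-G")      # remove this: debug device code
  list(APPEND CUDA_NVCC_FLAGS "-lineinfo") # optional: profiling-friendly line info
endif()
```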

My OptiX example framework always builds release device code by default for both release and debug targets.

If the OpenGL context of your application is not running on an NVIDIA implementation, then there cannot be CUDA-OpenGL interop with the device running the OptiX/CUDA ray tracing code.

There are multiple ways to solve that:

The best would be to change the OpenGL pixel format selection to not use the standard Windows functions but the OpenGL WGL extensions to enumerate all pixel formats and pick one which is provided by the NVIDIA driver implementation.
That means using the WGL_ARB_pixel_format functions (wglGetPixelFormatAttribivARB, wglChoosePixelFormatARB) instead of the Windows ChoosePixelFormat, for example:
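A minimal sketch (assumptions: a dummy OpenGL context is already current so that wglGetProcAddress works, wglext.h from the OpenGL registry is on the include path, and the helper name is mine):

```cpp
// Minimal sketch: pick a hardware-accelerated pixel format via the
// WGL_ARB_pixel_format extension instead of the GDI ChoosePixelFormat.
#include <windows.h>
#include <GL/gl.h>
#include <GL/wglext.h>

int pickAcceleratedPixelFormat(HDC hdc)
{
    auto wglChoosePixelFormatARB = reinterpret_cast<PFNWGLCHOOSEPIXELFORMATARBPROC>(
        wglGetProcAddress("wglChoosePixelFormatARB"));
    if (!wglChoosePixelFormatARB)
        return 0; // extension not available on the current context

    const int attribs[] = {
        WGL_DRAW_TO_WINDOW_ARB, GL_TRUE,
        WGL_SUPPORT_OPENGL_ARB, GL_TRUE,
        WGL_DOUBLE_BUFFER_ARB,  GL_TRUE,
        WGL_ACCELERATION_ARB,   WGL_FULL_ACCELERATION_ARB, // hardware ICD, not GDI
        WGL_PIXEL_TYPE_ARB,     WGL_TYPE_RGBA_ARB,
        WGL_COLOR_BITS_ARB,     32,
        0 // terminator
    };

    int  pixelFormat = 0;
    UINT numFormats  = 0;
    if (!wglChoosePixelFormatARB(hdc, attribs, nullptr, 1, &pixelFormat, &numFormats) ||
        numFormats == 0)
        return 0;
    return pixelFormat; // hand to SetPixelFormat via DescribePixelFormat as usual
}
```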

Since the OptiX examples use GLFW as the window framework, there is probably also a way to set some hints before creating the GLFW window to steer the application to an NVIDIA OpenGL implementation.
Look at the Window and Context documentation here: https://www.glfw.org/documentation.html

Other methods would force either your application or your whole system to run the graphics on the discrete GPU; one programmatic way is sketched below.
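On NVIDIA Optimus laptops, exporting the NvOptimusEnablement symbol from the executable tells the driver to prefer the discrete GPU (GLFW can export it for you when built with its USE_HYBRID_HPG option):

```cpp
// Exporting this well-known symbol (a DWORD) from the .exe makes the NVIDIA
// Optimus driver run the application on the discrete GPU instead of the iGPU.
#if defined(_WIN32)
extern "C" {
    __declspec(dllexport) unsigned long NvOptimusEnablement = 0x00000001;
}
#endif
```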

Please have a look into the NVIDIA Control Panel application and look at 3D Settings → Manage 3D Settings.
(I’m running Quadro workstation boards and don’t know how this differs on GeForce. I’m currently looking at a laptop with integrated and discrete GPUs.)

There should be Global and Program settings tabs.

Inside the Global settings there is a Preferred graphics processor combo box.
Try changing that from auto-select to the NVIDIA processor.
(I have that set to auto-select and my OptiX programs work just fine.)

Inside the Program settings tab you can do the same for each executable individually.
I'm not sure if you need to do that for your executable or if it's enough to do it for the Visual Studio executable, as long as you start your applications from inside it.

Again, things work just fine out-of-the-box on my iGPU + dGPU laptop, but it's a workstation system, and with the provided information it's unclear why your system behaves differently.