OptiX denoiser compute device options

OptiX 7 has no knowledge of multiple devices!

That’s finally completely under the developer’s control, and it all happens inside the CUDA host code of your application.

That means you normally create a CUDA context per device and an OptiX context per CUDA context, and these are completely independent from OptiX’s point of view. Everything that should happen between the boards is pure CUDA code.
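
For illustration, here is a minimal sketch of that per-device setup with the CUDA driver API and the OptiX 7 host API. This is not the code from my example; error checking is omitted and the struct/function names are made up:

```cpp
// One CUDA context per device, one OptiX device context on top of each.
#include <cuda.h>
#include <optix.h>
#include <optix_stubs.h>
#include <optix_function_table_definition.h> // exactly one translation unit
#include <vector>

struct PerDevice
{
  CUdevice           device   = 0;
  CUcontext          cuCtx    = nullptr;
  OptixDeviceContext optixCtx = nullptr;
};

std::vector<PerDevice> createContexts()
{
  cuInit(0);
  optixInit();

  int count = 0;
  cuDeviceGetCount(&count);

  std::vector<PerDevice> devices(count);
  for (int i = 0; i < count; ++i)
  {
    cuDeviceGet(&devices[i].device, i);
    cuCtxCreate(&devices[i].cuCtx, 0, devices[i].device);   // one CUDA context per board

    OptixDeviceContextOptions options = {};
    optixDeviceContextCreate(devices[i].cuCtx, &options,    // one OptiX context per CUDA context
                             &devices[i].optixCtx);
  }
  return devices;
}
```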

If you do exactly the same OptiX API calls on both contexts, there will be different kernels for the heterogeneous devices, the acceleration structures will be different (incompatible, i.e. they cannot be relocated from one device to the other in this case), and probably some more things.

One of my OptiX 7 examples does that, though it has not been tested with a heterogeneous GPU setup.
The preferred setup for multi-GPU is identical board types, ideally with an NVLINK connection.

I would expect that the only rendering distribution strategies which work with that are obviously the single-GPU one and possibly the multi-GPU zero-copy (pinned memory) strategy, sketched below.
All other rendering distribution strategies implemented there so far require copies between the two devices for the final display, and I do not know if that works with a heterogeneous GPU setup the way I implemented it.
Link here: https://forums.developer.nvidia.com/t/optix-advanced-samples-on-github/48410/4
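
To make the zero-copy idea concrete, here is a hedged sketch (the example does more than this; the function name and the fixed device ordinals are just placeholders): a single pinned host buffer that both boards can write their part of the final image into.

```cpp
// Zero-copy (pinned memory) output buffer shared by two devices.
#include <cuda_runtime.h>
#include <cstddef>

void* allocZeroCopyOutput(size_t numBytes, void** devPtr0, void** devPtr1)
{
  void* hostPtr = nullptr;
  // Portable + mapped: the allocation is pinned, visible to all CUDA contexts,
  // and directly addressable from device code.
  cudaHostAlloc(&hostPtr, numBytes, cudaHostAllocPortable | cudaHostAllocMapped);

  cudaSetDevice(0);
  cudaHostGetDevicePointer(devPtr0, hostPtr, 0); // device-side address on board 0

  cudaSetDevice(1);
  cudaHostGetDevicePointer(devPtr1, hostPtr, 0); // device-side address on board 1

  return hostPtr;
}
```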

That example also contains two methods (one for Windows, one that works on Windows and Linux) to figure out which device is the primary OpenGL device, to make CUDA-OpenGL interop work.
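
One common way to implement the Windows-and-Linux variant is to ask CUDA which device drives the current OpenGL context. This is only a rough sketch, not necessarily how the example does it, and it assumes an OpenGL context is already current on the calling thread:

```cpp
// Query which CUDA device is associated with the current OpenGL context.
#include <cuda_runtime.h>
#include <cuda_gl_interop.h>

int findOpenGLInteropDevice()
{
  unsigned int numGLDevices = 0;
  int cudaDevices[8] = {};
  cudaGLGetDevices(&numGLDevices, cudaDevices, 8, cudaGLDeviceListAll);
  // The first returned ordinal is the CUDA device backing the OpenGL context;
  // use that one for the CUDA-OpenGL interop path.
  return (numGLDevices > 0) ? cudaDevices[0] : -1;
}
```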

Anyway, when handling these two devices separately you can distribute the work as you like, which especially in a heterogeneous setup would require some load balancing to make sure the slower board doesn’t bottleneck the rendering speed.
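
As a purely illustrative sketch (nothing from the example), that load balancing can be as simple as re-splitting the image rows each frame based on the measured per-board render times of the previous frame:

```cpp
// Re-balance a two-board row split from the previous frame's GPU times.
#include <algorithm>

// renderTime0/1: measured times (ms) each board needed for its share last frame.
// previousSplit: number of rows board 0 rendered last frame.
int computeSplitRow(int imageHeight, float renderTime0, float renderTime1, int previousSplit)
{
  // Throughput = rows rendered per millisecond on each board.
  const float t0 = static_cast<float>(previousSplit)               / renderTime0;
  const float t1 = static_cast<float>(imageHeight - previousSplit) / renderTime1;
  // Give each board a share proportional to its throughput.
  const int split = static_cast<int>(imageHeight * t0 / (t0 + t1));
  return std::max(1, std::min(imageHeight - 1, split));
}
```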

Also the denoiser will run differently: as said, the RTX board will use the Tensor cores, the Pascal board obviously will not.

Here I would try to run the denoiser on the full image only on the faster board.
That would be simpler than denoising two tiles of the image, one on either board, which requires an overlap area between the tiles (query it with optixDenoiserComputeMemoryResources), and then you still need to get the results into the final full image anyway.
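
The overlap query itself is a single call. A small sketch (denoiser creation and the tiled invocation are omitted; the denoiser handle is assumed to exist already):

```cpp
// Query how many border pixels each denoiser tile needs to overlap its neighbor.
#include <optix.h>

unsigned int queryTileOverlap(OptixDenoiser denoiser, unsigned int tileWidth, unsigned int tileHeight)
{
  OptixDenoiserSizes sizes = {};
  optixDenoiserComputeMemoryResources(denoiser, tileWidth, tileHeight, &sizes);
  // Each tile must be rendered with this many extra pixels of border so that
  // neighboring tiles denoise consistently.
  return sizes.overlapWindowSizeInPixels;
}
```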

If you’re not actually rendering with OptiX but only want to apply the denoiser, then I would recommend not using two devices at all; just pick the faster one.
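
A simple heuristic sketch for “just pick the faster one” (my assumption, not from the programming guide): choose the device with the highest compute capability, which in an RTX plus Pascal setup selects the Turing board with the Tensor cores.

```cpp
// Pick the CUDA device with the highest compute capability.
#include <cuda_runtime.h>

int pickDenoiserDevice()
{
  int count = 0;
  cudaGetDeviceCount(&count);

  int best   = 0;
  int bestSM = -1;
  for (int i = 0; i < count; ++i)
  {
    cudaDeviceProp prop = {};
    cudaGetDeviceProperties(&prop, i);
    const int sm = prop.major * 10 + prop.minor; // e.g. 75 for Turing, 61 for Pascal
    if (sm > bestSM)
    {
      bestSM = sm;
      best   = i;
    }
  }
  return best;
}
```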

Read this chapter: https://raytracing-docs.nvidia.com/optix7/guide/index.html#ai_denoiser#nvidia-ai-denoiser

(OptiX 6 would not allow multi-GPU on your configuration. It will pick all GPUs with the highest compatible SM versions, which would be the RTX board in your setup. It’s also either all boards with RT cores or none.)
