How can I set the size for distributed rendering?

I have many GPUs and want to divide the buffer so that the ray tracing render runs in parallel, but I don’t know how to divide the task when using the OptiX SDK. Please help me. I am not very familiar with CUDA, but I am passionate about doing this :)

Hi 0xzhang,

How many GPUs do you have? Are they in the same machine, or in different machines connected over a network?

OptiX provides a transparent layer for multiple GPUs on the same machine. Calling setDevices with multiple GPUs lets OptiX manage these GPUs for you. When you then launch an OptiX kernel, the threads are automatically dispatched to the different GPUs for execution.

More details here:


Hi, yashiz,

Thanks for your reply. I think what I described above is closer to splitting every frame so that multiple GPUs render parts of it in parallel, and then combining all the subframes at the end. I have also just read an article saying that we cannot fully control the GPU devices manually when using OptiX. So I think that for now I can only schedule different frames on different devices, but I cannot freely distribute a single frame myself. Is my understanding correct?


Yes, OptiX 6 will distribute a single launch among multiple compatible active devices automatically, so effectively all devices in the OptiX context work on the same frame at a time today.
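
To make the idea of dividing one frame among devices concrete, here is a minimal sketch of the index arithmetic for splitting an image into contiguous row bands, one per GPU. It is only an illustration: this thread does not describe how OptiX 6 divides a launch internally, and the names `RowBand` and `splitRows` are hypothetical, not part of any OptiX API.

```cpp
#include <vector>

// Half-open range of image rows [begin, end) assigned to one device.
// Hypothetical helper type, not an OptiX type.
struct RowBand {
    int begin;
    int end;
};

// Split `height` rows into `deviceCount` contiguous bands whose sizes
// differ by at most one row. Returns an empty vector for invalid input.
std::vector<RowBand> splitRows(int height, int deviceCount) {
    std::vector<RowBand> bands;
    if (height <= 0 || deviceCount <= 0) {
        return bands;
    }
    const int base = height / deviceCount;   // rows every device gets
    const int extra = height % deviceCount;  // leftover rows, one each for the first devices
    int row = 0;
    for (int d = 0; d < deviceCount; ++d) {
        const int rows = base + (d < extra ? 1 : 0);
        bands.push_back({row, row + rows});
        row += rows;
    }
    return bands;
}
```

For example, `splitRows(1080, 4)` gives four bands of 270 rows each. Combining the subframes then amounts to copying each band back into its row range of the full image.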

Hi! Thanks for your reply.

I think I can now assign consecutive frames to different devices, but each device would have to create and operate on its own copy of the same context, which results in a lot of repeated computation. Still, I think it might work better in a realtime rendering scenario, because multiple devices can overlap the overhead of communication, buffer compression, and so on. Am I thinking about this correctly? Is there any better advice for efficient realtime remote rendering?

For realtime rendering you would rather want to have all GPUs working on one frame to reduce latency.

You cannot target one specific device or the other with OptiX 6 in a single OptiX context. You would need two contexts in one process for that. In the past that did not actually work at all, but it should have improved since.
Note that the OptiX 1 to 6 API is also not thread-safe, so interleaving work on one context does not work either.

If this is for realtime gaming, you might also want to look at the NVIDIA Vulkan Raytracing extensions or DXR.
You wouldn’t need any interoperability to CUDA there because the raytracing is in the graphics API itself.

If your project is not on a tight schedule, I would recommend following the NVIDIA announcements in the not so distant future.

Thank you very much!

I’m a junior majoring in CS, and I have some experience with parallel programming on supercomputers. This summer I have the chance to take part in a rendering project, so I am thinking about how to make our work more efficient. And yes, we are also trying to use Vulkan to implement the project. Thanks for your advice!