In my scene, I need to render two passes using the same meshes. Could I use two threads for the two passes?
The optixLaunch calls are asynchronous since they have their own stream argument.
Even multiple optixLaunch calls from different threads with different streams on the same CUDA context are possible, but currently that specific case requires a separate OptixPipeline for each stream. Explained here:
EDIT: Thinking about that, I actually wouldn’t expect parallel launches of the same pipeline on different streams to work. It would break the launch parameter contents, which get copied to constant memory, and you can’t change them when using the same pipeline. That means: always use different pipelines when launching to multiple streams in parallel on the same CUDA context, irrespective of whether you do this from a single host thread or from multiple host threads, until future OptiX versions say otherwise.
If your render passes use different pipelines anyway, that should just work.
But if the two passes depend on each other in some way, e.g. by accumulating into the same result buffers, then using multiple streams doesn’t make sense.
Also note that the default stream zero has different synchronization semantics. That means you need to use explicitly created CUDA streams for these parallel asynchronous launches. You can also create the CUDA context so that secondary streams are not synchronized against the default stream zero.
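To make the constraints above concrete, here is a minimal host-side sketch of how the two passes could be dispatched from two threads. It is only an illustration under assumptions: `PassDesc`, `passesAreIndependent` and `launchPassesInParallel` are hypothetical names, the handles are opaque pointers so the code compiles without the CUDA and OptiX headers, and the actual optixLaunch call is expected to live inside the callback you pass in.

```cpp
#include <atomic>
#include <cstddef>
#include <functional>
#include <set>
#include <stdexcept>
#include <thread>
#include <vector>

// Hypothetical description of one render pass. In a real renderer these would be
// an OptixPipeline, an explicitly created CUDA stream and that pass's own launch
// parameter buffer; opaque pointers are used so the sketch has no SDK dependency.
struct PassDesc {
    const void* pipeline;  // must not be shared between passes running in parallel
    const void* stream;    // must not be the default stream zero (modeled as nullptr)
    const void* params;    // per-pass launch parameters
};

// Returns true only if every pass has its own pipeline and its own non-default stream.
bool passesAreIndependent(const std::vector<PassDesc>& passes) {
    std::set<const void*> pipelines;
    std::set<const void*> streams;
    for (const PassDesc& p : passes) {
        if (p.stream == nullptr) return false;                   // default stream
        if (!pipelines.insert(p.pipeline).second) return false;  // shared pipeline
        if (!streams.insert(p.stream).second) return false;      // shared stream
    }
    return true;
}

// Calls `launch` once per pass, each on its own host thread, and waits for all
// threads to return. `launch` is where the optixLaunch call for that pass would go.
// Returns the number of launch callbacks that completed.
std::size_t launchPassesInParallel(const std::vector<PassDesc>& passes,
                                   const std::function<void(const PassDesc&)>& launch) {
    if (!passesAreIndependent(passes))
        throw std::invalid_argument("passes share a pipeline or stream, or use stream zero");

    std::atomic<std::size_t> completed{0};
    std::vector<std::thread> workers;
    workers.reserve(passes.size());
    for (const PassDesc& p : passes)
        workers.emplace_back([&launch, &p, &completed] {
            launch(p);
            ++completed;
        });
    for (std::thread& t : workers) t.join();
    return completed.load();
}
```

Joining the host threads only means the launch calls have been issued. Since the launches are asynchronous, you would still need to synchronize each stream before reading the results of that pass.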
Whether launching on multiple streams actually shows any performance benefit depends on the workload and the underlying hardware.
I can easily load a GPU to almost 100% with a single thread and single stream when using OptiX 7 for big enough workloads, at which point a second stream on the same GPU wouldn’t make anything faster.
Your mileage may vary.