I’m currently working on a path tracing project using OptiX 4. The path tracer uses Russian roulette to decide when a path should be terminated, so the depth of each path can vary widely. This can cause one thread to wait an unnecessarily long time for another to finish.
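For context, the termination logic is roughly the following (a minimal plain-C++ sketch, not actual OptiX device code; the throughput-based survival probability and the names are illustrative):

```cpp
#include <algorithm>

// Illustrative path state: a scalar stands in for an RGB throughput.
struct PathState {
    float throughput;
    int   depth;
    bool  alive;
};

// Russian roulette: after a few guaranteed bounces, each path survives
// with a probability tied to its remaining throughput; survivors are
// rescaled so the estimator stays unbiased.
void russianRoulette(PathState& p, float u /* uniform random in [0,1) */) {
    const int minDepth = 3;                   // always trace a few bounces
    if (p.depth < minDepth) return;
    float pSurvive = std::min(p.throughput, 0.95f);
    if (u >= pSurvive) {
        p.alive = false;                      // terminate the path
    } else {
        p.throughput /= pSurvive;             // compensate for the kill chance
    }
}
```

Because termination is probabilistic, neighbouring threads in a warp can terminate at very different depths, which is exactly the divergence I’m seeing.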
I have done some small manual tests, where I can see path depths ranging from 1 to over 100.
My current hack to get around this problem is to take more samples on each context launch, resulting in a lower average waiting time per thread. This gives me a speedup of a factor of two or three, depending on the number of samples per launch.
However, this also makes a single launch take significantly longer to finish, which in turn effectively freezes my computer’s GUI (a CUDA launch can’t run at the same time as the graphics tasks in Windows… but that’s not the point).
What I am thinking to do instead is to:
- trace up to a specified maximum depth (could be 10) and save the path state in a buffer
- sort/partition the buffer into running and terminated paths
- repeat on all paths that are still alive
(A fourth step could feed new paths into the buffer, to keep it going.)
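The loop I have in mind, sketched on the CPU with `std::partition` standing in for a GPU stream compaction (the names and the termination callback are illustrative, not OptiX API):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Path {
    int  depth = 0;
    bool alive = true;
};

// One "wave": advance every live path by up to maxDepthPerLaunch bounces,
// then compact the buffer so live paths are contiguous. terminate(p)
// stands in for whatever ends a path (miss, Russian roulette, ...).
template <typename TerminateFn>
std::size_t traceWave(std::vector<Path>& paths, int maxDepthPerLaunch,
                      TerminateFn terminate) {
    for (Path& p : paths) {
        for (int i = 0; i < maxDepthPerLaunch && p.alive; ++i) {
            ++p.depth;
            if (terminate(p)) p.alive = false;
        }
    }
    // Partition: live paths first. On the GPU this would be a stream
    // compaction (e.g. with Thrust), so the next launch only touches
    // the live range.
    auto mid = std::partition(paths.begin(), paths.end(),
                              [](const Path& p) { return p.alive; });
    return static_cast<std::size_t>(mid - paths.begin());  // live count
}
```

The next launch would then run only over the first `liveCount` entries, and the fourth step would refill the dead tail with freshly generated camera paths.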
Does anyone have any experience with or insight into this approach — whether it’s possible, or even worth the hassle?