Hi,
I am currently investigating whether a path tracer built on OptiX 7 can benefit from techniques such as path regeneration ("Path Regeneration for Interactive Path Tracing") or stream compaction ("Improving SIMD efficiency for parallel Monte Carlo light transport on the GPU", Proceedings of the ACM SIGGRAPH Symposium on High Performance Graphics). Both techniques address the problem of idle threads within a warp when each thread computes a single path. For example, with Russian roulette, paths end up with different lengths. A warp then contains both idle threads (whose paths have terminated) and active threads (whose paths are still being traced), and the idle threads waste resources until the longest path in the warp finishes.
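To make the concern concrete, here is a rough CPU-side sketch of the effect I have in mind. It is only a toy model of a lockstep SIMD machine, not a description of how OptiX 7 actually schedules work. The names `survival_prob` and `max_bounces` are made-up parameters for illustration, each bounce is assumed to cost the same, and the cost of the compaction pass itself is ignored.

```python
import random

WARP_SIZE = 32  # lanes per warp on current NVIDIA GPUs


def path_lengths(num_paths, survival_prob, max_bounces, rng):
    """Sample a bounce count per path under Russian roulette (toy model)."""
    lengths = []
    for _ in range(num_paths):
        bounces = 1
        # After each bounce the path survives with probability survival_prob.
        while bounces < max_bounces and rng.random() < survival_prob:
            bounces += 1
        lengths.append(bounces)
    return lengths


def utilization_fixed(lengths):
    """One path per lane; a warp stays busy until its longest path ends."""
    useful = sum(lengths)
    occupied = 0
    for start in range(0, len(lengths), WARP_SIZE):
        warp = lengths[start:start + WARP_SIZE]
        occupied += max(warp) * WARP_SIZE
    return useful / occupied


def utilization_compacted(lengths):
    """Surviving paths are repacked into dense warps after every bounce."""
    useful = sum(lengths)
    occupied = 0
    bounce = 1
    while True:
        alive = sum(1 for n in lengths if n >= bounce)
        if alive == 0:
            break
        warps = -(-alive // WARP_SIZE)  # ceiling division
        occupied += warps * WARP_SIZE
        bounce += 1
    return useful / occupied


if __name__ == "__main__":
    rng = random.Random(1234)
    lengths = path_lengths(32 * 1024, survival_prob=0.8, max_bounces=32, rng=rng)
    print("lane utilization, fixed assignment:", utilization_fixed(lengths))
    print("lane utilization, with compaction: ", utilization_compacted(lengths))
```

In this model the compacted variant can never have lower lane utilization than the fixed one, which is the gap the two papers try to close. Whether the same gap exists under OptiX 7 is exactly what I am unsure about.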
The OptiX Programming Guide states:
For efficiency and coherence, the NVIDIA OptiX 7 runtime—unlike CUDA kernels—allows the execution of one task, such as a single ray, to be moved at any point in time to a different lane, warp or streaming multiprocessor (SM). […] Consequently, applications cannot use shared memory, synchronization, barriers, or other SM-thread-specific programming constructs in their programs supplied to OptiX.
The techniques I mention are designed for architectures that schedule entire warps at a time. Reading the statement from the programming guide, it seems to me that OptiX 7 may schedule threads independently. If that is the case, do I understand correctly that the problem these techniques address is already solved by OptiX 7? And is there perhaps a resource that describes the scheduling of threads and warps in OptiX 7 in more detail?
Thanks in advance,
Nol