Request for clarification of the "Single Ray Programming Model"

Hi OptiX community, I have a question regarding OptiX’s Single Ray Programming Model. From my understanding, this model requires users to program shaders at the granularity of a single ray. However, this design raises concerns about potential load-balancing issues.

For example, consider two threads that each cast one ray, Ray1 and Ray2. Suppose Ray1 produces 10 intersections, while Ray2 produces 10 million. In that scenario, will the thread that cast the ray run the AnyHit/IS shader for each intersection sequentially? Or is the “Single Ray Programming Model” completely decoupled from the concept of a thread, meaning only that shaders are programmed at single-ray granularity while OptiX is free to use any number of threads to execute them?

Thank you!

Hi @pwrliang,

By single ray programming model, we mean that the shaders triggered by a single ray are called by the same thread that cast the ray. OptiX does not use additional threads to execute the shaders for a single ray. The single ray programming model also means you are limited to a subset of CUDA: no use of shared memory, and no use of block, block-sync, or warp-sync programming.
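
To picture what "called by the same thread" implies for the example in the question, here is a minimal host-side C++ sketch. It is a toy model, not OptiX code: `trace_single_ray` and its `any_hit` callback are hypothetical names standing in for one GPU thread and its AnyHit program, and real traversal does not promise this exact call pattern.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical stand-in for one GPU thread tracing one ray. Every
// intersection found along the ray is handed to the any-hit callback
// by this same caller, one after another; no other "thread" helps.
std::uint64_t trace_single_ray(
    std::uint64_t num_intersections,
    const std::function<void(std::uint64_t)>& any_hit) {
    std::uint64_t calls = 0;
    for (std::uint64_t i = 0; i < num_intersections; ++i) {
        any_hit(i);  // shader work for intersection i
        ++calls;
    }
    return calls;  // work done by this thread grows with the hit count
}
```

In this model the thread for Ray1 makes 10 callback invocations and the thread for Ray2 makes 10 million, so the per-thread work is proportional to the number of intersections on that thread's ray.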

Divergence is always a potential concern on the GPU, including in OptiX, so your pathological example would still be problematic (even when using the CUDA programming model). For maximum performance, we recommend avoiding AnyHit where possible and using hardware-accelerated triangle intersection where possible. If you do have high divergence, we also recommend using Shader Execution Reordering to group threads with similar workloads into the same warps. https://raytracing-docs.nvidia.com/optix8/guide/index.html#shader_execution_reordering#shader-execution-reordering
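
As a rough illustration of why grouping similar workloads helps, the following host-side C++ sketch uses a deliberately simplified cost model. The assumptions are mine, not a description of how OptiX or the hardware schedules work: threads run in fixed groups of `warp_size`, each group costs as much as its slowest member, and sorting by workload stands in for reordering. `total_warp_cost` and `reorder_by_work` are hypothetical helper names.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy cost model (assumption): a warp finishes only when its slowest
// thread finishes, so its cost is the maximum workload among its lanes.
// work[i] is the number of shader invocations for ray i; warp_size > 0.
std::uint64_t total_warp_cost(const std::vector<std::uint64_t>& work,
                              std::size_t warp_size) {
    std::uint64_t total = 0;
    for (std::size_t begin = 0; begin < work.size(); begin += warp_size) {
        const std::size_t end = std::min(begin + warp_size, work.size());
        std::uint64_t slowest = 0;
        for (std::size_t i = begin; i < end; ++i) {
            slowest = std::max(slowest, work[i]);
        }
        total += slowest;
    }
    return total;
}

// Stand-in for reordering: sorting places rays with similar workloads
// next to each other, so they land in the same warp.
std::vector<std::uint64_t> reorder_by_work(std::vector<std::uint64_t> work) {
    std::sort(work.begin(), work.end());
    return work;
}
```

With workloads `{10, 1000, 10, 1000}` and a warp size of 2, the model gives a cost of 2000 in the original order and 1010 after sorting. Note that reordering only helps when several rays are expensive: a single ray with 10 million intersections still occupies its thread for all of them.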


David.
