Optix - Rays vs Pixels, Multiple rays/pixel


I’ve started exploring OptiX, using OptiX 3.8. I’ve browsed the samples (#1-8) and tutorials and am going through the documentation, focusing on the Whitted and Cook examples as references for a full-fledged implementation. However, I believe I am missing a few basic things. This may be a beginner question, but I did not find a clear description in the programming guide or an answer in this forum:

  1. Relationship between rays, pixels and launch index: As I understand it, the basic approach is to launch one (backward) ray per pixel and, on intersection, recurse for shadows, reflections, etc. This is achieved by launching the context with a 2D grid (i.e. treating the pixels as a 2D grid) with m_context->launch(entry_point,width,height). This invokes my ray generation program width*height times, and since my context launch is 2D, the launch index is also 2D, with launch_index.x in [0, width-1] and launch_index.y in [0, height-1], so launch_index maps to the pixel identifier. Is this understanding correct?

  2. If yes, how do I launch multiple rays per pixel (e.g. for sampling)? Should I trace multiple rays within my camera/raygen program, or launch the raygen program multiple times per pixel? Or should this be done in the closest hit/shading program? I assume the Cook sample program uses multiple rays per pixel, but I am not able to spot exactly how.
    I understand the path_tracer and mis_sample programs do this for path tracing, but I’m just trying to extend the ray tracer with multiple rays per pixel (and not writing a path tracer). What’s a good OptiX sample/example for this?

  3. What’s the exact difference between GLUTDisplay::CDProgressive and GLUTDisplay::CDNone? I could not find any documentation on this.

Thanks a lot for your help. Again, if I’m missing something obvious thanks for your patience in explaining.


1.) Correct. That’s the one primary ray per pixel (launch index) approach.
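
A minimal CPU-side sketch of that mapping, not OptiX code: `LaunchIndex`, `emulateLaunch2D`, `pixelOffset` and `countVisits` are hypothetical names that only emulate what a 2D launch of size width x height does, namely invoke the ray generation program once per launch index and let that index address exactly one pixel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the 2D launch index seen by a ray generation program.
struct LaunchIndex { unsigned x, y; };

// Maps a 2D launch index to a linear offset into a row-major pixel buffer.
inline std::size_t pixelOffset(LaunchIndex idx, unsigned width) {
    return static_cast<std::size_t>(idx.y) * width + idx.x;
}

// CPU emulation of a 2D launch: calls rayGen once for every launch index.
template <typename RayGen>
void emulateLaunch2D(unsigned width, unsigned height, RayGen rayGen) {
    for (unsigned y = 0; y < height; ++y)
        for (unsigned x = 0; x < width; ++x)
            rayGen(LaunchIndex{x, y});
}

// Counts how often each pixel is visited by one emulated launch.
inline std::vector<int> countVisits(unsigned width, unsigned height) {
    assert(width > 0 && height > 0);
    std::vector<int> visits(static_cast<std::size_t>(width) * height, 0);
    emulateLaunch2D(width, height, [&](LaunchIndex idx) {
        ++visits[pixelOffset(idx, width)];
    });
    return visits;
}
```

Calling `countVisits(width, height)` returns a buffer in which every pixel was touched exactly once, which is the "one primary ray per launch index" picture.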

2.) What you do to sample the pixel is totally your choice. You’re defining the rendering algorithm!
The standard approach is to launch one primary ray per pixel as usual and just sample over sub-pixel locations, basically generating fragments, and accumulating the results.

Any example can be changed to do that, you just need to make your output buffer an input_output buffer with float4 format, add a running number variable you use to seed your random number generator, and use that to generate a fragment location per pixel to shoot your primary ray through, then accumulate the radiance on that float4 output buffer.
The display of that could be done with a GL_RGBA32F texture which is just mapped onto a quad, or you could convert that accumulation buffer to a uchar4 buffer as the last step in your ray generation program and use the rest of the OptiX sample framework as is.
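
The two ingredients described above (a fragment location seeded by a running number, and accumulation into a float4 buffer) can be sketched on the CPU as follows. This is an illustration under assumptions, not the SDK code: `Float4`, `Jitter`, `nextRandom`, `makeSeed`, `subPixelJitter` and `accumulate` are hypothetical names, and the small linear congruential generator is just a placeholder for whatever random number generator you use on the device.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for one element of a float4 accumulation buffer.
struct Float4 { float x, y, z, w; };
struct Jitter { float x, y; };

// Tiny linear congruential generator returning a float in [0, 1).
inline float nextRandom(std::uint32_t& state) {
    state = state * 1664525u + 1013904223u;
    return static_cast<float>(state >> 8) / 16777216.0f;  // top 24 bits
}

// Mixes pixel coordinates and the running frame number into a seed so that
// every launch picks a different sub-pixel location for the same pixel.
inline std::uint32_t makeSeed(unsigned px, unsigned py, unsigned frame) {
    return px * 1973u + py * 9277u + frame * 26699u + 1u;
}

// Offset of the fragment inside the pixel; add it to the integer pixel
// coordinate before computing the primary ray direction.
inline Jitter subPixelJitter(unsigned px, unsigned py, unsigned frame) {
    std::uint32_t state = makeSeed(px, py, frame);
    const float jx = nextRandom(state);
    const float jy = nextRandom(state);
    assert(jx >= 0.0f && jx < 1.0f && jy >= 0.0f && jy < 1.0f);
    return { jx, jy };
}

// Running average: after frame k the buffer holds the mean of k + 1 samples.
inline Float4 accumulate(Float4 previous, Float4 sample, unsigned frame) {
    if (frame == 0) return sample;  // the first launch overwrites the buffer
    const float a = 1.0f / static_cast<float>(frame + 1);
    return { previous.x + (sample.x - previous.x) * a,
             previous.y + (sample.y - previous.y) * a,
             previous.z + (sample.z - previous.z) * a,
             previous.w + (sample.w - previous.w) * a };
}
```

The host would increment the frame number once per launch and reset it to 0 whenever the camera or scene changes, so the accumulation starts over.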

Yes, the path_tracer example does this. Since it follows a different path per pixel in each launch, it automatically samples the pixel area as well as the geometry and BSDFs at different locations and directions. You just need to mimic the part which generates the fragment location for the primary ray and the final accumulation, and you get progressive antialiasing with any rendering method.

You could of course also shoot multiple primary rays per pixel (launch index) in one launch and accumulate those.
You could even shoot all rays for one pixel or multiple pixels and then accumulate thousands of results with a parallel reduction afterwards, and then launch the next set of pixels of one frame.
That method would be bad for an early visual cue, but the end result would remain the same.
This method would actually have the benefit of better per-warp convergence. I have never tried it, though. It is only useful for offline final-frame rendering, and most graphics applications require interactivity.

Keep in mind that when running under Windows on a device using the Windows Display Driver Model (WDDM), there is a 2-second kernel timeout in the OS which you need to avoid at all costs, or the kernel driver will be stopped and restarted by the OS (Timeout Detection and Recovery, TDR). Tesla boards in TCC driver mode are unaffected by this WDDM timeout.
That TDR is one of the reasons why it’s recommended to do less work more often (shorter launches) rather than shooting too many rays at once. => The method in 1.).

3.) The GLUTDisplay::CDProgressive mode is used for the OptiX SDK examples which do progressive accumulation. All of them render single images and accumulate.
Examples using it are the path_tracer for antialiasing and global illumination with Monte Carlo sampling, the cook example for motion blur, the progressive photon mapping example also for global illumination, and the whitted example for antialiasing.

And there we have it: The “whitted” example does exactly what you’re asking for. It’s a recursive ray tracer with no global illumination. When started with the default command line it does progressive antialiasing (m_adaptive_aa == true); when started with the -A option it does no antialiasing.
That means if you build that example as debug and single-step through it, you’ll discover all the elements required to do progressive antialiasing.

Thanks a ton for your input Detlef.

Actually, I’m trying to implement depth of field, motion blur (and also soft shadows) using OptiX, based on distribution ray tracing (Cook’s method). What I understand from the literature is that it uses jittered sampling per pixel, like 2x2 or 4x4 samples per pixel, with jitter based on random noise.
I’ve been going through the Cook sample and was having trouble figuring out how they do that. I guess the approach in the sample is as you described:
“The standard approach is to launch one primary ray per pixel as usual and just sample over sub-pixel locations, basically generating fragments, and accumulating the results.”
I’m referring to dof_camera.cu in the Cook sample. What I understand is that although there is only one ray per pixel in one launch, both the pixel area and the lens are sampled over multiple frames, and thus the accumulation gives the sampled (and averaged) result.
If yes, how do we control the number of samples per pixel? For example, how do I enforce a 2x2 or 4x4 subpixel grid?
Also, for simulating depth of field, do we need to sample both the lens and the pixel, or is sampling only the lens enough?
For motion blur, I was trying to simulate time-step sampling, but I believe the Cook sample again does that by accumulating and simply moving the object?
Lastly, I saw that they do pixel sampling in the Phong shading method as well. I assume that is for soft shadows and is not needed for DOF/motion blur?

Again thanks a lot for your help.

The dof_camera raygeneration program in the Cook example starts with the general code of a pinhole camera which jitters the fragment location on the pixel with the two random numbers jitter.xy and then changes that to a thin-lens camera where the ray origin and direction are distributed on a “circle of confusion”, actually a disk, with the two random numbers in jitter.zw.

So both the pixel area is stratified with the jitter.xy offsets, as well as the DOF ray origin and direction with the jitter.zw offsets. Both together generate a different image per launch, which is accumulated at the end of that ray generation program based on the frame_number variable.
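
For illustration, a thin-lens ray of this kind can be sketched on the CPU like this. It is not the code from dof_camera.cu: `thinLensRay`, `sampleDisk`, `focalDistance` and `apertureRadius` are hypothetical names, the camera basis U, V, W follows the usual pinhole convention, and the sketch assumes the focus point lies at a fixed distance along the pinhole direction.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };
struct Float2 { float x, y; };
struct Ray { Vec3 origin, direction; };

inline Vec3 operator+(Vec3 a, Vec3 b) { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
inline Vec3 operator-(Vec3 a, Vec3 b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
inline Vec3 operator*(float s, Vec3 a) { return { s * a.x, s * a.y, s * a.z }; }
inline float length(Vec3 a) { return std::sqrt(a.x * a.x + a.y * a.y + a.z * a.z); }
inline Vec3 normalize(Vec3 a) { return (1.0f / length(a)) * a; }

// Maps two uniform random numbers in [0, 1) to a point on a disk (polar mapping).
inline Float2 sampleDisk(float u, float v, float radius) {
    const float r = radius * std::sqrt(u);
    const float phi = 6.2831853f * v;
    return { r * std::cos(phi), r * std::sin(phi) };
}

// sx, sy: jittered screen position in [-1, 1] (the role of jitter.xy).
// lensU, lensV: random numbers for the lens sample (the role of jitter.zw).
inline Ray thinLensRay(Vec3 eye, Vec3 U, Vec3 V, Vec3 W, float sx, float sy,
                       float focalDistance, float apertureRadius,
                       float lensU, float lensV) {
    assert(focalDistance > 0.0f && apertureRadius >= 0.0f);
    // Pinhole ray through the jittered fragment location.
    const Vec3 pinholeDir = normalize(sx * U + sy * V + W);
    // Point that stays in focus regardless of the lens sample.
    const Vec3 focusPoint = eye + focalDistance * pinholeDir;
    // Move the origin onto the lens disk and aim at the focus point.
    const Float2 lens = sampleDisk(lensU, lensV, apertureRadius);
    const Vec3 origin = eye + lens.x * normalize(U) + lens.y * normalize(V);
    return { origin, normalize(focusPoint - origin) };
}
```

With `apertureRadius` set to 0 the function degenerates to the pinhole camera, so only the pixel jitter remains and you get antialiasing without depth of field.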

“If yes, how do we control the number of samples per pixel? For example, how do I enforce a 2x2 or 4x4 subpixel grid?”
There is one full image per launch; you simply launch 4 times to get your 2x2 sampling and 16 times to get your 4x4 sampling.
The more frames you accumulate, the better the antialiasing and DOF effect. They are coupled in this implementation.
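
If you want the launches to follow a strict n x n sub-pixel grid rather than purely random offsets, one possible way is to derive the grid cell from the frame number and jitter inside that cell. This is a sketch of that idea, not what the Cook sample does; `stratifiedOffset` and `launchesForGrid` are hypothetical helpers, and u, v are two random numbers in [0, 1) from your generator.

```cpp
#include <cassert>

struct Offset2 { float x, y; };

// Number of launches needed to visit every cell of an n x n grid once.
inline unsigned launchesForGrid(unsigned n) { return n * n; }

// Maps the running frame number to one cell of an n x n sub-pixel grid and
// jitters inside that cell. The result is the fragment offset within the
// pixel; after n * n launches every cell has been sampled once and the
// pattern repeats.
inline Offset2 stratifiedOffset(unsigned frame, unsigned n, float u, float v) {
    assert(n > 0);
    const unsigned cell = frame % (n * n);
    const unsigned cx = cell % n;  // column of the cell
    const unsigned cy = cell / n;  // row of the cell
    return { (static_cast<float>(cx) + u) / static_cast<float>(n),
             (static_cast<float>(cy) + v) / static_cast<float>(n) };
}
```

For a 2x2 grid you would stop after `launchesForGrid(2)` launches, for 4x4 after `launchesForGrid(4)`, or keep going and let each further pass refine the same strata.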

The Cook motion blur example does in fact change the white sphere location for each launch.
That means it renders a full image at a specific time step, which results in initial ghost images that fade out when accumulating over very many frames.
Rendering stochastic motion blur with each ray at a different time in a time interval is a little more involved.
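
To give an idea of what "each ray at a different time" means, here is a small CPU sketch under simplifying assumptions: the sphere moves linearly between two centers during the shutter interval, and each ray carries its own time. `sampleTime`, `centerAt` and `hitMovingSphere` are hypothetical names, and none of this is taken from the SDK.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

inline Vec3 operator-(Vec3 a, Vec3 b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Picks a time for one ray inside the shutter interval; u is a random number in [0, 1).
inline float sampleTime(float shutterOpen, float shutterClose, float u) {
    assert(shutterClose >= shutterOpen);
    return shutterOpen + u * (shutterClose - shutterOpen);
}

// Sphere center at a normalized time in [0, 1], assuming linear motion from c0 to c1.
inline Vec3 centerAt(Vec3 c0, Vec3 c1, float time01) {
    return { c0.x + time01 * (c1.x - c0.x),
             c0.y + time01 * (c1.y - c0.y),
             c0.z + time01 * (c1.z - c0.z) };
}

// Tests a ray (origin, unit-length direction) against the sphere at the
// position it has at this ray's time.
inline bool hitMovingSphere(Vec3 origin, Vec3 direction, Vec3 c0, Vec3 c1,
                            float radius, float time01) {
    const Vec3 oc = origin - centerAt(c0, c1, time01);
    const float b = dot(oc, direction);
    const float c = dot(oc, oc) - radius * radius;
    const float discriminant = b * b - c;
    if (discriminant < 0.0f) return false;  // ray misses the sphere
    const float root = std::sqrt(discriminant);
    return (-b - root) > 0.0f || (-b + root) > 0.0f;  // hit in front of the origin
}
```

Because neighboring pixels see the sphere at different times within a single launch, the result is noise instead of ghost images, and that noise averages out under the same accumulation scheme.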

The jittering in the phongShade() function is sampling a disk light to create soft shadows. That is sampling an area light, nothing to do with pixels there.
Though using the same random numbers jitter.xy for the pixel area sampling and the light area sampling is a random number correlation, which is generally bad in Monte Carlo ray tracers. In this specific case it won’t hurt much. Stuff will flicker anyway with this rendering method until the results have converged enough.
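
One way to avoid that correlation is to draw separate numbers for each sampling decision from the same generator. The following is a CPU sketch using the C++ standard library generator as a placeholder; `SampleSet`, `drawSamples` and `pointOnDiskLight` are hypothetical names and not part of the Cook sample.

```cpp
#include <cassert>
#include <cmath>
#include <random>

struct Float2 { float x, y; };

// One independent pair of random numbers per sampling decision.
struct SampleSet {
    Float2 pixel;  // sub-pixel jitter
    Float2 light;  // position on the area light
};

// Draws four numbers in sequence so the light sample does not reuse the
// numbers that were already consumed for the pixel jitter.
inline SampleSet drawSamples(std::mt19937& rng) {
    std::uniform_real_distribution<float> uniform(0.0f, 1.0f);
    SampleSet s;
    s.pixel = { uniform(rng), uniform(rng) };
    s.light = { uniform(rng), uniform(rng) };
    return s;
}

// Maps a pair of uniform random numbers to a point on a disk light of the
// given radius, expressed in the plane of the light (polar mapping).
inline Float2 pointOnDiskLight(float radius, Float2 u) {
    assert(radius >= 0.0f);
    const float r = radius * std::sqrt(u.x);
    const float phi = 6.2831853f * u.y;
    return { r * std::cos(phi), r * std::sin(phi) };
}
```

On the device the same idea applies: advance the per-thread random state once more before sampling the light instead of passing the pixel jitter into the shading function.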

Thanks a lot Detlef - your explanations are very helpful.