How do you calculate flow vectors for the denoiser?

Hello,

I’d like to use the new denoiser model features, and I am curious, how do you approach calculation of the map of flow vectors in an arbitrary scene?

Thanks for hints!

Hi Robert,

Perhaps a typical structure for computing flow vectors in OptiX is to render them to a buffer, using a closest-hit program that writes motion vectors into your payload rather than (or in addition to) lighting and material shading results. The closest-hit program might query the primitive ID, the vertex and/or transform motion, and the hit uv location, and then evaluate the 3D location of the same uv coordinate on the same primitive one frame in the past, which gives the previous location of the hit point. This means making your animation data available to your closest-hit program somehow (and with a complex hierarchy it could mean evaluating the motion of all the moving transforms in that hierarchy). The difference between the hit point on the current frame and the same surface point evaluated one frame in the past then gives you the motion vector in 3D world space. You might pass that world-space motion vector back to raygen, transform it there into a 2D image-space vector, and save the result to your motion vector buffer, which is then passed to the denoiser.
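As a rough illustration of that last step, here is a minimal C++ sketch that turns a previous/current pair of world-space positions into a 2D flow vector in pixels. `Camera`, `projectToPixel`, and `imageSpaceFlow` are hypothetical names, not OptiX API. The camera layout (an eye point plus mutually orthogonal U, V, W vectors) is an assumption loosely modelled on the OptiX SDK samples, and the sign convention (current minus previous) is also an assumption that should be checked against the denoiser documentation for your OptiX version. The same math could run in raygen.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };
struct Vec2 { float x, y; };

static Vec3 sub(Vec3 a, Vec3 b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Hypothetical pinhole camera (an assumption for this sketch):
// W points from the eye to the image plane center, U and V span half the
// image plane horizontally and vertically. U, V, W are mutually orthogonal.
struct Camera {
    Vec3 eye;
    Vec3 U, V, W;
    int width, height;
};

// Project a world-space point to continuous pixel coordinates.
Vec2 projectToPixel(const Camera& cam, Vec3 p) {
    Vec3 d = sub(p, cam.eye);
    float depth = dot(d, cam.W) / dot(cam.W, cam.W);
    // Sketch only: a real renderer would mark the pixel as having no valid
    // history instead of asserting when the point is behind the camera.
    assert(depth > 0.0f);
    float ndcX = dot(d, cam.U) / dot(cam.U, cam.U) / depth;  // in [-1, 1] when on screen
    float ndcY = dot(d, cam.V) / dot(cam.V, cam.V) / depth;
    return { (ndcX * 0.5f + 0.5f) * cam.width,
             (ndcY * 0.5f + 0.5f) * cam.height };
}

// 2D flow in pixels, assumed convention: current pixel minus previous pixel.
Vec2 imageSpaceFlow(const Camera& prevCam, const Camera& currCam,
                    Vec3 prevPos, Vec3 currPos) {
    Vec2 a = projectToPixel(prevCam, prevPos);
    Vec2 b = projectToPixel(currCam, currPos);
    return { b.x - a.x, b.y - a.y };
}
```

Projecting both endpoints, rather than transforming the 3D vector directly, accounts for perspective and also covers the case where the camera itself moves between frames.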

It may also be possible, depending on how you design your animation system, for the primitives to carry information about which direction they’re moving instantaneously, so you may not need to evaluate the position at a previous frame; you might instead have direct information about which direction a hit point is moving. Between computing instantaneous motion vectors and computing motion deltas between two frames, neither one is more correct than the other, but I believe it’s more common to use the delta between two frames. One benefit of using the delta is that if you use the motion vectors to “reproject” the pixels from one frame to another, you’ll more closely match the target frame than if you try to use instantaneous motion vectors. Some people argue convincingly that using delta motion vectors will give you a cleaner motion blur result than using instantaneous velocity motion vectors. Some reasons to use instantaneous vectors include that they might be easier or faster to compute, and the resulting quality might be perfectly sufficient.
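The reprojection argument can be illustrated with a toy C++ sketch, assuming a point that moves on a circle. All names here (`positionAt`, `deltaMotion`, `instantaneousMotion`, `reprojectionError`) are hypothetical and the animation is made up for illustration. It compares how well each kind of motion vector recovers the previous position when subtracted from the current one.

```cpp
#include <cassert>
#include <cmath>

struct Vec2 { float x, y; };

// Toy animation (an assumption): a point on a circle of the given radius,
// rotating at angular speed omega in radians per second.
Vec2 positionAt(float radius, float omega, float t) {
    return { radius * std::cos(omega * t), radius * std::sin(omega * t) };
}

// Delta motion vector: current position minus previous position.
Vec2 deltaMotion(float radius, float omega, float tPrev, float tCurr) {
    Vec2 a = positionAt(radius, omega, tPrev);
    Vec2 b = positionAt(radius, omega, tCurr);
    return { b.x - a.x, b.y - a.y };
}

// Instantaneous motion vector: velocity at tCurr scaled by the frame time dt.
Vec2 instantaneousMotion(float radius, float omega, float tCurr, float dt) {
    assert(dt > 0.0f);
    float s = radius * omega * dt;
    return { -s * std::sin(omega * tCurr), s * std::cos(omega * tCurr) };
}

// Distance between the reprojected point (current minus motion) and the
// true previous position.
float reprojectionError(Vec2 curr, Vec2 motion, Vec2 truePrev) {
    float ex = (curr.x - motion.x) - truePrev.x;
    float ey = (curr.y - motion.y) - truePrev.y;
    return std::sqrt(ex * ex + ey * ey);
}
```

For curved motion like this, the delta vector lands on the previous position up to rounding, while the instantaneous vector misses by an amount that grows with the frame time. For straight-line motion at constant speed the two agree.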

You could do all this as a separate pipeline and a separate launch that renders only motion vectors. Or you could, if it makes sense, output motion vectors to a separate buffer during your existing render launch. This depends on how your app is structured and what makes the most sense. If I were writing a production path tracer, I might aim for the separate pipelines approach, because I might not always need motion vectors, and because I would likely plan to use 1 sample per pixel for motion vectors even if I used many (hundreds of) samples per pixel for the beauty (render) pass.

Quite a few renderers are built to output AOVs (or buffers of arbitrary data) during rendering that can be used for compositing and effects, and motion vectors are commonly supported. You can probably find some additional hints by poking around the documentation for a few different renderers; they do talk about some of the details of how they compute (or sometimes approximate) the motion vectors.

Does that help? Is that the appropriate level of info you were hoping for?


David.

Thanks David!

I was thinking of something like motion deltas, for which I already have some infrastructure. A separate pipeline could be a good idea, since everything together is becoming more and more complex… Anyway, it is a bit of code to write. Especially matching best uv pairs since hits are independent in each frame.

Thank you for your answer. I was hoping there is some magic trick to solve all of this easier, like ML model estimating deltas based on 2 frames. ;) That might actually be doable, e.g. by training a denoiser stacked on top of a delta estimator, with respective cost functions applied simultaneously to the outputs of both models…

Thanks again!
Robert

> Especially matching best uv pairs since hits are independent in each frame

Maybe this is already clear, but just in case: if I understand that correctly, you can think of it as only a uv hit on the current frame, and you’re calculating its world space position for two frames, the current frame and the previous frame, even though there might not be a hit at that uv location on the previous frame. In fact, this uv location might not even be visible on the previous frame! (It could be occluded by itself or a different object, or it could be off-camera.)
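A minimal C++ sketch of that idea, assuming triangle geometry whose vertices are already available in world space for both frames (that is, any transform motion has been applied). `pointOnTriangle` and `worldSpaceMotion` are hypothetical helper names. The barycentric weighting (1-u-v, u, v) is assumed to match what `optixGetTriangleBarycentrics` reports for built-in triangles, which is worth verifying for your setup.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

// Evaluate a point on a triangle from barycentrics (u, v), using the
// weights (1-u-v, u, v) for vertices 0, 1, 2 (assumed convention).
Vec3 pointOnTriangle(const Vec3 verts[3], float u, float v) {
    assert(u >= 0.0f && v >= 0.0f && u + v <= 1.0f + 1e-5f);
    float w = 1.0f - u - v;
    return { w * verts[0].x + u * verts[1].x + v * verts[2].x,
             w * verts[0].y + u * verts[1].y + v * verts[2].y,
             w * verts[0].z + u * verts[1].z + v * verts[2].z };
}

// World-space motion of the surface point hit on the current frame:
// the same (u, v) is evaluated on the current and the previous vertices.
// No ray is traced against the previous frame, so it does not matter
// whether the point was visible there.
Vec3 worldSpaceMotion(const Vec3 currVerts[3], const Vec3 prevVerts[3],
                      float u, float v) {
    Vec3 c = pointOnTriangle(currVerts, u, v);
    Vec3 p = pointOnTriangle(prevVerts, u, v);
    return { c.x - p.x, c.y - p.y, c.z - p.z };  // current minus previous
}
```

Only the hit from the current frame is needed, so there are no uv pairs to match between frames.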

> I was hoping there is some magic trick to solve all of this easier, like ML model estimating deltas based on 2 frames. ;)

Oh, of course there actually are some magic tricks; I should have mentioned them. You could use an ML model, or even just a plain optical flow method. These do exist if you want to construct motion vectors from two images. They can be finicky and not the highest quality, but it might be really easy to try. We even have an NVIDIA SDK for optical flow!

https://developer.nvidia.com/blog/opencv-optical-flow-algorithms-with-nvidia-turing-gpus/


David.

Optical flow in OpenCV looks interesting. I forgot that the hardware has this built in. But it looks like making flow from renderer data will be much more precise. Ok, I get the idea with uv’s… I’ll also play with the SDK example to see how much the flow improves denoising.
