NVIDIA Optical Flow SDK - Motion Vector output

Hello there and good day!

The Optical Flow SDK (https://developer.nvidia.com/opticalflow-sdk) is really impressive work. I enjoyed seeing the FLO vectors and how the Turing GPU architecture accelerates motion vector estimation across multiple frames.

Had some questions about the SDK I was hoping someone could shed some light on:

  1. In the file Includes/nvOpticalFlowCommon.h, I see NV_OF_PERF_LEVEL_MAX listed alongside the slow/medium/fast modes. Is it possible to use this? What would it give us in terms of output?
  2. How can we retrieve the RAW 4x4 motion vector (before the slow/medium/fast processing is applied to the motion vector)?
  3. For the NV_OF_OUTPUT_VECTOR_GRID_SIZE, what is the NV_OF_OUTPUT_VECTOR_GRID_SIZE_MAX used for? Can we use this?
  4. I understand that the slow preset uses CUDA to post-process the RAW motion vector. So what does the medium/fast processing do to the RAW motion vector? Does fast mean that no post-processing has been applied to the motion vector (in which case this is the RAW motion vector from the hardware)?

Thank you so much for your time and I look forward to learning more about this fantastic SDK :D

Thank you for the feedback.

To answer your questions:

  1. Please refer to section 4 of the Optical Flow SDK’s Application Note, which explains what a preset is. Put simply, a preset is a distinct operating point that configures the hardware to trade off output quality against performance. The SLOW preset configures the hardware to run slower and give better-quality vectors, whereas the FAST preset works the other way round. As the name suggests, “NV_OF_PERF_LEVEL_MAX” is not a preset; it is the maximum enum value. Only 3 presets are exposed; Table 2 in the Application Note gives indicative quality and performance results for all of them.

  2. All the presets output raw motion vectors, one for each 4x4 cluster of pixels, called a “grid”.

  3. “NV_OF_OUTPUT_VECTOR_GRID_SIZE_MAX” is the maximum enum value and should not be passed to the driver.

  4. With the SLOW preset, after the NVOFA outputs the flow vectors, the driver does some further processing on them. The SLOW preset also uses a different motion search approach from the MEDIUM preset, one that gives better-quality flow vectors.
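
For anyone reading along, the point in answers 1 and 3 (a `_MAX` entry is a sentinel marking the end of the enum, not a usable setting) can be sketched as below. This is only an illustration: `PerfLevelSketch`, its enumerators, and `isUsablePreset` are hypothetical stand-ins, and the numeric values are not copied from nvOpticalFlowCommon.h. The check compares against each preset explicitly rather than using a range test, so it does not assume the real enum values are contiguous.

```cpp
// Stand-in for the SDK's preset enum. The names are hypothetical and the
// numeric values are illustrative; the real definitions live in
// Includes/nvOpticalFlowCommon.h.
enum PerfLevelSketch {
    PERF_UNDEFINED = 0,
    PERF_SLOW,
    PERF_MEDIUM,
    PERF_FAST,
    PERF_MAX  // sentinel marking the end of the enum, not a preset
};

// Returns true only for the three presets that are actually exposed.
constexpr bool isUsablePreset(PerfLevelSketch level) {
    return level == PERF_SLOW || level == PERF_MEDIUM || level == PERF_FAST;
}

// Compile-time checks: real presets pass, the sentinel and the
// undefined value are rejected before anything reaches the driver.
static_assert(isUsablePreset(PERF_SLOW), "SLOW is a usable preset");
static_assert(!isUsablePreset(PERF_MAX), "the _MAX sentinel must be rejected");
static_assert(!isUsablePreset(PERF_UNDEFINED), "UNDEFINED must be rejected");
```

The same kind of guard would apply to the output grid size enum, where the `_MAX` value is likewise not meant to be passed to the driver.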

It is generally recommended to go through the application note and the programming guide before starting to use the SDK.


Hi there,

Thanks for your replies! I’ve gone through the documentation as you suggested and have found more information there to deepen my understanding of how this algorithm works.

Currently I’m using a Turing GPU on my system and have also started to play around with the Video Codec SDK. As I’m trying to learn more about what it can do, I was wondering about some of the differences between the Optical Flow SDK and the NVENC ME-only mode:

  1. OFA returns a granularity of 4x4 but ME-only NVENC returns 1 MV per 8x8 even for a Turing GPU card. Does NVENC use OFA for motion estimation or a different accelerator?

  2. If OFA is used, is it possible to get a 4x4 output in the NVENC ME-only mode then?

Thank you for your time!

OFA uses an algorithm specifically targeted at finding the optical flow of blocks/pixels in the frames, whereas NVENC ME-only mode uses an algorithm that minimizes encoding cost (e.g. bits per pixel). These are different objectives, so the two produce different vectors.
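
To make the granularity difference from the question concrete, here is a small sketch that counts how many vectors cover a frame at one vector per 4x4 block (the OFA output described earlier in the thread) versus one per 8x8 block (what the question reports for NVENC ME-only mode). The names `GridDims`, `gridDims`, and `vectorCount` are hypothetical helpers, and rounding up to count partial blocks at the frame edges is an assumption for illustration, not behavior taken from either SDK's documentation.

```cpp
#include <cstdint>

// Number of block columns and rows needed to cover a frame.
struct GridDims {
    std::uint32_t cols;
    std::uint32_t rows;
};

// Rounds up so that partial blocks at the right and bottom edges are
// counted (an assumption of this sketch).
constexpr GridDims gridDims(std::uint32_t width, std::uint32_t height,
                            std::uint32_t blockSize) {
    return GridDims{(width + blockSize - 1) / blockSize,
                    (height + blockSize - 1) / blockSize};
}

// Total vectors for the frame, at one vector per block.
constexpr std::uint64_t vectorCount(std::uint32_t width, std::uint32_t height,
                                    std::uint32_t blockSize) {
    return static_cast<std::uint64_t>(gridDims(width, height, blockSize).cols) *
           gridDims(width, height, blockSize).rows;
}
```

For a 1920x1080 frame, `vectorCount(1920, 1080, 4)` is 480 x 270 = 129600 and `vectorCount(1920, 1080, 8)` is 240 x 135 = 32400, so the 4x4 grid carries four times as many vectors.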