I have been an avid user of the VPI library for some years now. I have recently upgraded to Jetpack 6.1 (VPI 3.2) from Jetpack 5.1.2 (VPI 2.3.8) and use the dense optical flow algorithm in my project.
I notice that the new version has many more parameters to set (vpiOpticalFlowDenseSetSGMParams()) and that the elapsed durations in the benchmarks are very different from those of the older version (especially for smaller grid sizes).
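For context on why the grid size matters, here is a minimal sketch of how the number of output flow vectors changes with the grid size. It assumes the output has one vector per gridSize x gridSize block, i.e. ceil(width / gridSize) by ceil(height / gridSize); that relation, the 1920x1080 frame size and the list of grid sizes are illustrative assumptions, not something confirmed in this thread. Whether the runtime actually scales with this count is likewise not confirmed here.

```python
def flow_grid_shape(width: int, height: int, grid_size: int) -> tuple[int, int]:
    """Return (columns, rows) of flow vectors for a frame of the given size.

    Assumption: one vector per grid_size x grid_size block, rounded up at
    the frame borders (integer ceiling division).
    """
    cols = -(-width // grid_size)
    rows = -(-height // grid_size)
    return cols, rows


if __name__ == "__main__":
    # Hypothetical frame size and grid sizes, for illustration only.
    for g in (1, 2, 4, 8):
        cols, rows = flow_grid_shape(1920, 1080, g)
        print(f"gridSize={g}: {cols} x {rows} = {cols * rows} vectors")
```

Halving the grid size roughly quadruples the number of vectors under this assumption.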
I have attached a screenshot of the 3.2 benchmarks and the 2.3.8 benchmarks. My own testing shows the same thing: in 2.3.8, Nsight Systems shows two optical flow operations (I guess one for each image) which take roughly 4 ms each, whereas now they take ~17 ms each with the same settings (HIGH quality, 4 levels and a gridSize of 2).
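To put those two figures side by side, here is a small sketch that only uses the per-operation timings quoted above (roughly 4 ms and ~17 ms). The helper names are made up for illustration, and the rate is just an upper bound that ignores everything else in the pipeline.

```python
# Per-operation timings quoted in this post, in milliseconds.
OLD_MS = 4.0   # VPI 2.3.8, "roughly 4 ms"
NEW_MS = 17.0  # VPI 3.2, "~17 ms"


def slowdown(old_ms: float, new_ms: float) -> float:
    """Return how many times longer the new operation takes than the old one."""
    return new_ms / old_ms


def max_rate_hz(ms_per_op: float) -> float:
    """Upper bound on operations per second if nothing else were running."""
    return 1000.0 / ms_per_op


if __name__ == "__main__":
    print(f"slowdown: {slowdown(OLD_MS, NEW_MS):.2f}x")
    print(f"old ceiling: {max_rate_hz(OLD_MS):.0f} ops/s")
    print(f"new ceiling: {max_rate_hz(NEW_MS):.0f} ops/s")
```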
I notice that there are many more parameters to tweak in the new version. Can you tell me what the defaults of these parameters are (this is not present in the documentation) and how they compare to the previous version?
Should I expect the new Dense Optical Flow algorithm to give much higher quality than before (so that I could reduce the quality parameter)? Also, should the algorithm perform better with diagonal movement, since we now have an “includeDiagonals” parameter in the VPIOpticalFlowDenseSGMParams struct?
Would it be possible to supply me with SGMParameters to give comparable results (in terms of speed and quality) to what I had before?
If you have any further questions, please do not hesitate to ask.
As you can see, these two optical flow APIs are designed for different hardware.
The version in VPI 2.x uses the NVENC, which typically leverages the flow estimation function in the hardware encoder.
In VPI 3.x, the API instead uses a new dedicated hardware called OFA.
It’s expected that OFA can generate results with higher accuracy.
To balance performance and accuracy, you can adjust the VPIOpticalFlowQuality value:
Thank you for your response and thank you for the link to the OFA chip.
However, can you please confirm one point? You mentioned that the 2.x library is designed for the NVENC hardware.
I appreciate that the benchmark in my screenshot mentions the NVENC (which I have indeed used on an Xavier NX before for optical flow). However, surely the OFA is used when I specify VPI_BACKEND_OFA when I execute the dense optical flow function in a 2.x version of the library on an Orin NX?
When specifying VPI_BACKEND_OFA when executing vpiSubmitOpticalFlowDense, I get no errors and see execution times similar to those in the benchmark and to the figures from my previous post (roughly 4 ms per operation in 2.3.8, whereas now they take ~17 ms per operation with the same settings: HIGH quality, 4 levels and a gridSize of 2).
Since my previous post, I have found that I can indeed use lower quality settings to achieve faster results with no apparent degradation in results. This implies that the new implementation is better than the previous one.
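For anyone who wants to repeat this comparison, below is a minimal sketch of the kind of timing sweep I mean. It deliberately does not call VPI: `run_flow` is a hypothetical callable that you would replace with your own function that submits the dense optical flow with the given quality and then synchronizes the stream (submission is asynchronous, so timing without a sync would only measure the submit call). The `fake_run_flow` stand-in and its sleep durations are invented purely so the sketch runs on its own.

```python
import time
from statistics import mean


def time_call(fn, *, warmup=2, repeats=10):
    """Return the mean wall-clock duration of fn() in milliseconds."""
    for _ in range(warmup):
        fn()  # discard the first runs (allocation, caches, clock ramp-up)
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return mean(samples)


def sweep(run_flow, qualities=("LOW", "MEDIUM", "HIGH")):
    """Time run_flow(quality) for each quality label; returns {label: ms}."""
    return {q: time_call(lambda q=q: run_flow(q)) for q in qualities}


def fake_run_flow(quality):
    """Hypothetical stand-in for 'submit optical flow, then sync the stream'."""
    time.sleep({"LOW": 0.001, "MEDIUM": 0.002, "HIGH": 0.004}[quality])


if __name__ == "__main__":
    for label, ms in sweep(fake_run_flow).items():
        print(f"{label}: {ms:.2f} ms")
```

Timing alone says nothing about result quality, so the output flow still has to be compared separately for each setting.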
However, I would appreciate it if you could give me a technical answer as to what advantages I might expect from the new version and how the SGM parameters can be tweaked to get different results.
What I am looking for really is more context on the new implementation of this function and the new parameters available. Even giving the current default settings (which are not mentioned in the documentation) for the SGM parameters and a small amount of explanation would help a lot.
Sorry that my previous message was not clear enough.
We added OFA support starting with VPI 2.3.
The support depends on the hardware.
For the Orin series, as Orin’s NVENC doesn’t support dense optical flow, VPI uses the dedicated OFA hardware instead.
Please find the API documentation for the SGM parameters below: