VPI Pyramidal LK Optical Flow Poor Tracking Results

I created a class wrapping the VPI PLK (pyramidal Lucas-Kanade) functions, and I am disappointed by the tracking results. When tracking a sparse set of points from an image to itself (so, zero motion, with useInitialFlow set to 0), every single feature point is tracked successfully. However, if I warp the target image even slightly (small rotations, translations, shear, etc.), many of the feature points end up far from their starting positions and so give garbage solutions. The behavior is the same with both the CPU and CUDA backends.

This first image shows the tracked points color coded by angle.

This second image shows the inlier set from an affine transform.

This third image shows the two input images and the diff after warping:

Has anyone else experienced this? Are there known bugs in this library? Or, maybe more likely, any thoughts on what I am doing wrong?

// Tracks pts0 (features in image 0) into image 1 and writes the results to
// pts1 and status. vpi_check_status is my helper that throws on a VPI error.
static void vpi_track(const VPIImage& i0_wrapper, const VPIImage& i1_wrapper,
                      VPIImage& gray0, VPIImage& gray1, VPIPyramid& pyr0,
                      VPIPyramid& pyr1, VPIArray& pts0, VPIArray& pts1,
                      VPIArray& status, VPIPayload& plk,
                      VPIOpticalFlowPyrLKParams* plk_params, VPIStream& stream,
                      VPIBackend backend) {
  try {
    // Convert both inputs to grayscale
    vpi_check_status(vpiSubmitConvertImageFormat(stream, backend, i0_wrapper,
                                                 gray0, nullptr));
    vpi_check_status(vpiSubmitConvertImageFormat(stream, backend, i1_wrapper,
                                                 gray1, nullptr));

    // Fill the Gaussian pyramids from the grayscale images
    vpi_check_status(vpiSubmitGaussianPyramidGenerator(stream, backend, gray0,
                                                       pyr0, VPI_BORDER_CLAMP));
    vpi_check_status(vpiSubmitGaussianPyramidGenerator(stream, backend, gray1,
                                                       pyr1, VPI_BORDER_CLAMP));

    // Run optical flow. The backend argument is 0 here rather than `backend`;
    // as I understand it, 0 reuses the backend the payload was created with.
    vpi_check_status(vpiSubmitOpticalFlowPyrLK(stream, 0, plk, pyr0, pyr1, pts0,
                                               pts1, status, plk_params));

    // Wait for processing to finish
    vpi_check_status(vpiStreamSync(stream));
  } catch (const std::exception& e) {
    // Errors are only logged, so the caller cannot tell that tracking failed
    LOG(ERROR) << e.what();
  }
}

I can paste more code if needed. But again, everything is “great” if there is zero motion between image0 and image1.
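Because the second image is produced by a known warp, the tracker output can be scored against ground truth rather than judged by eye. The following is a minimal sketch, not part of the original code: `Point2f`, `Affine2x3`, `apply`, and `count_within_tolerance` are hypothetical names, and it assumes the affine matrix maps image-0 coordinates to image-1 coordinates (invert it first if the warp was specified the other way round) and that the tracked points have already been copied out of the VPI arrays into plain vectors.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical plain-data types; not VPI types.
struct Point2f { float x, y; };

// Row-major 2x3 affine matrix: [a b tx; c d ty].
struct Affine2x3 { float a, b, tx, c, d, ty; };

// Where a point from image 0 should land in image 1 under the known warp.
inline Point2f apply(const Affine2x3& m, const Point2f& p) {
  return {m.a * p.x + m.b * p.y + m.tx, m.c * p.x + m.d * p.y + m.ty};
}

// Counts tracked points that land within tol_px of their expected position.
inline std::size_t count_within_tolerance(const std::vector<Point2f>& pts0,
                                          const std::vector<Point2f>& tracked,
                                          const Affine2x3& warp, float tol_px) {
  assert(pts0.size() == tracked.size());
  std::size_t good = 0;
  for (std::size_t i = 0; i < pts0.size(); ++i) {
    const Point2f expected = apply(warp, pts0[i]);
    const float err =
        std::hypot(tracked[i].x - expected.x, tracked[i].y - expected.y);
    if (err <= tol_px) ++good;
  }
  return good;
}
```

Comparing this count between the CPU backend, the CUDA backend, and OpenCV for the same warp gives a single number per run, which makes parameter sweeps easier to compare.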

Hmm, it seems that the parameters don’t exactly match OpenCV’s, so more pyramid levels and iterations are needed than expected. I’m also having no luck with VPI tracking across time, that is, between images taken from the same vantage point at different times. I first run an edge magnitude operation on the images to avoid gradient reversal issues. OpenCV PLK responds well to this preprocessing step and usually tracks successfully:


and

VPI PLK fails miserably at this task, no matter how I parameterize it. My guess for the discrepancy is that epsilon in OpenCV is a threshold on feature motion from one iteration to the next, whereas in VPI it is a threshold on the average L1 difference over the feature window. I think the solution is to change the meaning of epsilon in VPI to match OpenCV and to add a new parameter that allows “successful” tracking of features with much higher average L1 differences, something like maxAvgL1Error.
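To make the suspected difference concrete, here is a minimal sketch of the two stopping rules as standalone predicates. The displacement rule follows OpenCV's documented meaning of criteria.epsilon (stop when the search window moves by less than epsilon). The residual rule is only the guess stated above about VPI, not confirmed behavior, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Displacement rule: converged when the last update moved the window by
// less than epsilon pixels.
inline bool converged_by_step(float dx, float dy, float epsilon) {
  return std::hypot(dx, dy) < epsilon;
}

// Mean absolute (L1) difference between two equally sized patches.
inline float mean_abs_diff(const std::vector<float>& a,
                           const std::vector<float>& b) {
  assert(a.size() == b.size() && !a.empty());
  float sum = 0.0f;
  for (std::size_t i = 0; i < a.size(); ++i) sum += std::fabs(a[i] - b[i]);
  return sum / static_cast<float>(a.size());
}

// Residual rule (assumed, see lead-in): converged when the average L1
// difference over the window drops below epsilon.
inline bool converged_by_residual(const std::vector<float>& a,
                                  const std::vector<float>& b,
                                  float epsilon) {
  return mean_abs_diff(a, b) < epsilon;
}
```

If two patches are perfectly aligned but differ by a constant brightness offset, as can happen between images taken at different times, the update step goes to zero and the displacement rule reports convergence, while the residual stays at the offset value and the residual rule never does. That is consistent with the failure described above, though it does not prove the cause.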

I’d also like to complain about the name windowDimension. It’s odd to me that it can be an even number. Is it the radius (half-width) or the diameter (full width) of the square window? Please improve the documentation.
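The ambiguity matters because the two readings cover very different pixel ranges. A small sketch of both interpretations follows; neither is claimed to be what VPI does, the function names are hypothetical, and the choice to put the extra pixel of an even-sized window on the positive side is an arbitrary assumption.

```cpp
#include <cassert>
#include <utility>

// Reading 1: the value is the full side length. Returns the inclusive
// offset range [first, second] around the feature. An even side has no
// center pixel, so the range is necessarily asymmetric.
inline std::pair<int, int> offsets_if_side_length(int side) {
  assert(side > 0);
  return {-((side - 1) / 2), side / 2};
}

// Reading 2: the value is a radius, giving a (2 * radius + 1) wide window.
inline std::pair<int, int> offsets_if_radius(int radius) {
  assert(radius >= 0);
  return {-radius, radius};
}
```

For a value of 4, the first reading covers offsets -1 to 2 (4 pixels per side, off-center) and the second covers -4 to 4 (9 pixels per side), so the same setting could mean a window about five times larger in area.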

We had some issues with the optical flow as well. Increasing the number of pyramid levels to 3 or 4 gave much better results. Also, we use VPI 1.1; I wonder whether the performance is better in newer versions.
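A rough way to pick the number of levels is the common rule of thumb that Lucas-Kanade can recover a displacement of about one window radius at the coarsest level, and that each extra level multiplies that reach by the inverse of the pyramid scale. The sketch below encodes only that heuristic; `levels_needed` is a hypothetical helper, and the window radius, the default scale of 0.5, and the heuristic itself are assumptions rather than anything taken from VPI.

```cpp
#include <cassert>

// Estimates how many pyramid levels are needed so that a displacement of
// displacement_px at full resolution shrinks to at most window_radius_px
// at the coarsest level. Heuristic only.
inline int levels_needed(float displacement_px, float window_radius_px,
                         float scale = 0.5f) {
  assert(window_radius_px > 0.0f && scale > 0.0f && scale < 1.0f);
  int levels = 1;
  float reach = window_radius_px;  // reach at full resolution with 1 level
  while (reach < displacement_px) {
    reach /= scale;  // one more level extends the reach by 1 / scale
    ++levels;
  }
  return levels;
}
```

With a 5-pixel radius and a scale of 0.5, a 20-pixel displacement calls for 3 levels under this estimate, which is in line with the 3 or 4 levels reported above.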

I’m using 2.0, so it seems like not much has changed. It’s a shame there’s no GitHub issue tracker or similar for this library.

@lee.schloesser - Forwarding this to the VPI team to investigate this matter. We will get back to you soon. Thanks for your patience!