Basic VPI questions

Hi, I’m looking into using the VPI Optical Flow LK Tracker. I went through the example on the site and I have some questions.
First, my requirements: the tracker has to run asynchronously, and without downloading data from or uploading data to the GPU. I’m working in a Linux 22 environment.

  1. First of all, is there an easier way to install the vpi2-dev package? So far I have installed it from the repository for Jetson, but I’m working on a standard PC, and the installation requires me to change system files afterwards because it is designed for CUDA version 11.4.
  2. Will the 4.5x speedup over OpenCV quoted for this algorithm still apply on a PC?
  3. Will the algorithm support BGR images? It seems to support only greyscale right now, while OpenCV supports full color.
  4. Does the algorithm (i.e., the function vpiSubmitOpticalFlowPyrLK) work asynchronously?
  5. Does VPIImage use the same format as OpenCV's GpuMat (rows, cols, channels in that order, with a pitch between rows)?
  6. It is unclear to me how to access the data in a VPIArray. Do you need to use VPIArrayDataRec to create a data pointer to it? Is the array just ordinary device memory when using the CUDA backend?
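
To make the layout in question 5 concrete, this is a minimal sketch of the addressing it describes: row-major, interleaved channels, with a pitch (row step in bytes) that may be larger than the packed row width. The function name pitchedOffset is made up for illustration and is not part of VPI or OpenCV. Whether VPIImage actually follows this layout is exactly what the question asks.

```cpp
#include <cstddef>

// Byte offset of channel `ch` of pixel (row, col) in a pitched, interleaved image.
// pitchBytes is the distance between the starts of two consecutive rows; it can
// exceed cols * channels * bytesPerChannel when rows are padded for alignment.
// Hypothetical helper for illustration only (not a VPI or OpenCV function).
std::size_t pitchedOffset(std::size_t row, std::size_t col, std::size_t ch,
                          std::size_t channels, std::size_t bytesPerChannel,
                          std::size_t pitchBytes) {
    return row * pitchBytes + (col * channels + ch) * bytesPerChannel;
}
```

For example, with 3 one-byte channels and a 512-byte pitch, the second channel of pixel (row 2, col 5) sits at 2 * 512 + (5 * 3 + 1) = 1040 bytes from the start of the buffer.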


Update: I’m trying to wrap a cv::cuda::GpuMat with a VPIArray, but I keep getting an invalid argument error on vpiArrayCreateWrapper:

VPIArray VpiLKTrackerGPU::CreateVpiArrayViewer(cv::cuda::GpuMat &mat) {
    VPIArray vpiArray;
    VPIArrayData vpiArrayData;
    vpiArrayData.bufferType = VPI_ARRAY_BUFFER_CUDA_AOS;
    // Device pointer owned by the GpuMat; the wrapper does not copy it.
    vpiArrayData.buffer.aos.data = mat.cudaPtr();
    switch (mat.type()) {
    case CV_32FC2:
        vpiArrayData.buffer.aos.type = VPI_ARRAY_TYPE_KEYPOINT_F32;
        vpiArrayData.buffer.aos.strideBytes = sizeof(cv::Vec2f);
        break;
    case CV_8UC1:
        vpiArrayData.buffer.aos.type = VPI_ARRAY_TYPE_U8;
        vpiArrayData.buffer.aos.strideBytes = 1;
        break;
    default:
        throw std::invalid_argument("unsupported GpuMat type");
    }
    // One array element per GpuMat row.
    vpiArrayData.buffer.aos.capacity = mat.rows;
    vpiArrayData.buffer.aos.sizePointer = &mat.rows;

    CHECK_STATUS(vpiArrayCreateWrapper(&vpiArrayData, VPI_BACKEND_CUDA, &vpiArray));
    return vpiArray;
}
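
One assumption baked into the function above: setting strideBytes to the element size only describes the buffer correctly if the GpuMat elements are tightly packed, i.e. there is no pitch padding between rows. A minimal sketch of that check follows, assuming the arguments come from mat.rows, mat.cols, mat.step and mat.elemSize(). The name isTightlyPacked is hypothetical and is not a VPI or OpenCV function, and this check is not claimed to be the cause of the invalid argument error.

```cpp
#include <cstddef>

// Hypothetical helper (not part of VPI or OpenCV): returns true when a
// rows x cols buffer with the given row step can be read as one flat array
// of rows * cols elements spaced exactly elemSizeBytes apart.
bool isTightlyPacked(std::size_t rows, std::size_t cols,
                     std::size_t stepBytes, std::size_t elemSizeBytes) {
    if (rows == 0 || cols == 0) {
        return false;  // nothing to wrap
    }
    if (rows == 1) {
        return true;   // a single row has no inter-row padding to worry about
    }
    // With several rows, any padding at the end of a row breaks the flat layout.
    return stepBytes == cols * elemSizeBytes;
}
```

A 100 x 1 column of 8-byte keypoints with an 8-byte step passes the check; the same column with a 512-byte step does not, and would need strideBytes set to the step (or a copy into a packed buffer) instead.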


The documentation for vpiArrayCreateWrapper is confusing here: it lists VPI_ARRAY_BUFFER_CUDA_AOS as a valid buffer type, but the text itself only talks about wrapping host memory, and the same code works when I pass a regular cv::Mat. So my question is: am I missing something, and if not, is it possible at all to create a VPIArray wrapper around device memory?