vxSetRemapPoint operation very slow for large remaps

Hi,

I want to fill a large remap (typically 4K image size) with remap points. To do this you need to use the vxSetRemapPoint function, which takes around 1.5 seconds on the Jetson TX2. Is there a faster method for filling the points (e.g. functions like vxCopyRemapPatch and vxMapRemapPatch)?

The calculation of the warp x- and y-maps (using GPU-accelerated OpenCV functions) takes around 170 ms on the Jetson TX2.

This is a code snippet for filling the remap (input_width_ = 3840, input_height_ = 2160):

void CreateRemapTable()
{
    vx_remap_ = vxCreateRemap(vx_context_, input_width_, input_height_, output_width_, output_height_);
    // Initialize the remap table.
    for (int dst_y = 0; dst_y < output_height_; dst_y++) {
        for (int dst_x = 0; dst_x < output_width_; dst_x++) {
            int src_x = dst_x + crop_left;
            int src_y = dst_y + crop_top;
            vxSetRemapPoint(vx_remap_, dst_x, dst_y,
                            (vx_float32)x_warp_map_->at<float>(src_y, src_x),
                            (vx_float32)y_warp_map_->at<float>(src_y, src_x));
        }
    }
}
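
If vxMapRemapPatch is available in your OpenVX version, I imagine the fill would look something like the sketch below. It is self-contained for illustration: the Coord2Df struct stands in for the spec's vx_coordinates2df_t, and base/stride_y stand in for the pointer and row stride the map call would return.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for the OpenVX 1.2 vx_coordinates2df_t: one (x, y)
// source coordinate per destination pixel.
struct Coord2Df {
    float x;
    float y;
};

// Fill a whole row-strided remap patch in one pass. In real code,
// `base` and `stride_y` (bytes per row) would be the pointer and
// stride returned by vxMapRemapPatch(); here they are a plain
// buffer so the sketch is self-contained.
void FillRemapPatch(void* base, std::size_t stride_y,
                    int out_w, int out_h,
                    const std::vector<float>& x_map,
                    const std::vector<float>& y_map,
                    int map_w, int crop_left, int crop_top) {
    for (int dst_y = 0; dst_y < out_h; ++dst_y) {
        Coord2Df* row = reinterpret_cast<Coord2Df*>(
            static_cast<std::uint8_t*>(base) + dst_y * stride_y);
        const int src_y = dst_y + crop_top;
        for (int dst_x = 0; dst_x < out_w; ++dst_x) {
            const int src_x = dst_x + crop_left;
            row[dst_x].x = x_map[src_y * map_w + src_x];
            row[dst_x].y = y_map[src_y * map_w + src_x];
        }
    }
}
```

That would turn ~8 million API calls into one sequential memory write, which should remove the per-point call overhead.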

Hi,

Could you share more details about your use case?

If you just want to shift the image by a constant offset, you don’t need to refill the map table each time.
vxSetRemapPoint is a point-level read/write command and is expected to be slow.

It’s recommended to check the homogeneous transformation implementation in VisionWorks:
{VisionWorks folder}/3rdparty/eigen/Eigen/src/Eigen2Support/Geometry/Transform.h

Thanks.

Hi,

The use case is a virtual camera PTZ system. First I need to cylindrically warp the 4K camera images onto a cylinder, and after that warp part of the cylindrically warped image back onto an output image. The remap matrix for this second warp changes frequently, since the vPTZ camera changes its angles (and thus the rotation matrix) multiple times per second.
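
For context, the pinhole-to-cylinder geometry I’m using is roughly the following (a sketch of the standard inverse cylindrical model; the focal-length and principal-point conventions here are assumptions, not my exact code):

```cpp
#include <cassert>
#include <cmath>

// Inverse cylindrical mapping: for a cylinder-image pixel (xc, yc),
// return the source pixel (x, y) in the original pinhole image.
// f is the focal length in pixels, (cx, cy) the principal point.
void CylinderToImage(float xc, float yc, float f, float cx, float cy,
                     float* x, float* y) {
    const float theta = (xc - cx) / f;   // azimuth angle on the cylinder
    const float h     = (yc - cy) / f;   // normalized height on the cylinder
    // Project the cylinder point (sin theta, h, cos theta) back
    // onto the image plane at depth f.
    *x = f * std::tan(theta) + cx;
    *y = f * h / std::cos(theta) + cy;
}
```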

Regards,
Boris.

Hi,

An alternative is to use a homogeneous transform.

Is it possible to approximate your warping matrix with a 3x3 transform matrix?
Or do you need pixel-level warping accuracy for your use case?
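
For reference, a planar homogeneous warp maps every output pixel through a single matrix (3x3 in the planar case); a sketch, with the row-major layout as an assumption:

```cpp
// Apply a 3x3 homography H (row-major) to pixel (u, v),
// writing the transformed point to (*x, *y).
void ApplyHomography(const float H[9], float u, float v,
                     float* x, float* y) {
    const float w = H[6] * u + H[7] * v + H[8];  // projective divisor
    *x = (H[0] * u + H[1] * v + H[2]) / w;
    *y = (H[3] * u + H[4] * v + H[5]) / w;
}
```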

Thanks.

Hi,

No, unfortunately I cannot use a homogeneous transform, since I need to stitch the two input camera images accurately. They’re placed at an angle of approximately 90 degrees to each other, so a cylindrical (or spherical) warp is needed.

Hi,

But I think you can create these tables at the beginning,
assuming there is a limited set of angles to map.

Thanks.

I am creating the tables for the two fixed cameras at the beginning, as you suggest, but the actual output of the system is a “virtual” camera that can pan, tilt and zoom. While the camera is moving (e.g. panning), the matrix for the destination warp has to change every frame (e.g. 30 times per second).
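
Concretely, per frame each output pixel of the virtual camera has to go through something like the following (a geometry sketch, not my actual code; the frame and intrinsics conventions are assumptions):

```cpp
#include <cassert>
#include <cmath>

// Per-frame virtual-PTZ remap: map an output pixel (u, v) of the
// virtual camera (focal fv, principal point cvx/cvy) through a
// rotation R (row-major 3x3, virtual-camera frame to panorama
// frame) onto the cylindrical panorama (focal fp, center cpx/cpy).
void VirtualToPanorama(float u, float v,
                       float fv, float cvx, float cvy,
                       const float R[9],
                       float fp, float cpx, float cpy,
                       float* xp, float* yp) {
    // Back-project the output pixel to a ray in the virtual camera.
    const float rx = (u - cvx) / fv, ry = (v - cvy) / fv, rz = 1.0f;
    // Rotate the ray into the panorama frame.
    const float wx = R[0] * rx + R[1] * ry + R[2] * rz;
    const float wy = R[3] * rx + R[4] * ry + R[5] * rz;
    const float wz = R[6] * rx + R[7] * ry + R[8] * rz;
    // Convert the ray to cylindrical panorama coordinates.
    const float theta = std::atan2(wx, wz);
    const float h     = wy / std::sqrt(wx * wx + wz * wz);
    *xp = fp * theta + cpx;
    *yp = fp * h + cpy;
}
```

Only R changes per frame, but every output pixel still has to be recomputed, which is why filling the table point by point is the bottleneck.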

Hi,

Sorry for the late reply.

This API is not suitable for your use case.
Maybe you can try to implement a CUDA kernel instead.

Or try our NPP library: https://docs.nvidia.com/cuda/npp/group__image__remap.html
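
For illustration, the per-pixel work such a kernel (or a remap call) performs is just a gather with interpolation. A CPU sketch with bilinear sampling (the interpolation mode and single-channel float format are assumptions):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Sample a single-channel float image of size w x h at the
// non-integer source position (sx, sy) with bilinear interpolation,
// clamping to the image border. A CUDA kernel would run this per
// output pixel, one thread each, after reading (sx, sy) from the map.
float RemapBilinear(const float* src, int w, int h, float sx, float sy) {
    const int x0 = (int)std::floor(sx);
    const int y0 = (int)std::floor(sy);
    const float ax = sx - x0, ay = sy - y0;  // interpolation weights
    auto at = [&](int x, int y) {
        x = std::min(std::max(x, 0), w - 1);  // clamp to border
        y = std::min(std::max(y, 0), h - 1);
        return src[y * w + x];
    };
    return (1 - ay) * ((1 - ax) * at(x0, y0)     + ax * at(x0 + 1, y0))
         +      ay  * ((1 - ax) * at(x0, y0 + 1) + ax * at(x0 + 1, y0 + 1));
}
```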

Thanks.