How to use remap() function in opencv-2.* with cuda 8.0 on Jetson TX1

Hi all,

I am writing a short piece of code as follows:

int f_size = 64 * 64 * sizeof(float);
float *d_input, *d_output, *d_xmap, *d_ymap;
int value;

cudaMalloc((void **)&d_input, f_size);
cudaMalloc((void **)&d_output, f_size);
cudaMalloc((void **)&d_xmap, f_size);
cudaMalloc((void **)&d_ymap, f_size);

After that, d_input, d_xmap, and d_ymap are filled with the correct data. I then wrap them in GpuMat objects:

cv::gpu::GpuMat gpu_input(64, 64, CV_32FC1, d_input);
cv::gpu::GpuMat gpu_output(64, 64, CV_32FC1, d_output);
cv::gpu::GpuMat gpu_xmap(64, 64, CV_32FC1, d_xmap);
cv::gpu::GpuMat gpu_ymap(64, 64, CV_32FC1, d_ymap);

Finally, I apply the remap() function:

cv::gpu::remap(gpu_input, gpu_output, gpu_xmap, gpu_ymap, cv::INTER_CUBIC, cv::BORDER_CONSTANT, value);

I have used this function with OpenCV 3.4.0 (cv::cuda::remap()) and it worked correctly. However, with OpenCV versions older than 3.4.0, I get an error as shown in the figure below:

Please help me fix this problem with the input parameters of the cv::gpu::remap() function.

Thank you all so much!

Hi,

Could you check the return status of cudaMalloc() first?

cudaError_t status = cudaMalloc(...);
...

Thanks.

Thanks for your help,

I have carefully checked the input data by copying it from device to host and plotting it. It is completely correct.

Hi,

Sorry for the late update.
Could you try passing the address of gpu_input rather than gpu_input itself to see if it helps?

Thanks.

Could you try this code and tell me if it works?

// testRemapGpu.cpp
#include <iostream>
#include <cuda_runtime.h>
#include <opencv2/opencv.hpp>
#include <opencv2/gpu/gpu.hpp>

int main()
{
    // std::cout << cv::getBuildInformation() << std::endl; 

    int f_size = 64 * 64 * sizeof(float);
    float *d_input, *d_output, *d_xmap, *d_ymap;

    if (cudaMalloc((void **)&d_input, f_size) != cudaSuccess) {
        std::cerr << "CudaMalloc failed" << std::endl;
        return -1;
    }
    if (cudaMalloc((void **)&d_output, f_size) != cudaSuccess) {
        std::cerr << "CudaMalloc failed" << std::endl;
        return -1;
    }
    if (cudaMalloc((void **)&d_xmap, f_size) != cudaSuccess) {
        std::cerr << "CudaMalloc failed" << std::endl;
        return -1;
    }
    if (cudaMalloc((void **)&d_ymap, f_size) != cudaSuccess) {
        std::cerr << "CudaMalloc failed" << std::endl;
        return -1;
    }

    cv::gpu::GpuMat gpu_input(64, 64, CV_32FC1, d_input);
    cv::gpu::GpuMat gpu_output(64, 64, CV_32FC1, d_output);
    cv::gpu::GpuMat gpu_xmap(64, 64, CV_32FC1, d_xmap);
    cv::gpu::GpuMat gpu_ymap(64, 64, CV_32FC1, d_ymap);
    cv::gpu::remap(gpu_input, gpu_output, gpu_xmap, gpu_ymap, cv::INTER_CUBIC, cv::BORDER_CONSTANT, cv::Scalar(0.f));

    std::cout << "Correctly terminated" << std::endl;
    return 0;
}

Build with (you would adjust for your case):

OPENCV_VERSION=opencv4tegra-2.4.13
CUDA_VERSION=8.0
export LD_LIBRARY_PATH=/usr/local/cuda-$CUDA_VERSION/lib64:/usr/local/$OPENCV_VERSION/lib
g++ -std=c++11 -Wall -I/usr/local/$OPENCV_VERSION/include -I/usr/local/cuda-$CUDA_VERSION/targets/aarch64-linux/include  testRemapGpu.cpp -L/usr/local/$OPENCV_VERSION/lib -lopencv_core -lopencv_gpu -L/usr/local/cuda-$CUDA_VERSION/lib64 -lcudart -o testRemapGpu
./testRemapGpu

It works for me on a TX2, using CUDA8 and opencv4tegra-2.4.13.
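Note that the test program above leaves d_xmap and d_ymap uninitialized, so it only checks that the call succeeds, not that the output is meaningful. To get a predictable result, the maps could be filled with an identity mapping first. Here is a minimal host-side sketch; makeIdentityMaps is a hypothetical helper name (not part of OpenCV), and it assumes tightly packed row-major float buffers like the 64x64 ones in the test:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not an OpenCV function): fills xmap/ymap so that
// remap() copies every pixel from its own position, i.e.
// dst(x, y) = src(xmap(x, y), ymap(x, y)) = src(x, y).
// Buffers are tightly packed, row-major, one float per pixel.
void makeIdentityMaps(int rows, int cols,
                      std::vector<float>& xmap, std::vector<float>& ymap)
{
    xmap.resize(static_cast<std::size_t>(rows) * cols);
    ymap.resize(static_cast<std::size_t>(rows) * cols);
    for (int y = 0; y < rows; ++y) {
        for (int x = 0; x < cols; ++x) {
            const std::size_t i = static_cast<std::size_t>(y) * cols + x;
            xmap[i] = static_cast<float>(x);  // source column to sample
            ymap[i] = static_cast<float>(y);  // source row to sample
        }
    }
}
```

The two vectors would then be copied to the device, for example with cudaMemcpy(d_xmap, xmap.data(), f_size, cudaMemcpyHostToDevice), before calling remap. With an identity mapping the output should match the input, which makes any corruption easy to spot.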

It seems you are running a very old release, R24.
Error -217 may also indicate a shortage of resources, or it can occur if some of your code calls cudaDeviceReset() between the allocation and the remap.

Have you also checked whether the remapped image looks right? I get some weird artifacts.

I have indeed seen this once with opencv4tegra-2.4.13 and cuda-8.0 on the first frame of a run, but failed to reproduce it later. I have never seen it with opencv4 and CUDA 9 or 10, but I was not performing very complex remappings.
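One way to pin down such artifacts is to compare the GPU output, copied back to the host, against a simple CPU reference. The sketch below is only an illustration: remapNearestRef and maxAbsDiff are hypothetical names, and it uses nearest-neighbour sampling with a constant border rather than the INTER_CUBIC mode from the thread, so the two results should be expected to agree closely only for maps with integer-valued coordinates.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference (not an OpenCV function): nearest-neighbour
// remap with a constant border. All buffers are tightly packed, row-major,
// and the maps are assumed to have the same rows x cols size as src.
std::vector<float> remapNearestRef(const std::vector<float>& src, int rows, int cols,
                                   const std::vector<float>& xmap,
                                   const std::vector<float>& ymap,
                                   float borderValue)
{
    std::vector<float> dst(src.size(), borderValue);
    for (std::size_t i = 0; i < dst.size(); ++i) {
        const long sx = std::lround(xmap[i]);  // source column
        const long sy = std::lround(ymap[i]);  // source row
        // Pixels mapped outside the source keep the border value.
        if (sx >= 0 && sx < cols && sy >= 0 && sy < rows)
            dst[i] = src[static_cast<std::size_t>(sy) * cols
                         + static_cast<std::size_t>(sx)];
    }
    return dst;
}

// Largest absolute per-pixel difference over the common length of a and b.
float maxAbsDiff(const std::vector<float>& a, const std::vector<float>& b)
{
    float worst = 0.0f;
    for (std::size_t i = 0; i < a.size() && i < b.size(); ++i)
        worst = std::fmax(worst, std::fabs(a[i] - b[i]));
    return worst;
}
```

A large maxAbsDiff on a single frame, especially the first one, would point to the kind of intermittent corruption described above rather than to a problem in the maps themselves.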