Is there a way to make camera undistort (cv2 fisheye) happen faster?

I went through the steps to calibrate my camera’s wide angle lens using cv2’s fisheye module.

I basically followed the instructions on https://medium.com/@kennethjiang/calibrate-fisheye-lens-using-opencv-333b05afa0b0 and then added the undistorting function into JetBot’s camera.py

Err… surprise! The frame rate dropped to crap. The CPU load average went to around 5 and top showed python3 taking about 125% CPU. (power mode 10W)

Sooo… I’m hoping that since I’m on a Tegra… there’s a way to tell the GPU to help out OpenCV?

The relevant code snippet looks like this

import numpy as np
import cv2

def undistort_image(img, DIM, K, D, balance=0.0, dim2=None, dim3=None):
    dim1 = img.shape[:2][::-1]  # dim1 is the dimension of the input image to un-distort
    if not dim2:
        dim2 = dim1
    if not dim3:
        dim3 = dim1
    scaled_K = K * dim1[0] / DIM[0]  # The values in K scale with the image dimensions.
    scaled_K[2][2] = 1.0  # Except that K[2][2] is always 1.0
    # This is how scaled_K, dim2 and balance are used to determine the final K used to un-distort the image. The OpenCV documentation doesn't make this clear!
    new_K = cv2.fisheye.estimateNewCameraMatrixForUndistortRectify(scaled_K, D, dim2, np.eye(3), balance=balance)
    map1, map2 = cv2.fisheye.initUndistortRectifyMap(scaled_K, D, np.eye(3), new_K, dim3, cv2.CV_16SC2)
    undistorted_img = cv2.remap(img, map1, map2, interpolation=cv2.INTER_LINEAR, borderMode=cv2.BORDER_CONSTANT)
    return undistorted_img # this returns a np array

Calling cv2.fisheye.undistortImage(…) directly doesn’t work; I just get an all-black image. I’ve tried various small optimizations, but nothing gives a significant increase in frame rate.

Hi frank26080115,

The tutorial you’re referring to is from 2017; we never tested it and can’t provide suggestions.
Other developers may share their experience if they have tried this before.

Thanks

@frank26080115

Most of the calls in your “undistort_image” function do not actually use the image data, so they can be moved out of the function. Depending on how expensive the estimateNewCameraMatrixForUndistortRectify and initUndistortRectifyMap calls are, this may save a bit of performance.

The remap call will likely still account for most of the cost. It might be possible to improve performance by using cv2.INTER_NEAREST, though this will produce a lower-quality image.

Those two calls are SUPER EXPENSIVE and should only be done once. Once you have an undistortion map, save it to disk so you can just load the map data the next time the program starts.
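For example, a minimal sketch along those lines (assuming the DIM, K and D values from the calibration step, that the live frames have the same dimensions as DIM so no K rescaling is needed, and a placeholder file name):

import numpy as np
import cv2

MAP_FILE = "fisheye_maps.npz"  # placeholder file name

def build_undistort_maps(DIM, K, D, balance=0.0):
    # The expensive part: done once, then cached to disk.
    new_K = cv2.fisheye.estimateNewCameraMatrixForUndistortRectify(K, D, DIM, np.eye(3), balance=balance)
    map1, map2 = cv2.fisheye.initUndistortRectifyMap(K, D, np.eye(3), new_K, DIM, cv2.CV_16SC2)
    np.savez(MAP_FILE, map1=map1, map2=map2)
    return map1, map2

def load_undistort_maps():
    data = np.load(MAP_FILE)
    return data["map1"], data["map2"]

def undistort_frame(img, map1, map2):
    # The only per-frame work; cv2.INTER_NEAREST is cheaper still, at some quality cost.
    return cv2.remap(img, map1, map2, interpolation=cv2.INTER_LINEAR, borderMode=cv2.BORDER_CONSTANT)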

The map just stores, for each output pixel, the source pixel coordinates to sample from. The OpenCV implementation isn’t super fast, so you can actually gain some CPU cycles by implementing it in raw C with a cache-aware optimized implementation.

But given that the map is in such a simple format, you should be able to write a simple shader in OpenGL or CUDA to sample the right pixels. If the image is already available on the GPU, this is likely to run even faster. However, if you have to upload the image to the GPU just to do the undistort, that upload will cost more than simply doing the remap on the CPU with a reasonable C implementation. The algorithm is really quite simple and runs fast, even using linear interpolation on the undistort map.
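If OpenCV on the Jetson happens to be built with CUDA support (the cv2.cuda module), the remap can also be pushed to the GPU without writing a custom kernel. A rough sketch, assuming the K, D and DIM values from the calibration step, with float32 maps so they can be uploaded directly; the upload/download copies are exactly the cost mentioned above:

import numpy as np
import cv2

# One-time setup: generate float32 maps (cv2.cuda.remap expects CV_32FC1 maps) and upload them.
new_K = cv2.fisheye.estimateNewCameraMatrixForUndistortRectify(K, D, DIM, np.eye(3), balance=0.0)
map1, map2 = cv2.fisheye.initUndistortRectifyMap(K, D, np.eye(3), new_K, DIM, cv2.CV_32FC1)
gpu_map1 = cv2.cuda_GpuMat(); gpu_map1.upload(map1)
gpu_map2 = cv2.cuda_GpuMat(); gpu_map2.upload(map2)

def undistort_frame_gpu(img):
    gpu_img = cv2.cuda_GpuMat()
    gpu_img.upload(img)  # this host-to-device copy may dominate if the frame isn't already on the GPU
    gpu_out = cv2.cuda.remap(gpu_img, gpu_map1, gpu_map2, interpolation=cv2.INTER_LINEAR)
    return gpu_out.download()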

Thank you! Caching map1 and map2 did the trick and the framerate is back to being good.

On Tegra you can use shared CPU/GPU memory and just pass a pointer around. The Jetson inference library does this to avoid the expensive copy the other dog is referring to.

You can examine how it’s done there, and it works even in Python. You can load an image from disk, pass around a container with a pointer to that memory, and it’s faaaaast.

Found a forum thread:

https://devtalk.nvidia.com/default/topic/932957/zero-copy-and-managedmemory-%20on-jetson/?offset=2
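To illustrate the idea (this is not the jetson-inference API; it’s a hedged sketch using Numba’s mapped allocation, which uses CUDA pinned+mapped memory and so, on Tegra, should land in the same physical memory the GPU sees; the shape and kernel are just placeholders):

import numpy as np
from numba import cuda

@cuda.jit
def invert_kernel(img):
    # Trivial per-pixel kernel, only here to show the GPU touching the shared buffer.
    i, j = cuda.grid(2)
    if i < img.shape[0] and j < img.shape[1]:
        img[i, j] = 255 - img[i, j]

# Zero-copy / mapped allocation: the CPU writes into it, the kernel reads it, no cudaMemcpy.
frame = cuda.mapped_array((720, 1280), dtype=np.uint8)
frame[:] = 128  # e.g. the camera pipeline filling the frame on the CPU side

threads = (16, 16)
blocks = ((frame.shape[0] + 15) // 16, (frame.shape[1] + 15) // 16)
invert_kernel[blocks, threads](frame)
cuda.synchronize()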

do we have to sniff butts now?

Haha. Only if you want to.

Something worth mentioning from that thread is that in this zero-copy mode the CPU and GPU caches are disabled, so if you do end up experimenting with it, you may want to time things. In the case of dusty_nv’s Jetson inference it’s much faster with zero-copy, iirc he mentions it somewhere.

Edit: found a link with some example code.
https://arrayfire.com/zero-copy-on-tegra-k1/

It’s the AGP “USWC” write-combined aperture, come back to haunt us!

I never wrote anything for GPUs during that era, so I guess I missed out on how all that works. The closest I came was getting my laptop’s AGP GPU to work in Linux, which at the time required kernel patching since it wasn’t very well supported. It was an AMD IGP of some sort and not very good. My first AGP desktop didn’t run accelerated X at all (4 MB IGP). I think I ended up using a PCI video card for a while since it had no AGP slots. Hard times. I ended up putting a Voodoo2 in there as well, but I don’t recall if it ever worked in Linux or not. That card was fast af.