OpenCV 4.1.1 in Jetpack 4.3 is built without CUDA support

I was running some of the OpenCV4 sample apps on my Jetson Nano and they seemed to run pretty slowly… well guess what?

Turns out the OpenCV4 shipped with Jetpack 4.3 does not have CUDA support!!! :-O

I called cv::cuda::getCudaEnabledDeviceCount() and it returned 0, which is what you get when OpenCV is compiled without CUDA support (or when no CUDA device is found).
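For anyone who wants to run the same check from Python, here is a rough sketch. It assumes the text from cv2.getBuildInformation() contains a line beginning with "NVIDIA CUDA:" followed by YES when CUDA was configured, and treats a missing line as "no CUDA". The helper names and the two sample strings are made up for illustration and were not captured from a real device.

```python
import re


def cuda_enabled_in_build_info(build_info: str) -> bool:
    """Return True if the build summary has an 'NVIDIA CUDA:' line that says YES."""
    match = re.search(r"^\s*NVIDIA CUDA:\s*(\S+)", build_info, flags=re.MULTILINE)
    # A missing line is treated the same as an explicit NO.
    return match is not None and match.group(1).upper() == "YES"


def report_cuda_support() -> None:
    """Print the device count and the build flag for the installed OpenCV."""
    import cv2  # imported lazily so the parser above works without OpenCV

    print("CUDA devices:", cv2.cuda.getCudaEnabledDeviceCount())
    print("Built with CUDA:", cuda_enabled_in_build_info(cv2.getBuildInformation()))


# Illustrative build-summary fragments (hypothetical, not real device output).
SAMPLE_WITH_CUDA = "  NVIDIA CUDA:                   YES (ver 10.0, CUFFT CUBLAS)\n"
SAMPLE_WITHOUT_CUDA = "  Other third-party libraries:\n    Lapack:                      NO\n"
```

Calling report_cuda_support() on the Nano should print a device count of 0 for the stock build described here.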

Here is my output of “pkg-config --cflags --libs opencv4”:

-I/usr/include/opencv4/opencv -I/usr/include/opencv4 -lopencv_dnn -lopencv_gapi -lopencv_highgui -lopencv_ml -lopencv_objdetect -lopencv_photo -lopencv_stitching -lopencv_video -lopencv_calib3d -lopencv_features2d -lopencv_flann -lopencv_videoio -lopencv_imgcodecs -lopencv_imgproc -lopencv_core
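The giveaway in that output is that none of the opencv_cuda* libraries are listed. A small sketch of the same check in Python, assuming the CUDA modules show up as -lopencv_cuda... flags when they are installed; the function name is made up, and the "custom" flag string is a hypothetical example of a build that includes two of the opencv_contrib CUDA modules.

```python
from typing import List


def cuda_modules(pkg_config_output: str) -> List[str]:
    """Return the OpenCV CUDA library names found in a pkg-config flag string."""
    prefix = "-lopencv_cuda"
    # Strip the leading "-l" so only the library name remains.
    return [flag[2:] for flag in pkg_config_output.split() if flag.startswith(prefix)]


# Flags from the stock JetPack 4.3 build as posted above (abridged).
stock_flags = (
    "-I/usr/include/opencv4/opencv -I/usr/include/opencv4 "
    "-lopencv_dnn -lopencv_imgproc -lopencv_core"
)

# Hypothetical flags from a self-built OpenCV with CUDA modules enabled.
custom_flags = stock_flags + " -lopencv_cudaarithm -lopencv_cudaimgproc"

print(cuda_modules(stock_flags))
print(cuda_modules(custom_flags))
```

An empty list for the stock flags matches the 0 returned by getCudaEnabledDeviceCount().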

So how do I get CUDA support in my OpenCV libraries? Must I build it myself, or will Nvidia fix this glaring mistake?


Hi jetsonnvidia, please see this topic:

Errrr… WTF!!!

Like the gentleman in the thread said, OpenCV being CUDA accelerated is probably one of, if not THE main applications of a Jetson board.

Okay, so if I run OpenCV algorithms under VPI, will they be CUDA accelerated?

I think my post that I linked to sums it up - the opencv_contrib CUDA code has bugs in it, it’s unmaintained, but you can build it if you wish.

Yes, you can use the GPU backend for VPI functions and they will be CUDA accelerated. There are also VPI backends for CPU and PVA (PVA is the vision accelerator hardware engine, available on AGX Xavier only). Note that the CUDA performance of VPI is being improved in the next release.

I think I am beginning to understand why VPI (and previously OpenVX) only supports certain algorithms by default. As you say, many of the OpenCV CUDA algorithms have bugs, so what Nvidia has done is pick the most popular OpenCV algorithms and reimplement them as dedicated VPI versions (no longer OpenCV).

Thanks for the prompt reply.

Yes. Many of the OpenCV CUDA tests fail if you run them. OpenCV isn’t a bad library, but it isn’t optimized for Tegra either, whereas Nvidia’s solutions will give you the best performance on both x86/Nvidia and Tegra. It’s not really worth using OpenCV unless you have a very specific reason.

Besides, even if you build OpenCV with CUDA support, you’ll still have to rewrite any CPU based OpenCV code to use the OpenCV CUDA equivalents.

Nvidia’s business plan for Jetson does not add up.

They are selling these dev kits for $100 each (clearly $100 is aimed at the hobbyist) YET the hobbyist must learn CUDA in order to actually do anything on them. Not only that but they must basically be a CUDA algorithm expert in order to make any use of them.

The managers at Nvidia need to look at their strategy and re-evaluate it because Jetson will never take off in the hobbyist arena unless they have a decent algorithm framework.

VPI is unstable.

OpenCV is not officially supported.

CUDA is all that is left.

Sort your lives out Nvidia. Your plan does not make sense.

Sure it does. There really isn’t any competition in this area so they can do what they want, which includes recommending performant yet proprietary solutions. Nothing really compares to Tegra. Even Google does their own thing with Coral, rather than bother with OpenCV.

Not really. Learning CUDA isn’t required. Nvidia provides C, C++ and even Python wrappers and example code for their libraries so you can do what you want. No, they don’t bother with OpenCV, but who would want them to?

OpenCV was designed by Intel in an era where GPUs weren’t even used. 90% of it uses the CPU only. That’s fine, for some things, and Intel certainly likes it since they can’t manage to make a GPU go faster than an actual potato, but for everybody else, it’s better off dead, not that Nvidia doesn’t also make software products I would rather be eliminated (looking at you, SDK Manager).

There is DeepStream, which uses the GStreamer framework, if you want to analyze video. It works with Python, C, and C++. It doesn’t require any knowledge of CUDA, and most existing GStreamer code can be ported easily to use Nvidia’s accelerated components instead of the stock ones. The same is not true for CPU based OpenCV code. Hell, you can even write static pipelines in the shell with gst-launch.
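To make the porting point concrete, here is a minimal sketch in Python that builds a gst-launch style pipeline description and swaps a stock element for an accelerated one. The helper names and the mapping are assumptions for illustration: videotestsrc, videoconvert and autovideosink are stock GStreamer elements, and nvvidconv is assumed here to be the Jetson replacement for videoconvert. A real accelerated pipeline may also need caps filters that this sketch leaves out.

```python
from typing import Dict, List


def build_pipeline(elements: List[str]) -> str:
    """Join element descriptions into a gst-launch style pipeline string."""
    return " ! ".join(elements)


def swap_elements(elements: List[str], replacements: Dict[str, str]) -> List[str]:
    """Replace element names found in `replacements`, keeping their properties."""
    swapped = []
    for element in elements:
        # The element name is the first word; anything after it is properties.
        name, _, props = element.partition(" ")
        new_name = replacements.get(name, name)
        swapped.append((new_name + " " + props).strip())
    return swapped


# A stock pipeline that runs on any GStreamer install.
stock = ["videotestsrc num-buffers=100", "videoconvert", "autovideosink"]

# Assumed mapping from a stock element to an Nvidia accelerated one.
ACCELERATED = {"videoconvert": "nvvidconv"}

print("gst-launch-1.0 " + build_pipeline(stock))
print("gst-launch-1.0 " + build_pipeline(swap_elements(stock, ACCELERATED)))
```

The printed lines are meant to be pasted into a shell. The structure of the pipeline stays the same and only the element names change, which is why porting existing GStreamer code tends to be less work than rewriting CPU based OpenCV code.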

For robotics, there is Isaac. For photography, there is Argus. There is MMAPI for high performance multimedia applications. All of this is apt installable, including the examples which are well documented and commented… and if you run into trouble you can ask a question here and get a prompt response from the actual developers.

That there is the real advantage. Precisely none of this requires any knowledge of CUDA itself, though you can certainly learn if you’d like. I don’t know CUDA, and haven’t had a need to learn (but I am planning on it, just because it’s cool). And if you build OpenCV with CUDA support you can use the CUDA enabled functions (that aren’t broken) there as well without learning CUDA itself.

This hobbyist is personally perplexed by the obsession with OpenCV. It’s slow, and only useful on a powerful CPU, which is not what you get on development boards designed for low power mobile applications. Really. I don’t get it. If you have CPU cores to burn, go for it, but otherwise, graphics were meant to be processed on a GPU, or a TPU, not a CPU. I want things to go fast, personally.

So are OpenCV’s GPU parts. Half the CUDA tests fail (if you run them) and the rest is mostly experimental and only works with OpenCL. I would kinda like support for OpenCL on Tegra, but then again, I’ve never used an OpenCL app that didn’t perform poorly compared to the CUDA version (e.g. Blender).

Sure it does. They’re providing performant, portable solutions and nobody else is, so they can do what they want, and I don’t blame them for wanting customers to use solutions that go faster on their thing because it’s what literally everybody does.

Thanks mdegans. There’s also the often-overlooked NVIDIA NPP library, which ships as part of CUDA Toolkit, but you don’t need to know CUDA to use it. There are many other libraries with that paradigm included in CUDA Toolkit as well. As someone who used to do a lot of CUDA kernel coding back in the day, with all that’s available I don’t often need to anymore, and when I do, the kernels are simple pre/post-processing routines for DNNs and the like. Actually I quite enjoy when I do need to whip one up!

VPI performance and functionality will be improved in upcoming releases, and VisionWorks is stable and optimized. And of course, you can always build OpenCV with CUDA enabled if desired.

Locking this thread since it’s run its course and it’s now the weekend :)