I can't use the CUDA DNN backend on Jetson Nano + Python

Hi, this is h-kz, a student in Japan, and this is my first message on this forum.
Thank you for reading this message, and please help me if you can.

Today I’m trying to run this Python script, which uses OpenCV, YOLO, and DNN.
I’ve completed this setup, but I get the warning below and CUDA is not used.

[ WARN:0] global /home/h-kz/opencv/modules/dnn/src/dnn.cpp (1363) setUpNet DNN module was not built with CUDA backend; switching to CPU

How can I solve it?

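My name is H-Kz and I am a university student. I would appreciate your help.
I am currently trying to run the Python script from the first URL above, which uses OpenCV, YOLO, and DNN. I went through the environment setup from the second URL above, but CUDA is not being used.
If you have any idea what the cause might be, please let me know.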
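One way to confirm what the warning says is to look at the build summary of the cv2 module that Python actually imports. The sketch below parses the text returned by cv2.getBuildInformation() and picks out the "NVIDIA CUDA" and "cuDNN" entries. The helper name cuda_dnn_status is made up for illustration, and the "Key: value" layout of the summary is assumed to match the opencv_version --verbose output posted later in this thread.

```python
def cuda_dnn_status(build_info):
    """Return the 'NVIDIA CUDA' and 'cuDNN' entries of an OpenCV build summary.

    An entry that stays None was not found in the summary, which usually
    means the library was built without that component.
    """
    status = {"NVIDIA CUDA": None, "cuDNN": None}
    for line in build_info.splitlines():
        # Summary rows look like "  cuDNN:    YES (ver 8.0)".
        key, sep, value = line.strip().partition(":")
        if sep and key.strip() in status:
            status[key.strip()] = value.strip()
    return status


if __name__ == "__main__":
    try:
        import cv2  # may be missing from this interpreter
    except ImportError:
        print("cv2 is not importable from this interpreter")
    else:
        print(cv2.__version__, cuda_dnn_status(cv2.getBuildInformation()))
```

If both entries start with YES, the library itself has CUDA and cuDNN support, and the problem is more likely in which cv2 gets imported or in how the script selects the backend.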

The default OpenCV that is included is not built with cuDNN. Try this script, like so:

./build_opencv.sh master

Note: as of writing, this will build 4.4.0-pre from the master OpenCV branch. If you’re reading this in the future, you can probably just run ./build_opencv.sh 4.4.0 (or whatever version you want). Note that 4.4.0 is the minimum version that works with cuDNN 8.0.
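The version rule in the note above can be written as a small check. This is only a sketch: the function names are hypothetical, the 4.4.0 threshold is taken from the note, and a pre-release such as 4.4.0-pre is treated as new enough because the master build is expected to work. Branch names such as master are not handled.

```python
def version_tuple(version):
    """Turn '4.4.0' or '4.4.0-pre' into a comparable tuple such as (4, 4, 0)."""
    core = version.split("-", 1)[0]  # drop a suffix such as '-pre'
    return tuple(int(part) for part in core.split("."))


def new_enough_for_cudnn8(opencv_version, minimum="4.4.0"):
    """True if opencv_version is at least the minimum named in the note above."""
    return version_tuple(opencv_version) >= version_tuple(minimum)
```

Comparing tuples of integers avoids the usual string-comparison trap where "4.10.0" would sort before "4.4.0".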


Thank you for the advice.
However, I tried your script twice, with ./build_opencv.sh master and ./build_opencv.sh 4.4.0, but neither changed my situation.
My Jetson still says: setUpNet DNN module was not built with CUDA backend

Hi @h-kz
Does it work if you exclude dnn, with all other parameters kept the same?
The issue seems to be a matter of specifying cmake arguments, such as the CUDA path and the DNN path, when building OpenCV, when building YOLO, or both.

What is the cuDNN version on your device?

For 8.0, you would add -D CUDNN_VERSION="8.0" to the OpenCV cmake arguments [credit to @Honey_Patouceul for figuring out the argument].

What version of JetPack do you use? What cmake commands do you use for building OpenCV? Are you using the preinstalled OpenCV, or building it from source? In the latter case, which version are you building? Are you running the script? Do you purge the existing preinstalled OpenCV before building a new one from the script? Did you try installing OpenCV manually, without the script?

You may also like to try developing within a Docker container, as it allows you to use multiple different environments and various versions of tools.
So you may like to try https://developer.nvidia.com/deepstream-sdk [for x86_64] or https://ngc.nvidia.com/catalog/containers/nvidia:deepstream-l4t for Jetson.

Using YOLO with DNN requires building YOLO in a similar manner; at some point, specifying library paths will probably be required.

I would suggest first getting OpenCV built with dnn and confirming that it works, then adding YOLO and seeing if that works.
What version of YOLO do you use? What arguments do you pass to build YOLO in your environment?

Regards,
-AV


I think cuDNN8 is installed

$ dpkg -l | grep "cudnn"
ii libcudnn8 8.0.0.180-1+cuda10.2 arm64 cuDNN runtime libraries
ii libcudnn8-dev 8.0.0.180-1+cuda10.2 arm64 cuDNN development libraries and headers
ii libcudnn8-doc 8.0.0.180-1+cuda10.2 arm64 cuDNN documents and samples
ii nvidia-container-csv-cudnn 8.0.0.180-1+cuda10.2 arm64 Jetpack CUDNN CSV file
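The listing above can also be read programmatically, for example to decide which value to pass as CUDNN_VERSION. The sketch below is an assumption-laden helper (the name cudnn_version_from_dpkg is invented here) that expects dpkg -l rows in the form shown above and looks only at the runtime package, whose name is libcudnn followed by a number.

```python
import re


def cudnn_version_from_dpkg(dpkg_output):
    """Return (major, minor) of the libcudnn runtime package, or None if absent."""
    for line in dpkg_output.splitlines():
        fields = line.split()
        # Rows look like: status, package name, version, architecture, description...
        if len(fields) >= 3 and re.fullmatch(r"libcudnn\d+", fields[1]):
            match = re.match(r"(\d+)\.(\d+)", fields[2])
            if match:
                return int(match.group(1)), int(match.group(2))
    return None
```

For the listing in this post, the helper returns (8, 0), which corresponds to the "8.0" value suggested earlier in the thread.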

I am trying Docker now, and I’ll reply to you later. Thank you for the advice.

Thanks @h-kz. Master should work. Can you please file an issue here with details about your setup and the output of opencv_version --verbose?

@mdegans
I retried your shell script and found this error:

In file included from /tmp/build_opencv/opencv/modules/dnn/src/layers/…/cuda4dnn/csl/cudnn.hpp:8:0,
from /tmp/build_opencv/opencv/modules/dnn/src/layers/…/op_cuda.hpp:11,
from /tmp/build_opencv/opencv/modules/dnn/src/layers/blank_layer.cpp:43:
/tmp/build_opencv/opencv/modules/dnn/src/layers/…/cuda4dnn/primitives/…/csl/cudnn/convolution.hpp: In constructor ‘cv::dnn::cuda4dnn::csl::cudnn::ConvolutionAlgorithm::ConvolutionAlgorithm(const cv::dnn::cuda4dnn::csl::cudnn::Handle&, const cv::dnn::cuda4dnn::csl::cudnn::ConvolutionDescriptor&, const cv::dnn::cuda4dnn::csl::cudnn::FilterDescriptor&, const cv::dnn::cuda4dnn::csl::cudnn::TensorDescriptor&, const cv::dnn::cuda4dnn::csl::cudnn::TensorDescriptor&)’:
/tmp/build_opencv/opencv/modules/dnn/src/layers/…/cuda4dnn/primitives/…/csl/cudnn/convolution.hpp:266:21: error: ‘CUDNN_CONVOLUTION_FWD_PREFER_FASTEST’ was not declared in this scope
CUDNN_CONVOLUTION_FWD_PREFER_FASTEST,
^
/tmp/build_opencv/opencv/modules/dnn/src/layers/…/cuda4dnn/csl/cudnn/cudnn.hpp:23:53: note: in definition of macro ‘CUDA4DNN_CHECK_CUDNN’
::cv::dnn::cuda4dnn::csl::cudnn::detail::check((call), CV_Func, FILE, LINE)
^~~~
/tmp/build_opencv/opencv/modules/dnn/src/layers/…/cuda4dnn/primitives/…/csl/cudnn/convolution.hpp:266:21: note: suggested alternative: ‘CUDNN_CONVOLUTION_BWD_FILTER_ALGO_3’
CUDNN_CONVOLUTION_FWD_PREFER_FASTEST,
^
/tmp/build_opencv/opencv/modules/dnn/src/layers/…/cuda4dnn/csl/cudnn/cudnn.hpp:23:53: note: in definition of macro ‘CUDA4DNN_CHECK_CUDNN’
::cv::dnn::cuda4dnn::csl::cudnn::detail::check((call), CV_Func, FILE, LINE)
^~~~
/tmp/build_opencv/opencv/modules/dnn/src/layers/…/cuda4dnn/primitives/…/csl/cudnn/transpose_convolution.hpp: In constructor ‘cv::dnn::cuda4dnn::csl::cudnn::TransposeConvolutionAlgorithm::TransposeConvolutionAlgorithm(const cv::dnn::cuda4dnn::csl::cudnn::Handle&, const cv::dnn::cuda4dnn::csl::cudnn::ConvolutionDescriptor&, const cv::dnn::cuda4dnn::csl::cudnn::FilterDescriptor&, const cv::dnn::cuda4dnn::csl::cudnn::TensorDescriptor&, const cv::dnn::cuda4dnn::csl::cudnn::TensorDescriptor&)’:
/tmp/build_opencv/opencv/modules/dnn/src/layers/…/cuda4dnn/primitives/…/csl/cudnn/transpose_convolution.hpp:42:21: error: ‘CUDNN_CONVOLUTION_BWD_DATA_PREFER_FASTEST’ was not declared in this scope
CUDNN_CONVOLUTION_BWD_DATA_PREFER_FASTEST,
^
/tmp/build_opencv/opencv/modules/dnn/src/layers/…/cuda4dnn/csl/cudnn/cudnn.hpp:23:53: note: in definition of macro ‘CUDA4DNN_CHECK_CUDNN’
::cv::dnn::cuda4dnn::csl::cudnn::detail::check((call), CV_Func, FILE, LINE)
^~~~
/tmp/build_opencv/opencv/modules/dnn/src/layers/…/cuda4dnn/primitives/…/csl/cudnn/transpose_convolution.hpp:42:21: note: suggested alternative: ‘CUDNN_CONVOLUTION_BWD_DATA_ALGO_FFT’
CUDNN_CONVOLUTION_BWD_DATA_PREFER_FASTEST,
^
/tmp/build_opencv/opencv/modules/dnn/src/layers/…/cuda4dnn/csl/cudnn/cudnn.hpp:23:53: note: in definition of macro ‘CUDA4DNN_CHECK_CUDNN’
::cv::dnn::cuda4dnn::csl::cudnn::detail::check((call), CV_Func, FILE, LINE)
^~~~
modules/dnn/CMakeFiles/opencv_dnn.dir/build.make:7134: recipe for target ‘modules/dnn/CMakeFiles/opencv_dnn.dir/src/layers/blank_layer.cpp.o’ failed
make[2]: *** [modules/dnn/CMakeFiles/opencv_dnn.dir/src/layers/blank_layer.cpp.o] Error 1
CMakeFiles/Makefile2:3737: recipe for target ‘modules/dnn/CMakeFiles/opencv_dnn.dir/all’ failed
make[1]: *** [modules/dnn/CMakeFiles/opencv_dnn.dir/all] Error 2
Makefile:162: recipe for target ‘all’ failed
make: *** [all] Error 2
Do you wish to remove temporary build files in /tmp/build_opencv ?
(Doing so may make running tests on the build later impossible)
Y/N n
Please answer yes or no.
Do you wish to remove temporary build files in /tmp/build_opencv ?
(Doing so may make running tests on the build later impossible)
Y/N n

This is the output of opencv_version --verbose.
I’m going to share it on your GitHub repo right now.

$ opencv_version
4.4.0-pre
$ opencv_version --verbose

General configuration for OpenCV 4.4.0-pre =====================================
Version control: 8a480ac

Extra modules:
Location (extra): /tmp/build_opencv/opencv_contrib/modules
Version control (extra): 5fae408

Platform:
Timestamp: 2020-07-18T01:16:25Z
Host: Linux 4.9.140-tegra aarch64
CMake: 3.10.2
CMake generator: Unix Makefiles
CMake build tool: /usr/bin/make
Configuration: RELEASE

CPU/HW features:
Baseline: NEON FP16
required: NEON
disabled: VFPV3

C/C++:
Built as dynamic libs?: YES
C++ standard: 11
C++ Compiler: /usr/bin/c++ (ver 7.5.0)
C++ flags (Release): -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Winit-self -Wsuggest-override -Wno-delete-non-virtual-dtor -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -fvisibility=hidden -fvisibility-inlines-hidden -O3 -DNDEBUG -DNDEBUG
C++ flags (Debug): -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Winit-self -Wsuggest-override -Wno-delete-non-virtual-dtor -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -fvisibility=hidden -fvisibility-inlines-hidden -g -O0 -DDEBUG -D_DEBUG
C Compiler: /usr/bin/cc
C flags (Release): -fsigned-char -W -Wall -Werror=return-type -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wuninitialized -Winit-self -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -fvisibility=hidden -O3 -DNDEBUG -DNDEBUG
C flags (Debug): -fsigned-char -W -Wall -Werror=return-type -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wuninitialized -Winit-self -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -fvisibility=hidden -g -O0 -DDEBUG -D_DEBUG
Linker flags (Release): -Wl,--gc-sections -Wl,--as-needed
Linker flags (Debug): -Wl,--gc-sections -Wl,--as-needed
ccache: NO
Precompiled headers: NO
Extra dependencies: m pthread cudart_static -lpthread dl rt nppc nppial nppicc nppicom nppidei nppif nppig nppim nppist nppisu nppitc npps cublas cudnn cufft -L/usr/local/cuda/lib64 -L/usr/lib/aarch64-linux-gnu
3rdparty dependencies:

OpenCV modules:
To be built: alphamat aruco bgsegm bioinspired calib3d ccalib core cudaarithm cudabgsegm cudacodec cudafeatures2d cudafilters cudaimgproc cudalegacy cudaobjdetect cudaoptflow cudastereo cudawarping cudev datasets dnn dnn_objdetect dnn_superres dpm face features2d flann freetype fuzzy gapi hfs highgui img_hash imgcodecs imgproc intensity_transform line_descriptor ml objdetect optflow phase_unwrapping photo plot python2 python3 quality rapid reg rgbd saliency shape stereo stitching structured_light superres surface_matching text tracking video videoio videostab xfeatures2d ximgproc xobjdetect xphoto
Disabled: world
Disabled by dependency: -
Unavailable: cnn_3dobj cvv hdf java js julia matlab ovis sfm ts viz
Applications: apps
Documentation: NO
Non-free algorithms: YES

GUI:
GTK+: YES (ver 3.22.30)
GThread : YES (ver 2.56.4)
GtkGlExt: NO
OpenGL support: NO
VTK support: NO

Media I/O:
ZLib: /usr/lib/aarch64-linux-gnu/libz.so (ver 1.2.11)
JPEG: /usr/lib/aarch64-linux-gnu/libjpeg.so (ver 80)
WEBP: build (ver encoder: 0x020f)
PNG: /usr/lib/aarch64-linux-gnu/libpng.so (ver 1.6.34)
TIFF: /usr/lib/aarch64-linux-gnu/libtiff.so (ver 42 / 4.0.9)
JPEG 2000: build Jasper (ver 1.900.1)
OpenEXR: build (ver 2.3.0)
HDR: YES
SUNRASTER: YES
PXM: YES
PFM: YES

Video I/O:
DC1394: YES (2.2.5)
FFMPEG: YES
avcodec: YES (57.107.100)
avformat: YES (57.83.100)
avutil: YES (55.78.100)
swscale: YES (4.8.100)
avresample: YES (3.7.0)
GStreamer: YES (1.14.5)
v4l/v4l2: YES (linux/videodev2.h)

Parallel framework: pthreads

Trace: YES (with Intel ITT)

Other third-party libraries:
Lapack: YES (/usr/lib/aarch64-linux-gnu/liblapack.so /usr/lib/aarch64-linux-gnu/libcblas.so /usr/lib/aarch64-linux-gnu/libatlas.so)
Eigen: YES (ver 3.3.4)
Custom HAL: YES (carotene (ver 0.0.1))
Protobuf: build (3.5.1)

NVIDIA CUDA: YES (ver 10.2, CUFFT CUBLAS FAST_MATH)
NVIDIA GPU arch: 53 62 72
NVIDIA PTX archs:

cuDNN: YES (ver 8.0)

OpenCL: YES (no extra features)
Include path: /tmp/build_opencv/opencv/3rdparty/include/opencl/1.2
Link libraries: Dynamic load

Python 2:
Interpreter: /usr/bin/python2.7 (ver 2.7.17)
Libraries: /usr/lib/aarch64-linux-gnu/libpython2.7.so (ver 2.7.17)
numpy: /usr/lib/python2.7/dist-packages/numpy/core/include (ver 1.13.3)
install path: lib/python2.7/dist-packages/cv2/python-2.7

Python 3:
Interpreter: /usr/bin/python3 (ver 3.6.9)
Libraries: /usr/lib/aarch64-linux-gnu/libpython3.6m.so (ver 3.6.9)
numpy: /usr/lib/python3/dist-packages/numpy/core/include (ver 1.13.3)
install path: lib/python3.6/dist-packages/cv2/python-3.6

Python (for build): /usr/bin/python2.7

Java:
ant: NO
JNI: NO
Java wrappers: NO
Java tests: NO

Install to: /usr/local
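The summary above shows a build with CUDA and cuDNN enabled, installed under /usr/local. If the warning still appears after such a build, one possible cause (an assumption, not something confirmed in this thread) is that Python is still importing a different, preinstalled cv2. The sketch below reports where a module would be loaded from without importing it; the helper name module_origin is hypothetical.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    # On the Jetson, compare this path with the "install path" lines in the summary.
    print("cv2 would be loaded from:", module_origin("cv2"))
```

A path outside /usr/local would suggest that the freshly built bindings are being shadowed by another installation.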

I was able to solve this thanks to @mdegans’s help.
I ran his nano_build_opencv shell script after making these two changes.

  1. Edit these two lines in build_opencv.sh:

111 -D CUDA_ARCH_BIN=5.3
112 -D CUDA_ARCH_PTX=5.3

  2. Specify OpenCV version 4.4.0:

./build_opencv.sh 4.4.0
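The 5.3 value in the first change is the compute capability of the Jetson Nano's GPU; it is the "53" in the "NVIDIA GPU arch: 53 62 72" line of the build summary. As a rough sketch, the two cmake arguments could be generated per board like this. The dictionary keys and the function name are made up for illustration, and the script itself may handle these flags differently.

```python
# Compute capabilities of common Jetson boards.
# The keys are hypothetical labels chosen for this example.
JETSON_COMPUTE_CAPABILITY = {
    "nano": "5.3",
    "tx2": "6.2",
    "xavier": "7.2",
}


def cuda_arch_flags(board):
    """Build the two -D arguments from step 1 for a single board."""
    capability = JETSON_COMPUTE_CAPABILITY[board.lower()]
    return [
        "-D", f"CUDA_ARCH_BIN={capability}",
        "-D", f"CUDA_ARCH_PTX={capability}",
    ]
```

For example, cuda_arch_flags("nano") yields the same two settings as the edited lines 111 and 112.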

Also, thank you for the advice, @Andrey1984! I’ve never used Docker, but today I want to try it to make this easier.

Thank you, everyone.

Thanks for your investigation, @h-kz, and for filing the issue. I will try to replicate on Monday. It looks like 4.4.0 was just tagged on the OpenCV GitHub, so I will make this the script default and push a new image to Docker Hub. Thanks also for confirming the 4.4.0 tag builds.

I’m not sure about the CUDA_ARCH stuff since I believe it should work with comma separated versions (at least according to the docs). Although the repo says “nano” in the title, I try to make sure it builds on any Tegra platform.

Note re: Docker. There are pre-built images available here, which are built from the docker branch of the script repo. You might want to start from that or use the pre-built image as a base layer. Please file any issues on GitHub if you find anything broken.
