Jetpack 4.4 Broke one of my programs

@vondalej

It looks like master does build and CUDA support is available. I did not test the cuDNN module, but it did build; if you use it, please report any issues you find on GitHub. The docker image has just been pushed. You may run:

sudo docker run -it --rm --runtime nvidia mdegans/tegra-opencv:jp-r32.4.3-cv-master

(it’s also the “latest” tag)
and within the container, run

root@c0a37a2a0bd4:/usr/local/src/build_opencv# python3
Python 3.6.9 (default, Apr 18 2020, 01:56:04) 
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> cv2.cuda.printCudaDeviceInfo(0)
*** CUDA Device Query (Runtime API) version (CUDART static linking) *** 

Device count: 1

Device 0: "Xavier"
  CUDA Driver Version / Runtime Version          10.20 / 10.20
  CUDA Capability Major/Minor version number:    7.2
  Total amount of global memory:                 7764 MBytes (8140648448 bytes)
  GPU Clock Speed:                               1.11 GHz
  Max Texture Dimension Size (x,y,z)             1D=(131072), 2D=(131072,65536), 3D=(16384,16384,16384)
  Max Layered Texture Size (dim) x layers        1D=(32768) x 2048, 2D=(32768,32768) x 2048
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per block:           1024
  Maximum sizes of each dimension of a block:    1024 x 1024 x 64
  Maximum sizes of each dimension of a grid:     2147483647 x 65535 x 65535
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and execution:                 Yes with 1 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            Yes
  Support host page-locked memory mapping:       Yes
  Concurrent kernel execution:                   Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support enabled:                No
  Device is using TCC driver mode:               No
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Bus ID / PCI location ID:           0 / 0
  Compute Mode:
      Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) 

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version  = 10.20, CUDA Runtime Version = 10.20, NumDevs = 1
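If you want a quick smoke test that goes beyond printing device info, something like the following should exercise an actual CUDA kernel from inside the container (a sketch; the cv2.cuda filter calls are the OpenCV 4.x Python bindings and may differ on other versions):

python3 - <<'EOF'
import numpy as np
import cv2

# upload a random frame to the GPU, blur it there, and pull the result back
img = (np.random.rand(480, 640, 3) * 255).astype(np.uint8)
gpu = cv2.cuda_GpuMat()
gpu.upload(img)
gray = cv2.cuda.cvtColor(gpu, cv2.COLOR_BGR2GRAY)
blur = cv2.cuda.createGaussianFilter(cv2.CV_8UC1, cv2.CV_8UC1, (7, 7), 1.5)
out = blur.apply(gray).download()
print("GPU result:", out.shape, out.dtype)
EOF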

Note: to use the GPU you do not need root, but your user needs to be in (or mapped into) the “video” group.
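If you are running as a non-root user, a minimal sketch of the group setup, assuming a stock L4T/Ubuntu userland:

sudo usermod -aG video $USER     # lets your user access the GPU device nodes
sudo usermod -aG docker $USER    # optional: run docker without sudo
# log out and back in (or run `newgrp video`) for the new group membership to take effect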

If you wish to build it yourself, ./build_opencv.sh master should work fine with no modifications to the script itself.
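For example, a sketch assuming the script comes from the nano_build_opencv repo (adjust the clone URL if you are building from a fork):

git clone https://github.com/mdegans/nano_build_opencv.git
cd nano_build_opencv
./build_opencv.sh master    # the optional first argument is the OpenCV branch/tag to build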

With the latest JetPack 4.4, OpenCV 4.3 will not build successfully.
Please note that on the JetPack 4.4 DP release, OpenCV 4.3 can be built by specifying the cuDNN version as 8.0, but on JetPack 4.4 (not the DP version) the build errors out at around 50%.

This is due to
Add CuDNN 8 release support #17496
https://github.com/opencv/opencv/issues/17496

And the fix has been committed as
cuda4dnn(build): add basic support for cuDNN 8 #17685
https://github.com/opencv/opencv/issues/17685

You can apply the same modifications committed in 17685 to the three files in OpenCV 4.3.
https://github.com/opencv/opencv/pull/17685/commits/62a63021c7cbe44ff429cb5a6f14fbb1485a6c39

Then OpenCV 4.3 can be built on JetPack 4.4 without issue with cuDNN enabled.
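If you would rather apply the commit than edit the three files by hand, a sketch, assuming a clean OpenCV source checkout (GitHub serves a commit as an email-formatted patch when you append .patch to its URL; minor conflict fixes may be needed since the commit targets master):

cd opencv                               # your OpenCV source tree
git checkout -b 4.3.0-cudnn8 4.3.0      # start a branch from the 4.3.0 tag
curl -L https://github.com/opencv/opencv/commit/62a63021c7cbe44ff429cb5a6f14fbb1485a6c39.patch | git am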

@sowd0726

Thanks for the notes. It’s what @dusty_nv suggested, actually, but since master builds, I just suggested people use that rather than modify the script itself. You can specify a branch as the first (optional) parameter, like build_opencv.sh master. The repos can be edited here if you want to point it to another fork.

I did have to make some changes to the docker branch since the base image in 4.4 GA has had significant modifications, but the script in master should work as-is without modification so long as a working tag/branch is supplied, as in the usage above.

@mdegans

I ran the command below, and it worked great! I was unable to containerize OpenCV until I found this.

sudo docker run -it --rm --runtime nvidia mdegans/tegra-opencv:jp-r32.4.3-cv-master

However, I am trying to distribute a custom container built from this image through Azure IoT Edge, and it is not working. I believe this is because Azure IoT Edge does not specify the NVIDIA runtime.

Do you know a workaround for this?
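One workaround that might apply here (an assumption, not verified against IoT Edge): make nvidia the default Docker runtime on the device, so containers started without --runtime nvidia still get GPU access. A sketch (note this overwrites an existing daemon.json; merge by hand if you already have custom settings):

sudo tee /etc/docker/daemon.json <<'EOF'
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia"
}
EOF
sudo systemctl restart docker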