OpenCV runs slower in a Docker container

I've been trying to run a Python script that requires PyTorch and OpenCV with GPU modules. I based my image on the one from the jetson-inference repo, dustynv/jetson-inference:r32.4.4. After running the code I noticed that I get much lower FPS than I did outside the container. It turns out all the OpenCV parts of the code are painfully slow, roughly 10 times slower on average on the CPU. I tried running just the OpenCV part of my code directly in the dustynv/jetson-inference:r32.4.4 container, but I got approximately the same results.
Information about my Jetson Nano:
L4T 32.4.4 [ JetPack 4.4.1 ]
Ubuntu 18.04.5 LTS
I am new to both Docker and the Jetson Nano, and I haven't managed to find any clues online. Is there something I'm missing?

Hi,

Would you mind giving the container more memory/swap to see if it helps?
Sometimes limited resources cause this kind of slowness.
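
As a quick check, a snippet like the one below (plain Python, no extra packages; the function name is just illustrative) shows how much memory and swap the container actually sees:

# Print the memory/swap visible inside the container by reading /proc/meminfo.
def print_memory_info():
    wanted = ("MemTotal", "MemAvailable", "SwapTotal", "SwapFree")
    with open("/proc/meminfo") as f:
        for line in f:
            if line.split(":")[0] in wanted:
                print(line.strip())

print_memory_info()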

Thanks.

Hello,
Thank you for your reply! I've tried mounting more memory and I still get the same results as before, for both containers. I don't have any other memory-hungry processes running while testing the containers.

Hi,

We would like to reproduce this issue in our environment.
Would you mind sharing a simple example that demonstrates the OpenCV functions you are using?

You can also try our l4t-ml container, which has CUDA-enabled OpenCV installed.
https://ngc.nvidia.com/catalog/containers/nvidia:l4t-ml

Thanks.

Hello,
Of course, here is what I use:

import cv2 as cv
import numpy as np
import non_max_suppression_fix  # my own NMS helper module, not part of OpenCV

class MyClass1:
    def __init__(self, face_cascade_path):
        self.face_cascade = cv.cuda_CascadeClassifier.create(face_cascade_path)
        self.face_cascade.setMinNeighbors(5)
        self.face_cascade.setMinObjectSize((30,30))
        self.face_cascade.setScaleFactor(1.3)

    def detect_face(self, frame):
        gpu_frame = cv.cuda_GpuMat(frame)  # upload the frame to GPU memory
        faces = self.face_cascade.detectMultiScale(gpu_frame).download()  # copy detections back to the CPU
        if faces is None:
            faces = np.empty(0)
        else:
            faces = faces[0]
        # non_max_suppression is a function I wrote myself; it works fine and does not require OpenCV. It could be omitted.
        faces = non_max_suppression_fix.non_max_suppression(faces, overlapThresh=0.3)
        return faces

class MyClass2:
    def __init__(self, history, varThreshold, threshHigh, threshLow):
        self.fgbg = cv.bgsegm.createBackgroundSubtractorMOG(history, varThreshold)
        self.threshHigh = threshHigh
        self.threshLow = threshLow

    def movement_detect(self, frame):
        height, width, _ = frame.shape
        fgmask = self.fgbg.apply(frame)
        nonZero = cv.countNonZero(fgmask)
        percent = nonZero / (height * width) * 100
        return percent

    def edge_detect(self, frame):
        frameCanny = cv.Canny(frame, self.threshLow, self.threshHigh)
        _, frameBin = cv.threshold(frameCanny, 100, 255, cv.THRESH_BINARY)
        frameDyl = cv.dilate(frameBin, cv.getStructuringElement(cv.MORPH_ELLIPSE, (5, 5)))
        return frameDyl
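
A minimal driver that exercises both classes looks roughly like this (the camera index, cascade path, and parameter values are only placeholders):

face_detector = MyClass1("haarcascade_frontalface_default.xml")
motion_detector = MyClass2(history=200, varThreshold=5, threshHigh=200, threshLow=100)

cap = cv.VideoCapture(0)
while True:
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv.cvtColor(frame, cv.COLOR_BGR2GRAY)  # the CUDA cascade expects a single-channel 8-bit image
    faces = face_detector.detect_face(gray)
    percent = motion_detector.movement_detect(frame)
    edges = motion_detector.edge_detect(frame)
cap.release()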

For the Haar cascade I use haarcascade_frontalface_default.xml, which can be found on GitHub.
I've built a container with OpenCV installed, but without the GPU module. There, just for the code above (without the Haar cascade), I got the same results as outside the container. Could it be an issue with CUDA?

Hi,

The above source uses cuda_GpuMat.
So you will need an OpenCV build with GPU support.
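
A quick way to confirm whether the OpenCV inside a container was built with CUDA support (just a sanity-check snippet):

import cv2 as cv

# If this prints 0, the cv2 build cannot see a CUDA device,
# and the cuda_* classes used above will not be usable.
print(cv.__version__)
print("CUDA-enabled devices:", cv.cuda.getCudaEnabledDeviceCount())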

Would you mind testing the sample with the l4t-ml container to see if the same issue occurs?

Thanks.

I did and I still have the same issue.

Hi,

Could you also try to reproduce this on our latest JetPack 4.6 (r32.6.1)?

Also, do you have any performance data for the inside/outside Docker use case?
This would help us know whether we can reproduce this issue locally.
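
For example, a simple timing sketch like the one below, run on the same input both inside and outside the container, would already be helpful (the frame here is just random data, so the numbers are only meaningful for comparison):

import time
import numpy as np
import cv2 as cv

# Time a couple of the CPU OpenCV calls from your example on a synthetic frame.
frame = np.random.randint(0, 256, (720, 1280, 3), dtype=np.uint8)
fgbg = cv.bgsegm.createBackgroundSubtractorMOG()

for name, func in [("Canny", lambda: cv.Canny(frame, 100, 200)),
                   ("MOG apply", lambda: fgbg.apply(frame))]:
    start = time.perf_counter()
    for _ in range(100):
        func()
    elapsed = time.perf_counter() - start
    print(f"{name}: {elapsed / 100 * 1000:.2f} ms per call")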

Thanks.