About using the dnn module of OpenCV


When I ran the following example, which uses the OpenCV dnn module, on a Jetson Nano, CPU idle was 11.

Is this expected?
Is there any way to improve the performance?

#include "opencv2/opencv.hpp"
#include <iostream>

using namespace cv;
using namespace cv::dnn;
using namespace std;

// const String model = "res10_300x300_ssd_iter_140000_fp16.caffemodel";
// const String config = "deploy.prototxt";
const String model = "opencv_face_detector_uint8.pb";
const String config = "opencv_face_detector.pbtxt";

int main(void)
{
    const std::string videoStreamAddress = "rtsp://root:root@";

    VideoCapture cap(videoStreamAddress);
    if (!cap.isOpened()) {
        cerr << "Camera open failed!" << endl;
        return -1;
    }

    Net net = readNet(model, config);
    if (net.empty()) {
        cerr << "Net open failed!" << endl;
        return -1;
    }

    Mat frame;
    while (true) {
        cap >> frame;
        if (frame.empty())
            break;  // stream ended or a frame could not be read

        // Resize to 300x300 and subtract the per-channel mean.
        Mat blob = blobFromImage(frame, 1, Size(300, 300), Scalar(104, 177, 123));
        net.setInput(blob);
        Mat res = net.forward();

        // View the 4-D output as an N x 7 matrix: one detection per row.
        Mat detect(res.size[2], res.size[3], CV_32FC1, res.ptr<float>());

        for (int i = 0; i < detect.rows; i++) {
            float confidence = detect.at<float>(i, 2);
            if (confidence < 0.5)
                continue;  // skip low-confidence detections

            // Box coordinates are normalized to [0, 1]; scale to pixels.
            int x1 = cvRound(detect.at<float>(i, 3) * frame.cols);
            int y1 = cvRound(detect.at<float>(i, 4) * frame.rows);
            int x2 = cvRound(detect.at<float>(i, 5) * frame.cols);
            int y2 = cvRound(detect.at<float>(i, 6) * frame.rows);

            rectangle(frame, Rect(Point(x1, y1), Point(x2, y2)), Scalar(0, 255, 0));

            String label = format("Face: %4.3f", confidence);
            putText(frame, label, Point(x1, y1 - 1), FONT_HERSHEY_SIMPLEX, 0.8, Scalar(0, 255, 0));
        }

        imshow("frame", frame);
        if (waitKey(1) == 27)
            break;  // exit on ESC
    }

    return 0;
}


Thank you.


By default, OpenCV uses ffmpeg for video input, which is a CPU-based decoder.
To improve this, you can first try using GStreamer for input.

If OpenCV is not essential, we recommend using our DeepStream SDK.
It provides an optimized camera pipeline as well as TensorRT-based inference.




By "input", do you mean cv::VideoCapture(input, …)?
Are you telling me to use GStreamer instead of RTSP here?

Thank you.


AFAIK, OpenCV's VideoCapture uses ffmpeg for input decoding.
Please correct me if this is no longer true.

Our suggestion is to use GStreamer instead of ffmpeg to get hardware-accelerated decoding. The source is still RTSP; only the decoding backend changes.
It will look like this:

import cv2

uri = "rtsp://root:root@"
rtsp_latency, image_width, image_height = 200, 1280, 720  # example values; adjust for your camera
gst_str = ("rtspsrc location={} latency={} ! rtph264depay ! h264parse ! omxh264dec ! nvvidconv ! video/x-raw, width=(int){}, height=(int){}, format=(string)BGRx ! videoconvert ! appsink sync=false").format(uri, rtsp_latency, image_width, image_height)
cap = cv2.VideoCapture(gst_str, cv2.CAP_GSTREAMER)
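
Since the program in the question is written in C++, here is a rough C++ sketch of the same idea. It only builds the pipeline string; the helper name buildRtspPipeline and its parameter names are made up for illustration and are not part of OpenCV or GStreamer. The pipeline elements are copied from the Python snippet above, so whether they work depends on the JetPack and GStreamer versions installed.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not an OpenCV API): assembles the GStreamer pipeline
// string from the reply above for an H.264 RTSP source.
// latencyMs, width and height are caller-chosen values (assumptions here).
std::string buildRtspPipeline(const std::string& uri, int latencyMs,
                              int width, int height)
{
    std::ostringstream ss;
    ss << "rtspsrc location=" << uri << " latency=" << latencyMs
       << " ! rtph264depay ! h264parse ! omxh264dec ! nvvidconv"
       << " ! video/x-raw, width=(int)" << width
       << ", height=(int)" << height
       << ", format=(string)BGRx ! videoconvert ! appsink sync=false";
    return ss.str();
}
```

In the original program, the VideoCapture line would then become something like cv::VideoCapture cap(buildRtspPipeline(videoStreamAddress, 200, 1280, 720), cv::CAP_GSTREAMER); where 200, 1280 and 720 are example values only. This assumes OpenCV was built with GStreamer support; if it was not, cap.isOpened() will return false.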