Delay with GStreamer in OpenCV 3.4.2

I'm working on a Jetson TX2 with Ubuntu; my OpenCV version is 3.4.2 with GStreamer support. I made a script in Python that accesses the on-board camera through GStreamer. The FPS I get when running the app is as expected, but I have 1-2 seconds of delay. I think this is because of my GStreamer pipeline, but I'm not sure if it can be done in a better way. An extract of the code is:

cap = cv2.VideoCapture('nvcamerasrc ! '
               'video/x-raw(memory:NVMM), '
               'width=(int)1280, height=(int)720, '
               'format=(string)I420, framerate=(fraction)120/1 ! '
               'nvvidconv ! '
               'video/x-raw, width=(int)1280, height=(int)720, '
               'format=(string)BGRx ! '
               'videoconvert ! appsink')
#cap = cv2.VideoCapture("test.mp4")
#cap.set(3, 1280)
#cap.set(4, 720)
"""out = cv2.VideoWriter(
    "output.avi", cv2.VideoWriter_fourcc(*"MJPG"), 10.0,
    (lib.network_width(netMain), lib.network_height(netMain)))"""
print("Starting the YOLO loop...")
while True:
    prev_time = time.time()
    ret, frame_read = cap.read()
    frame_rgb = cv2.cvtColor(frame_read, cv2.COLOR_BGR2RGB)
    frame_resized = cv2.resize(frame_rgb,
                               (lib.network_width(netMain),
                                lib.network_height(netMain)))
    detections = detect(netMain, metaMain, frame_resized, thresh=0.25)
    image = cvDrawBoxes(detections, frame_resized)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    cv2.imshow('video', image)
    key = cv2.waitKey(1000 // 30)  # waitKey takes an int (milliseconds)
    if key == 27:  # ESC key: quit program
        break

Does anyone know what is causing the delay? I get about 10 FPS, which I think is normal for a neural-network app.
Thanks for the help.

Please refer to

The OpenCV camera frame buffer size is 10; it is hard-coded in the OpenCV source code.
In your code, cv2.imshow() runs at about 10 FPS.
Therefore, the latest frame is displayed only after the 10 buffered frames before it: 10 frames / 10 FPS = 1 second of delay.

There are several solutions:

1. Read the latest frame in the background.

2. Set the capture buffer size:

cap.set(cv2.CAP_PROP_BUFFERSIZE, 1)  # internal buffer will now store only 1 frame

3. Edit the OpenCV source code and rebuild, changing

#define MAX_V4L_BUFFERS 10

to

#define MAX_V4L_BUFFERS 1

I have chosen to process in the background (see the code I post below).

Hi DaneLLL:

I tried the pipeline code in simple_opencv_33.cpp, which is:

"nvcamerasrc ! video/x-raw(memory:NVMM), width=1280, height=720,format=NV12, framerate=30/1 ! nvvidconv ! video/x-raw,format=I420 ! appsink"

And it gives me the following error:

terminate called after throwing an instance of 'cv::Exception'
  what():  OpenCV(3.4.2) /home/nvidia/opencv/modules/imgproc/src/resize.cpp:4083: error: (-215:Assertion failed) src.type() == dst.type() in function 'cvResize'

Aborted (core dumped)

Should I transform the image format inside my code once I get a frame?

PS: I was able to reduce the delay to less than 1 second by adding this at the end of my original pipeline:

drop=true sync=false
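
So my capture call now looks like this (the same pipeline as before, with the two appsink properties appended at the end):

cap = cv2.VideoCapture('nvcamerasrc ! '
               'video/x-raw(memory:NVMM), '
               'width=(int)1280, height=(int)720, '
               'format=(string)I420, framerate=(fraction)120/1 ! '
               'nvvidconv ! '
               'video/x-raw, width=(int)1280, height=(int)720, '
               'format=(string)BGRx ! '
               'videoconvert ! appsink drop=true sync=false')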

Thanks for the help. I will try naisy's solutions later.

Hi naisy:

I tried your second solution and added this to my code:

cap = cv2.VideoCapture('nvcamerasrc ! video/x-raw(memory:NVMM), width=(int)1280, height=(int)720, format=(string)I420, framerate=(fraction)120/1 ! nvvidconv ! video/x-raw, width=(int)1280, height=(int)720, format=(string)BGRx ! videoconvert ! appsink drop=true sync=false')
if cap.set(cv2.CAP_PROP_BUFFERSIZE, 1):
    print("Buffer size changed")
else:
    print("Can not change the buffer size")

The if statement is supposed to print "Buffer size changed" if the property was set, but cap.set() is returning False and I don't know why. Any clue?

I have also tried modifying the OpenCV source code as you suggested in the third option, but I didn't see any improvement. Do I have to do a simple make after changing the code, or a make clean and then make?

Since I may process camera frames with UDP streaming, the delay can grow to more than 10 seconds.
Therefore, I'm using background processing to solve the delay problem.
I prepared some simple code. Try this.
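The idea: a background thread keeps overwriting self.frame with the newest capture, so read() always returns the most recent frame and stale frames are discarded instead of queuing up.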

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import cv2
import threading

class WebcamVideoStream:

    def __init__(self):
        self.vid = None
        self.running = False

    def __del__(self):
        if self.vid is not None and self.vid.isOpened():
            self.vid.release()

    def start(self, src, width=None, height=None):
        # initialize the video camera stream and read the first frame
        self.vid = cv2.VideoCapture(src)
        if not self.vid.isOpened():
            # camera failed
            raise IOError(("Couldn't open video file or webcam."))
        if width is not None and height is not None:
            self.vid.set(cv2.CAP_PROP_FRAME_WIDTH, width)
            self.vid.set(cv2.CAP_PROP_FRAME_HEIGHT, height)
        self.ret, self.frame = self.vid.read()
        if not self.ret:
            raise IOError(("Couldn't open video frame."))

        # check camera vid shape
        self.real_width = int(self.vid.get(3))
        self.real_height = int(self.vid.get(4))
        print("Start video stream with shape: {},{}".format(self.real_width, self.real_height))
        # initialize the variable used to indicate if the thread should keep running
        self.running = True

        # start the thread to read frames from the video stream
        t = threading.Thread(target=self.update, args=())
        t.daemon = True  # don't keep the process alive after main() exits
        t.start()
        return self

    def update(self):
        try:
            # keep looping infinitely until the stream is closed
            while self.running:
                # read the next frame from the stream
                self.ret, self.frame = self.vid.read()
        except Exception:
            import traceback
            traceback.print_exc()
        finally:
            # clear the thread indicator variable to stop the thread
            self.running = False
    def read(self):
        # return the frame most recently read
        return self.frame

    def stop(self):
        self.running = False
        if self.vid.isOpened():
            self.vid.release()

def main():

    input_src = "nvcamerasrc ! video/x-raw(memory:NVMM), width=(int)1280, height=(int)720, format=(string)I420, framerate=(fraction)120/1 ! nvvidconv ! video/x-raw, width=(int)1280, height=(int)720, format=(string)BGRx ! videoconvert ! appsink"

    video_reader = WebcamVideoStream()

    video_reader.start(input_src)
    try:
        while video_reader.running:
            frame = video_reader.read()
            cv2.imshow('video', frame)
            if cv2.waitKey(1) & 0xFF == 27: # ESC key: quit program
                break
    finally:
        video_reader.stop()
        cv2.destroyAllWindows()

if __name__ == '__main__':
    main()

Thanks @naisy for sharing your experience.

Hi Pablo,
From OpenCV 3.3 on, appsink supports I420. It looks like you should modify this line:

frame_rgb = cv2.cvtColor(frame_read, cv2.COLOR_BGR2RGB)

to

cvtColor(frame, bgr, CV_YUV2BGR_I420);

We don’t have experience in running Python, but theoretically it is the same as the cpp code.
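
In Python it would presumably look like this (an untested sketch following the cpp sample; the pipeline is the one from simple_opencv_33.cpp):

cap = cv2.VideoCapture('nvcamerasrc ! video/x-raw(memory:NVMM), width=1280, height=720, format=NV12, framerate=30/1 ! nvvidconv ! video/x-raw, format=I420 ! appsink')
ret, frame_read = cap.read()  # I420 arrives as a single-channel image with height*3/2 rows
frame_bgr = cv2.cvtColor(frame_read, cv2.COLOR_YUV2BGR_I420)  # Python equivalent of CV_YUV2BGR_I420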

Hi DaneLLL:

I modified my pipeline as you suggested and changed the color-conversion line to:

frame_rgb = cv2.cvtColor(frame_read, cv2.COLOR_YUV2RGB_I420)

But the delay was huge, like in the beginning. Then I added these parameters to the new pipeline:

format=I420 ! appsink drop=true sync=false

The huge delay disappeared. Why is this happening? What are these two parameters doing? And why don't I see a performance improvement just from avoiding the ‘videoconvert’ bottleneck?

I will try naisy's code later. Thanks all for the help.

Hi Pablo,
appsink is not developed by NVIDIA. You can check
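
Regarding drop=true sync=false, per the standard GStreamer property descriptions they behave roughly like this:

# appsink drop=true sync=false
#
# drop=true  : appsink discards old buffers once its internal queue fills up,
#              so the application always pulls a recent frame rather than a stale one.
# sync=false : the sink does not pace buffers against the pipeline clock;
#              frames are handed to the application as soon as they arrive.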

Also, you can try running at max performance.