Jetson Nano with gstreamer and opencv - High overhead

Hi,

For a project I want to stream video from an MP1110m-vc USB camera to a Jetson Nano, then process it with OpenCV in Python.

I am currently using the following GStreamer pipeline with OpenCV:

"v4l2src device=/dev/video2 ! video/x-raw, width=(int)1920, height=(int)1080, format=(string)YUY2, framerate=(fraction)30/1 ! videoconvert ! video/x-raw, format=(string)BGR ! appsink max-buffers=1 drop=true"

While this works, it uses around 125% CPU just to stream 1080p video into OpenCV and display it.

Is there a way to optimise this, or a more efficient way to do it?

Hi,
This is the solution on Jetson platforms. Please refer to the discussion in the linked thread.


You may save some CPU usage by using nvvidconv to perform the YUY2 to BGRx conversion, and videoconvert just for BGRx to BGR. It may, however, add an extra copy and increase latency.
You could try:

"v4l2src device=/dev/video2 ! video/x-raw, width=(int)1920, height=(int)1080, format=(string)YUY2, framerate=(fraction)30/1 ! nvvidconv ! video/x-raw(memory:NVMM) ! nvvidconv ! video/x-raw, format=(string)BGRx ! videoconvert ! video/x-raw, format=(string)BGR ! appsink max-buffers=1 drop=true"

Thank you for the suggestions.

The pipeline suggested by @Honey_Patouceul decreases the overhead slightly, but it is still pretty high.

I tried to match some of the suggestions from the thread mentioned by @DaneLLL, but they all failed. I am pretty new to GStreamer, so I am not sure why they don't work:

"v4l2src device=/dev/video2 ! video/x-raw, width=(int)1920, height=(int)1080, format=(string)YUY2, framerate=(fraction)30/1 ! nvvidconv ! video/x-raw, format=(string)I420 ! appsink max-buffers=1 drop=true"

and

"v4l2src device=/dev/video2 ! video/x-raw, width=(int)1920, height=(int)1080, format=(string)YUY2, framerate=(fraction)30/1 ! nvvidconv ! video/x-raw, format=(string)RGBA ! appsink max-buffers=1 drop=true"

and

"v4l2src device=/dev/video2 ! video/x-raw, width=(int)1920, height=(int)1080, format=(string)YUY2, framerate=(fraction)30/1 ! appsink max-buffers=1 drop=true"

All of them fail to start, and I am not sure whether they would even improve performance.

[UPDATE]: Actually, the last pipeline works, and reduces the overhead to around 75% (with an added cv2.cvtColor(img, cv2.COLOR_YUV2BGR_YUY2)). I am open to further suggestions, but this is at least a lot better than 125%.

If it gets better with your last pipeline, you may try the V4L API instead:

cap = cv2.VideoCapture(2, cv2.CAP_V4L)   # Open /dev/video2 with V4L API

You may also have to check whether the RGB conversion is done automatically:

conv = cap.get(cv2.CAP_PROP_CONVERT_RGB)

…but it may also depend on your processing loop.


Thanks, that pipeline achieves a very similar overhead of around 75%, but with less code.

75% is acceptable for my application at this point. Thanks a lot for the help!
