Best option(s) to decode camera mjpg frame in python

I am developing a Python application on a Jetson Nano.
Four 8MP (3264 x 2448) cameras are connected through a USB 3.0 hub.
I cannot configure all 4 cameras to stream YUYV frames, because the uncompressed streams do not fit in the USB 3.0 bus at the available frame rates.
So I need to configure the cameras for the compressed MJPG format and decompress the frames on the Nano for further processing (TensorRT model evaluation, etc.).
Since I only need to convert some of the frames from MJPG to BGR (around one every 2 seconds per camera), I don’t see GStreamer in OpenCV as an option.
Currently the only valid alternative I see is the OpenCV “imdecode” function. It takes around 114 ms per 8MP frame. Even when I compile OpenCV with CUDA support, it seems to use only the CPU.
The Jetson Linux Multimedia API offers a solution that uses the GPU for decoding (NvJPEGDecoder). I have not tested it, but I guess it will be faster and will use the GPU and fewer CPU resources.
I wonder if there is a Python API (wrapper) for the Multimedia API.
Is there any other option for MJPG decoding on the Nano?
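
For comparing decoders such as imdecode against alternatives, a small timing harness may help. This is only a sketch: time_decoder and opencv_decode are hypothetical helper names, the file name in the comment is a placeholder, and the OpenCV import is done lazily so the harness can time any decode function.

```python
import statistics
import time


def time_decoder(decode, jpeg_bytes, repeats=5):
    """Return the median wall-clock time, in milliseconds, of decode(jpeg_bytes)."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        decode(jpeg_bytes)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def opencv_decode(jpeg_bytes):
    """Decode one in-memory JPEG to a BGR array with cv2.imdecode (CPU path)."""
    # Imported here so the harness itself does not depend on OpenCV.
    import cv2
    import numpy as np

    buf = np.frombuffer(jpeg_bytes, dtype=np.uint8)
    return cv2.imdecode(buf, cv2.IMREAD_COLOR)


# Example use (placeholder file name):
#   with open("frame.jpg", "rb") as f:
#       print(time_decoder(opencv_decode, f.read()), "ms")
```

Using the median rather than the mean keeps a slow first call (cache warm-up, lazy imports) from dominating the result.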

Hey,

If you are using the L4T distro, you should have access to GStreamer within OpenCV. I just worked on a project using a camera with MJPEG.

If your cameras use the V4L2 API, then in OpenCV you could create a VideoCapture object using something like the following:

cv2.VideoCapture("v4l2src device=<your-video-device> io-mode=2 ! image/jpeg,width=<cam-capture-width>,height=<cam-capture-height>,framerate=<cam-capture-fps>/1 ! jpegparse ! nvv4l2decoder mjpeg=1 ! nvvidconv ! videorate ! video/x-raw,format=BGRx,framerate=<custom-framerate-for-app>/1 ! videoconvert ! video/x-raw,width=<scaled-image-width>,height=<scaled-image-height>,format=BGR ! appsink")

You may not need all elements of this pipeline, but we were capturing from a camera and pre-scaling the image down to make the processing easier. You can limit the framerate down as well.
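
To avoid editing the placeholders by hand, the string can be assembled from parameters. The sketch below mirrors the element order of the pipeline above; build_mjpeg_pipeline and its argument names are made up for illustration, and it has not been verified on a Nano (later in this thread the nvv4l2decoder element fails to allocate memory at 3264 x 2448).

```python
def build_mjpeg_pipeline(device, width, height, fps,
                         app_fps=None, out_width=None, out_height=None):
    """Build the GStreamer description string for cv2.VideoCapture.

    app_fps, out_width and out_height default to the capture values,
    which means no frame-rate reduction and no scaling.
    """
    app_fps = fps if app_fps is None else app_fps
    out_width = width if out_width is None else out_width
    out_height = height if out_height is None else out_height
    elements = [
        f"v4l2src device={device} io-mode=2",
        f"image/jpeg,width={width},height={height},framerate={fps}/1",
        "jpegparse",
        "nvv4l2decoder mjpeg=1",
        "nvvidconv",
        "videorate",
        f"video/x-raw,format=BGRx,framerate={app_fps}/1",
        "videoconvert",
        f"video/x-raw,width={out_width},height={out_height},format=BGR",
        "appsink",
    ]
    return " ! ".join(elements)


# Example use, assuming OpenCV was built with GStreamer support:
#   import cv2
#   cap = cv2.VideoCapture(build_mjpeg_pipeline("/dev/video0", 1280, 720, 30),
#                          cv2.CAP_GSTREAMER)
```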

One word of caution is that you need to use v4l2-ctl to pre-set the capture resolution and framerate and these need to match with the gstreamer pipeline.

I’m new to GStreamer, and it’s an insanely complex but powerful tool.

Hope this helps!

Thanks bhanner for your response,

I am using a Cython wrapper around the V4L2 libraries to manage the cameras, but I can also use OpenCV with GStreamer as you propose.
The problem I see with GStreamer is that you cannot decide asynchronously which frames to convert and which to skip.
In my application I am synchronizing the frame analysis with a motor movement. The plan is to collect and analyze the data from the camera board microphone to detect when the motor has stopped, and then take the camera frames whose timestamps fall just after the motor stop. Those are the only frames I need to convert from MJPG to BGR (one per camera), and I need them at full resolution to detect small artifacts in the image.
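
The selection step described here can be sketched as a small buffer of still-compressed frames, from which only the first frame after the motor stop is handed to the decoder. JpegFrameBuffer and its method names are hypothetical, and the sketch assumes frames are pushed in increasing timestamp order, on the same clock as the motor-stop event.

```python
from collections import deque


class JpegFrameBuffer:
    """Keep the most recent compressed frames with their capture timestamps."""

    def __init__(self, max_frames=40):
        # Each entry is (timestamp, jpeg_bytes); the oldest entry is dropped
        # automatically once max_frames is reached.
        self._frames = deque(maxlen=max_frames)

    def push(self, timestamp, jpeg_bytes):
        self._frames.append((timestamp, jpeg_bytes))

    def first_after(self, event_time):
        """Return the earliest buffered frame at or after event_time, or None."""
        for timestamp, jpeg_bytes in self._frames:
            if timestamp >= event_time:
                return timestamp, jpeg_bytes
        return None
```

With one buffer per camera, the capture loop only calls push(), and the JPEG decoder runs once per camera per motor stop, whichever decoder ends up being used.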

Anyway, I am testing your GStreamer string with OpenCV, but I get an error:

Opening in BLOCKING MODE
NvMMLiteOpen : Block : BlockType = 277
NVMEDIA: Reading vendor.tegra.display-size : status: 6
NvMMLiteBlockCreate : Block : BlockType = 277
nvbuf_utils: Invalid memsize=0
NvBufferCreateEx with memtag 5376 failed
[ WARN:0] global /home/printai/opencv/modules/videoio/src/cap_gstreamer.cpp (1757) handleMessage OpenCV | GStreamer warning: Embedded video playback halted; module nvv4l2decoder0 reported: Failed to allocate required memory.
[ WARN:0] global /home/printai/opencv/modules/videoio/src/cap_gstreamer.cpp (886) open OpenCV | GStreamer warning: unable to start pipeline
[ WARN:0] global /home/printai/opencv/modules/videoio/src/cap_gstreamer.cpp (480) isPipelinePlaying OpenCV | GStreamer warning: GStreamer: pipeline have not been created

Memory availability should not be a problem, as this is the only program running and it uses a single camera.
Did you run into this issue?

Interesting… I never had an issue with being able to allocate memory, but I am also only using a 720p camera so maybe the larger frames have something to do with it.

Can you post exactly what your string looks like?

Hi bhanner,

I did not change anything in your string, just set the parameters with no subsampling. Here it is:
"v4l2src device=/dev/video0 io-mode=2 ! image/jpeg, width=3264,height=2448,framerate=20/1 ! jpegparse ! nvv4l2decoder mjpeg=1 ! nvvidconv ! videorate ! video/x-raw,format=BGRx,framerate=20/1 ! videoconvert ! video/x-raw,width=3264,height=2448,format=BGR ! appsink"

I found a similar issue:


Where they propose a patch:
https://forums.developer.nvidia.com/uploads/short-url/4G5XLdepLt247pmBbLTp2rIw2n.zip

I replaced /usr/lib/aarch64-linux-gnu/tegra/libnvtvmr.so with the file in the patch, but I get the same error.
I am not sure whether I need to recompile OpenCV for the patch to take effect.

I had to replace libnvtvmr.so as well, but in my case it was because of color space issues. You shouldn’t have to recompile to use it. Have you made sure to run v4l2-ctl’s --set-fmt-video command to select the resolution and pixel format from the cam, and --set-parm to select the framerate?

v4l2-ctl --set-fmt-video=width=<your-desired-width>,height=<your-desired-height>,pixelformat=<your-desired-format>

v4l2-ctl --set-parm=<your-framerate>

You should see v4l2-ctl confirm that a framerate was set.
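
If the camera setup should live in the Python application rather than in a shell session, the two commands above can be issued with subprocess. This is a sketch: the function names are made up, "MJPG" is assumed to be the pixel format wanted here, and -d is v4l2-ctl's device-selection option.

```python
import subprocess


def v4l2_setup_commands(device, width, height, fps, pixelformat="MJPG"):
    """Return the two v4l2-ctl invocations as argument lists."""
    return [
        ["v4l2-ctl", "-d", device,
         f"--set-fmt-video=width={width},height={height},pixelformat={pixelformat}"],
        ["v4l2-ctl", "-d", device, f"--set-parm={fps}"],
    ]


def configure_camera(device, width, height, fps, pixelformat="MJPG"):
    """Run both commands; raises CalledProcessError if v4l2-ctl fails."""
    for cmd in v4l2_setup_commands(device, width, height, fps, pixelformat):
        subprocess.run(cmd, check=True)


# Example use, before opening the GStreamer pipeline:
#   configure_camera("/dev/video0", 3264, 2448, 20)
```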

Hi,

Yes, I configured the camera with v4l2-ctl from the command line before running the Python script, as you suggested.
I verified it again and get the same error.

Hi @pedraza.salvador

I asked the question on the post that you referred to. You definitely do not have to recompile anything. Maybe you need a reboot, but nothing else should be needed.

One thing I found out when trying out the cameras was that if you wanted to use more than two USB cameras, they had to be USB3.0 cameras. I see that you are using a USB3.0 hub, but the cameras might also need to be USB3.0. I just recall that I had some sort of memory allocation error when trying to use more than two USB2.0 cameras.

I am also using four cameras on my setup, but I use two USB2.0 cameras and two MIPI CSI cameras. The gstreamer pipeline is different for the MIPI CSI cameras, but they work pretty well for that. This post has my current successes with that setup if this is something you can do.

I’m pretty sure I also saw something about USB hubs needing to be separately powered. But that’s only a problem if your cameras draw too much power, which I don’t think is the case here.

I wish I could find the post about four USB2.0 cameras not working, but I can’t seem to find it. I definitely recall that I had to limit myself to 2 USB2.0 cameras. I never got to verify whether 4 USB3.0 cameras would work, though, as using the MIPI CSI cameras was fine for me.

Hi @warpstar22,

Thanks for the info. I restarted the Nano, but I still have the same problem.

Yes, I use USB 3.0 cameras; otherwise I could not accommodate four 8MP cameras. Power is not a problem, as they consume very little.
The 4 cameras are working fine, but I cannot configure all of them for the YUYV format because the streams do not fit in the USB 3.0 bandwidth. That is why I need to convert MJPG frames to BGR efficiently.
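
The bandwidth limit can be checked with a little arithmetic. The sketch below assumes YUYV at 2 bytes per pixel, all cameras sharing one 5 Gbit/s USB 3.0 link through the hub, and no protocol overhead, so the real limit is lower than what it reports. The function names are made up for illustration.

```python
USB3_SIGNALING_BITS_PER_S = 5_000_000_000  # USB 3.0 raw line rate
YUYV_BYTES_PER_PIXEL = 2


def yuyv_bits_per_second(width, height, fps, cameras=1):
    """Raw payload rate of uncompressed YUYV video, in bits per second."""
    return width * height * YUYV_BYTES_PER_PIXEL * 8 * fps * cameras


def max_yuyv_fps(width, height, cameras,
                 link_bits_per_s=USB3_SIGNALING_BITS_PER_S):
    """Upper bound on per-camera fps if the link carried only pixel data."""
    return link_bits_per_s / yuyv_bits_per_second(width, height, 1, cameras)


# Four 3264 x 2448 cameras on one link:
print(round(max_yuyv_fps(3264, 2448, cameras=4), 1), "fps upper bound per camera")
```

For four 3264 x 2448 cameras the bound comes out just under 10 fps per camera, before line encoding and protocol overhead are subtracted.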

Did you try what I said about using I420:

I see that you are using videoconvert and that is extremely slow. It’s slow because it copies from GPU memory to CPU memory. This will clog up the gstreamer queue. Since I was going for instantaneous captures, I had to process 7-12 frames. That’s a lot waiting in the queue. If you are using OpenCV, your pipeline would look like the following:

And then you use the following to convert to BGR as that is the required format for processing in OpenCV:
cv::cvtColor(image, _image, cv::COLOR_YUV2BGR_I420)
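
In Python the same conversion is cv2.cvtColor with cv2.COLOR_YUV2BGR_I420. A sketch, assuming the I420 frame arrives as a single-channel uint8 array with 1.5 times as many rows as the image height, which is the layout that conversion code expects; i420_shape and i420_to_bgr are hypothetical helper names.

```python
def i420_shape(width, height):
    """Shape (rows, cols) of the single-channel array holding one I420 frame."""
    if width % 2 or height % 2:
        raise ValueError("I420 needs even width and height")
    # Full-resolution Y plane, followed by quarter-resolution U and V planes.
    return (height * 3 // 2, width)


def i420_to_bgr(frame):
    """Convert an I420 frame (see i420_shape) to a height x width x 3 BGR array."""
    import cv2  # imported lazily; only needed for the conversion itself

    return cv2.cvtColor(frame, cv2.COLOR_YUV2BGR_I420)


# Example use with a frame read from a cv2.VideoCapture that delivers I420:
#   ok, frame = cap.read()
#   if ok and frame.shape == i420_shape(3264, 2448):
#       bgr = i420_to_bgr(frame)
```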

Also, I would recommend you use cv::VideoCapture::grab(), as the documentation says:

The primary use of the function is in multi-camera environments, especially when the cameras do not have hardware synchronization. That is, you call VideoCapture::grab() for each camera and after that call the slower method VideoCapture::retrieve() to decode and get frame from each camera. This way the overhead on demosaicing or motion jpeg decompression etc. is eliminated and the retrieved frames from different cameras will be closer in time.
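
In Python, the pattern from the quoted documentation could look like the sketch below. grab_then_retrieve is a made-up helper name; it only relies on the grab() and retrieve() methods of cv2.VideoCapture and returns None for a camera whose grab or retrieve failed.

```python
def grab_then_retrieve(captures):
    """Grab on every capture first, then do the slower retrieve on each."""
    # Phase 1: latch a frame on every camera, as close together in time as possible.
    grabbed = [cap.grab() for cap in captures]

    # Phase 2: decode and fetch the latched frames.
    frames = []
    for cap, ok in zip(captures, grabbed):
        if not ok:
            frames.append(None)
            continue
        ok, frame = cap.retrieve()
        frames.append(frame if ok else None)
    return frames


# Example use:
#   import cv2
#   caps = [cv2.VideoCapture(i) for i in range(4)]
#   frames = grab_then_retrieve(caps)
```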

Also, you may have to use 1080p rather than 4K resolution. That would also free up memory.

Although @warpstar22’s advice is good, be aware that feeding an OpenCV appsink through OpenCV videoio may not be efficient on Jetson. You may get much better performance with jetson-utils.

I have not tried it with Python, but you may give the latest version of jetson-utils a try, as it supports MJPG sources.

You may have a look at this topic or at detectnet.py for a Python starting point.

Hi @warpstar22,

Thanks for your proposal. I did not try I420 before.
Your GStreamer pipeline works, except for an error with flip:
open OpenCV | GStreamer warning: Error opening bin: could not set property “flip-method” in element “nvvconv0” to “M”
With that part removed, it works fine.
Converting to BGR with cvtColor afterwards also works fine.

However, looking at the jtop monitor, MJPG decoding to I420 seems to be done on the CPU, with no use of the GPU or the dedicated decoding HW.

Reducing the camera resolution (4K to 1K) is not an option for me. I need full resolution for my application.

I will investigate further this option.

Hi @Honey_Patouceul,

Thanks for your tips.
jetson.utils seems to embed JPEG decoding in its camera capture functions.
However, I need a separate function, callable from Python, that converts a frame in memory from JPEG to BGR (or any other uncompressed format).

Camera frame capture is done in my application with a Python wrapper around the V4L2 Linux library.
With this library I can collect any frame at any specific timestamp.
What I am missing is an efficient way to convert the desired frames from JPEG (as provided by the camera) to BGR using the GPU and/or dedicated HW.

Any additional advice?

Hi,
Please check if the software decoder jpegdec works:

v4l2src device=/dev/video0 io-mode=2 ! image/jpeg, width=3264,height=2448,framerate=20/1 ! jpegparse ! jpegdec ! videoconvert ! video/x-raw,width=3264,height=2448,format=BGR ! appsink

Hi @pedraza.salvador,

In my pipeline, flip-method=M was a placeholder: M (“method”) should be a value from 0 to 8, as mentioned in the documentation. It just reorients the image and isn’t necessary.

Glad to hear the pipeline works otherwise.

Hi @warpstar22,

Sorry, that makes sense.

Thanks

I finally resolved my issue by writing a Cython wrapper for NvJPEGDecoder (part of the Jetson Multimedia API).
I also modified the C++ libraries to use CUDA-compatible buffers in the NvBuffer used in the decoding call.
With this solution it takes around 50 ms to decode an 8MP JPEG image, versus the 114 ms that it takes using OpenCV.
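
To put the two timings in context for this workload (4 cameras, about one decoded frame every 2 seconds per camera), here is a back-of-the-envelope sketch. decode_load is a made-up helper, and it assumes frames are decoded one at a time.

```python
def decode_load(ms_per_frame, cameras=4, seconds_between_frames=2.0):
    """Fraction of wall-clock time spent decoding, with serial decoding."""
    frames_per_second = cameras / seconds_between_frames
    return frames_per_second * ms_per_frame / 1000.0


for name, ms in (("OpenCV imdecode", 114), ("NvJPEGDecoder wrapper", 50)):
    print(f"{name}: {decode_load(ms):.1%} of the time spent decoding")
```

By this estimate, decoding takes roughly 23% of the time with imdecode and 10% with the wrapped NvJPEGDecoder.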