OpenCV VideoCapture of RAW data

Hello,

I want to process my raw image data with Python OpenCV, but I no longer know how to proceed.

I can save raw images:

v4l2-ctl -d /dev/video0 --set-fmt-video=width=3856,height=2176,pixelformat=RG12 --set-ctrl bypass_mode=0 --stream-mmap --stream-count=1 --stream-to=test.raw

I can display the preprocessed video stream

gst-launch-1.0 nvarguscamerasrc ! 'video/x-raw(memory:NVMM),width=3856,height=2176,framerate=30/1,format=NV12' ! queue ! nv3dsink

and I can display the preprocessed video stream in Python with OpenCV

import cv2

cap = cv2.VideoCapture("nvarguscamerasrc ! video/x-raw(memory:NVMM), width=(int)3856, height=(int)2176,format=(string)NV12, framerate=(fraction)30/1 ! nvvidconv ! video/x-raw, format=(string)BGRx ! videoconvert !  appsink")

if cap.isOpened():
    cv2.namedWindow("demo", cv2.WINDOW_AUTOSIZE)
    while True:
        ret_val, img = cap.read()
        if not ret_val:  # stop when no frame could be read
            break
        cv2.imshow("demo", img)
        if cv2.waitKey(10) == 27:  # Esc quits the loop
            break
    cap.release()
    cv2.destroyAllWindows()

but how do I get access to the raw data to display it? (without using the SW ISP)

I thought about using v4l2src, but whenever I tried a command, it only showed:

$ gst-launch-1.0 -v v4l2src device="/dev/video0" ! "video/x-raw, width=3856, height=2176, format=(string)NV12" ! fakesink
Setting pipeline to PAUSED ...
Pipeline is live and does not need PREROLL ...
Setting pipeline to PLAYING ...
ERROR: from element /GstPipeline:pipeline0/GstV4l2Src:v4l2src0: Internal data stream error.
Additional debug info:
gstbasesrc.c(3072): gst_base_src_loop (): /GstPipeline:pipeline0/GstV4l2Src:v4l2src0:
streaming stopped, reason not-negotiated (-4)
Execution ended after 0:00:00.000068064
Setting pipeline to NULL ...
Freeing pipeline ...
ioctl: VIDIOC_ENUM_FMT
	Type: Video Capture

	[0]: 'RG12' (12-bit Bayer RGRG/GBGB)
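
A note on the output above: the device lists 'RG12' as its only capture format, so caps that request NV12 from v4l2src cannot be negotiated, which matches the "not-negotiated" error. The four characters are the FourCC that V4L2 packs into a 32-bit pixelformat value. The following is a minimal sketch of that packing in plain Python; the function names are hypothetical helpers, not part of any V4L2 binding.

```python
def fourcc_to_int(code: str) -> int:
    """Pack four characters the way the V4L2 v4l2_fourcc() macro does."""
    if len(code) != 4:
        raise ValueError("a FourCC has exactly four characters")
    a, b, c, d = (ord(ch) for ch in code)
    # First character ends up in the least significant byte.
    return a | (b << 8) | (c << 16) | (d << 24)


def int_to_fourcc(value: int) -> str:
    """Unpack a 32-bit pixelformat value back into its four characters."""
    return "".join(chr((value >> shift) & 0xFF) for shift in (0, 8, 16, 24))


print(hex(fourcc_to_int("RG12")))     # 0x32314752
print(int_to_fourcc(0x32314752))      # RG12
```

A helper like int_to_fourcc() can also turn a numeric pixelformat, as printed by a capture script, back into a readable name.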

I would be very grateful for any advice

Hi,
Using nvarguscamerasrc lets you use the hardware ISP engine. If you don’t go this route, you would need to implement a debayering function yourself.

A user has shared a solution. You can check and see if it can be applied to your use-case:
Gstreamer with GPU implementation of bayer2rgb - #4 by thanhnha
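
To illustrate what a debayering function does, here is a minimal CPU sketch in NumPy. It is not the GPU implementation from the linked thread. It assumes the RGRG/GBGB layout that the driver reports for 'RG12', and it simply collapses every 2x2 cell into one colour pixel, so the output has half the width and height. A sensor with a different colour filter layout would need a different mapping. The function name is hypothetical.

```python
import numpy as np


def debayer_half_res(raw: np.ndarray) -> np.ndarray:
    """Collapse each 2x2 RGGB cell of a Bayer mosaic into one BGR pixel.

    Assumption: row 0 is R G R G ..., row 1 is G B G B ...
    """
    height, width = raw.shape
    # Drop a trailing odd row/column and widen so the green sum cannot overflow.
    cells = raw[: height - height % 2, : width - width % 2].astype(np.uint32)
    red = cells[0::2, 0::2]
    green = (cells[0::2, 1::2] + cells[1::2, 0::2]) // 2  # average both greens
    blue = cells[1::2, 1::2]
    # OpenCV display functions expect channel order B, G, R.
    return np.dstack((blue, green, red)).astype(np.uint16)


mosaic = np.array([[10, 20],
                   [30, 40]], dtype=np.uint16)
print(debayer_half_res(mosaic))  # [[[40 25 10]]]
```

Full-resolution demosaicing interpolates the missing colours per pixel instead of merging cells, which costs noticeably more CPU time.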

Hi,

I was not able to get it to work with “tiscamera”. Could it be that it only supports USB and Ethernet cameras? (I forgot to mention that I use a MIPI CSI-2 camera with a serialiser and deserialiser.)

Nevertheless, I found two other solutions.

First solution, directly in the terminal:

v4l2-ctl -d /dev/video0 --set-ctrl=bypass_mode=0, --set-fmt-video=width=3856,height=2176,pixelformat=RG12 --stream-mmap --stream-to=- | gst-launch-1.0 filesrc location=/dev/stdin blocksize=$(expr 3872 \* 2176 \* 2) ! 'video/x-raw,format=GRAY16_LE,width=3872,height=2176,framerate=30/1' ! queue ! videoconvert ! xvimagesink
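
The command requests a width of 3856 but tells GStreamer the frames are 3872 pixels wide. One way to reproduce that number is to round the line length up to a byte alignment. The sketch below shows the arithmetic; the 64-byte alignment is an assumption inferred from the numbers in this thread, not something the thread states.

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


WIDTH, HEIGHT = 3856, 2176
BYTES_PER_PIXEL = 2   # each 12-bit sample is stored in a 16-bit word
ALIGNMENT = 64        # assumption: line stride is padded to 64 bytes

stride_bytes = align_up(WIDTH * BYTES_PER_PIXEL, ALIGNMENT)
padded_width = stride_bytes // BYTES_PER_PIXEL
frame_bytes = stride_bytes * HEIGHT

print(stride_bytes)   # 7744
print(padded_width)   # 3872
print(frame_bytes)    # 16850944, same as 3872 * 2176 * 2
```

Under that assumption the last 16 pixels of every line are padding rather than image data.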

Second solution, with Python:
(Python v4l2 webcam capture test using PlayStation 3 camera. More advanced script can be found here: https://github.com/eik-robo/zoidberg/blob/master/examples/purepy_video_capture.py)

#!/usr/bin/env python3

from v4l2 import * # v4l2-python3
import fcntl
import mmap
import select
import time
import struct
import os
import cv2
import numpy as np

os.system('v4l2-ctl --device /dev/video0 --set-ctrl preferred_stride=7744')

vd = os.open('/dev/video0', os.O_RDWR, 0)

print(">> get device capabilities")
cp = v4l2_capability()
fcntl.ioctl(vd, VIDIOC_QUERYCAP, cp)

print("Driver:", "".join((chr(c) for c in cp.driver)))
print("Name:", "".join((chr(c) for c in cp.card)))
print("Is a video capture device?", bool(cp.capabilities & V4L2_CAP_VIDEO_CAPTURE))
print("Supports read() call?", bool(cp.capabilities &  V4L2_CAP_READWRITE))
print("Supports streaming?", bool(cp.capabilities & V4L2_CAP_STREAMING))

print(">> device setup")
fmt = v4l2_format()
fmt.type = V4L2_BUF_TYPE_VIDEO_CAPTURE
fcntl.ioctl(vd, VIDIOC_G_FMT, fmt)  # get current settings
print("width:", fmt.fmt.pix.width, "height", fmt.fmt.pix.height)
print("pxfmt:", "V4L2_PIX_FMT_YUYV" if fmt.fmt.pix.pixelformat == V4L2_PIX_FMT_YUYV else fmt.fmt.pix.pixelformat)
print("bytesperline:", fmt.fmt.pix.bytesperline)
print("sizeimage:", fmt.fmt.pix.sizeimage)
fcntl.ioctl(vd, VIDIOC_S_FMT, fmt)  # set whatever default settings we got before

print(">> init mmap capture")
req = v4l2_requestbuffers()
req.type = V4L2_BUF_TYPE_VIDEO_CAPTURE
req.memory = V4L2_MEMORY_MMAP
req.count = 1  # nr of buffer frames
fcntl.ioctl(vd, VIDIOC_REQBUFS, req)  # tell the driver that we want some buffers 
print("req.count", req.count)

buffers = []

print(">>> VIDIOC_QUERYBUF, mmap, VIDIOC_QBUF")
for ind in range(req.count):
    # setup a buffer
    buf = v4l2_buffer()
    buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE
    buf.memory = V4L2_MEMORY_MMAP
    buf.index = ind
    fcntl.ioctl(vd, VIDIOC_QUERYBUF, buf)

    buf.buffer =  mmap.mmap(vd, buf.length, mmap.MAP_SHARED, mmap.PROT_READ | mmap.PROT_WRITE, offset=buf.m.offset)
    print("no {} buf {}".format(ind, repr(buf)))
    buffers.append(buf)

    # queue the buffer for capture
    fcntl.ioctl(vd, VIDIOC_QBUF, buf)

print(">> Start streaming")
buf_type = v4l2_buf_type(V4L2_BUF_TYPE_VIDEO_CAPTURE)
fcntl.ioctl(vd, VIDIOC_STREAMON, struct.pack('I', V4L2_BUF_TYPE_VIDEO_CAPTURE))#buf_type)

print(">> Capture image")
t0 = time.time()
max_t = 1
ready_to_read, ready_to_write, in_error = ([], [], [])
print(">>> select")
while len(ready_to_read) == 0 and time.time() - t0 < max_t:
    ready_to_read, ready_to_write, in_error = select.select([vd], [], [], max_t)

print(">>> download buffers")

for i in range(1000):
    buf = buffers[i % req.count]
    fcntl.ioctl(vd, VIDIOC_DQBUF, buf)  # get image from the driver queue
    mm = buffers[buf.index].buffer
    # view the mmap'd buffer as a 2176 x 3872 array of 16-bit pixels (no copy)
    frame = np.frombuffer(mm, dtype=np.uint16, count=3872 * 2176).reshape(2176, 3872)
    cv2.imshow('frame', frame)
    cv2.waitKey(1)
    fcntl.ioctl(vd, VIDIOC_QBUF, buf)  # requeue the buffer

print(">> Stop streaming")
fcntl.ioctl(vd, VIDIOC_STREAMOFF, buf_type)
os.close(vd)  # vd is a raw file descriptor from os.open(), so it has no close() method

The only problem is that I can’t get the full 30fps with my camera.
Do you know anything I could improve?
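
One thing that may be worth trying is to hand imshow() less data: crop the padded columns and convert to 8 bit before displaying. The sketch below is an illustration only. The function name is hypothetical, and bits_to_drop=4 assumes the 12 significant bits sit in the low bits of each 16-bit word, which the thread does not confirm.

```python
import numpy as np


def frame_to_preview(buffer, width=3856, padded_width=3872, height=2176,
                     bits_to_drop=4):
    """Return an 8-bit image of the visible area of one raw frame buffer."""
    pixels = np.frombuffer(buffer, dtype=np.uint16, count=padded_width * height)
    # Reshape using the padded line length, then cut off the padding columns.
    visible = pixels.reshape(height, padded_width)[:, :width]
    # Drop the low bits; clamp in case the assumed bit layout is wrong.
    return np.minimum(visible >> bits_to_drop, 255).astype(np.uint8)


# Tiny synthetic frame: 3 lines of 4 words each, of which 3 are visible.
demo = (np.arange(12, dtype=np.uint16) * 16).tobytes()
print(frame_to_preview(demo, width=3, padded_width=4, height=3))
```

In the capture loop this would be called as frame_to_preview(mm) in place of the reshape, since an mmap object can be passed to np.frombuffer() directly.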

Hi,
It is possible that the target performance cannot be reached if debayering is done on the CPU. You can run sudo jetson_clocks to run the CPU cores at their maximum clock for optimal throughput.

Hi,
Yes, but I don’t do any debayering yet; currently the images are displayed as raw (grayscale) images.
Debayering would be the next step.

Edit: At least in the Python code it is due to the resolution: if I resize the image to e.g. 968x544, the frame rate is fine.

Hi,
You can run sudo tegrastats to see the system load and check where the bottleneck is. Because the hardware ISP engine is not used, CPU capability appears to dominate the performance.

Hi,

If I run the Python script without resizing the image (8 MPix), the CPU load (1 core) rises to 80% (= low frame rate).
If I resize the image to 0.5 MPix before displaying it, the CPU load is about 15% (= normal frame rate).
So it seems that imshow() in Python OpenCV is the bottleneck, which is probably quite normal when the CPU is used to display 8 MPix.
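
A cheap way to get from 8 MPix to about 0.5 MPix is to take every fourth pixel in both directions with NumPy slicing, which creates a view rather than a resized copy. This is only a sketch with a hypothetical function name. Note that with an even step on a Bayer mosaic, all selected pixels come from the same colour site, so the preview shows a single channel.

```python
import numpy as np


def decimate(image: np.ndarray, step: int = 4) -> np.ndarray:
    """Keep every step-th pixel per axis; returns a view, not a copy."""
    return image[::step, ::step]


frame = np.zeros((2176, 3872), dtype=np.uint16)
preview = decimate(frame)
print(preview.shape)                      # (544, 968)
print(np.shares_memory(frame, preview))   # True
```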

Do you think opening the window with OpenGL would improve it? (I can’t test it at the moment because OpenCV was built without OpenGL support.)
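
Whether an OpenCV build has OpenGL support can be read from the text returned by cv2.getBuildInformation(), which contains an "OpenGL support:" line. The sketch below parses that text; the helper names are hypothetical, and cv2 is imported lazily so the parser itself runs without OpenCV. Whether an OpenGL window actually raises the frame rate is not something this thread establishes.

```python
def has_opengl_support(build_info: str) -> bool:
    """Return True if the build information reports 'OpenGL support: YES'."""
    for line in build_info.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "OpenGL support":
            return value.strip().upper().startswith("YES")
    return False


def opencv_has_opengl() -> bool:
    """Check the installed OpenCV build (requires the cv2 module)."""
    import cv2  # imported here so the parser above works without OpenCV
    return has_opengl_support(cv2.getBuildInformation())


sample = "  GUI:\n    GTK+:                        YES\n    OpenGL support:              NO\n"
print(has_opengl_support(sample))  # False
```

With a build that reports YES, the window can be created with cv2.namedWindow("demo", cv2.WINDOW_OPENGL).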

But what is the bottleneck with this command (1 core is at 100%)?

v4l2-ctl -d /dev/video0 --set-ctrl=bypass_mode=0, --set-fmt-video=width=3856,height=2176,pixelformat=RG12 --stream-mmap --stream-to=- | gst-launch-1.0 filesrc location=/dev/stdin blocksize=$(expr 3872 \* 2176 \* 2) ! 'video/x-raw,format=GRAY16_LE,width=3872,height=2176,framerate=30/1' ! queue ! videoconvert ! xvimagesink

Is it due to the I/O operations (--stream-to=- … filesrc location=/dev/stdin),
or is it due to the GRAY16 conversion (even though there should be no conversion, because as I understand it only the raw data should be displayed, i.e. each pixel with its corresponding brightness value)?
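
For scale, the amount of data this pipeline has to move can be worked out from the numbers in the command. This does not identify the bottleneck, but it shows what every stage has to sustain. Data sent through a pipe is copied at least once into the pipe and once out of it.

```python
FRAME_BYTES = 3872 * 2176 * 2   # padded width * height * 2 bytes per pixel
FPS = 30

bytes_per_second = FRAME_BYTES * FPS
frame_budget_ms = 1000 / FPS

print(FRAME_BYTES)                            # 16850944
print(f"{bytes_per_second / 1e6:.1f} MB/s")   # 505.5 MB/s
print(f"{frame_budget_ms:.1f} ms per frame")  # 33.3 ms per frame
```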

Hi,
If you would like to get frame data in RAW format, the processing runs on the CPU cores and no hardware blocks can be used. High CPU usage is expected, and performance is capped by CPU capability. We suggest using the ISP engine to get YUV420 for further processing.

Hi,
But I have a 12-bit RCCG Bayer sensor, so I have to do it in software. Or is there another way?

Hi,
You are correct that our camera hardware blocks do not support the 12-bit RCCG format, so you would need to do it in software. Alternatively, you can capture the frame data into a CUDA buffer and implement the processing in CUDA. This can shift the load from the CPU to the GPU.

For capturing frames into CUDA buffer, please try the sample:

/usr/src/jetson_multimedia_api/samples/18_v4l2_camera_cuda_rgb/

If you can successfully capture frame data by running the sample, you can then implement CUDA code for debayering.

Hi,

OK, thanks for the input.
I’m busy with some other things at the moment, but as soon as I get it working I’ll post the solution here.