How to decode raw images saved with v4l2-ctl?

Board: Nvidia Jetson Xavier NX
L4T version: 32.5.1

Hello, I have saved a raw image with the following command:

v4l2-ctl -d /dev/video2 --set-fmt-video=width=1920,height=1080,pixelformat=RG12 --stream-count=1 --stream-mmap --stream-to=dev2 --verbose

Here is the output:

VIDIOC_QUERYCAP: ok
VIDIOC_G_FMT: ok
VIDIOC_S_FMT: ok
Format Video Capture:
	Width/Height      : 1920/1080
	Pixel Format      : 'RG12'
	Field             : None
	Bytes per Line    : 3840
	Size Image        : 4147200
	Colorspace        : sRGB
	Transfer Function : Default (maps to sRGB)
	YCbCr/HSV Encoding: Default (maps to ITU-R 601)
	Quantization      : Default (maps to Full Range)
	Flags             : 
VIDIOC_REQBUFS: ok
VIDIOC_QUERYBUF: ok
VIDIOC_QBUF: ok
VIDIOC_QUERYBUF: ok
VIDIOC_QBUF: ok
VIDIOC_QUERYBUF: ok
VIDIOC_QBUF: ok
VIDIOC_QUERYBUF: ok
VIDIOC_QBUF: ok
VIDIOC_STREAMON: ok
	Index    : 1
	Type     : Video Capture
	Flags    : mapped
	Field    : None
	Sequence : 1
	Length   : 4147200
	Bytesused: 4147200
	Timestamp: 39203.709475s (Monotonic, End-of-Frame)

VIDIOC_STREAMOFF: ok

I also checked the format that v4l2 reports, like so:

$ v4l2-ctl -d /dev/video0 --list-formats-ext

ioctl: VIDIOC_ENUM_FMT
	Index       : 0
	Type        : Video Capture
	Pixel Format: 'RG12'
	Name        : 12-bit Bayer RGRG/GBGB
		Size: Discrete 1920x1080
			Interval: Discrete 0.033s (30.000 fps)

And here is the beginning (LSB first) of the first line (3840 bytes) of the raw image:

00000001 00010001
10000001 00010001
10000001 00010000
10000001 00010001   <- 16-bit word
10000001 00010000
00000001 00010001
00000001 00010001
00000001 00010001
00000001 00010001
00000001 00010001
00000001 00010001
00000001 00010001
00000001 00010001
00000001 00010001
00000001 00010001
00000001 00010001
00000001 00010001
00000001 00010001
00000001 00010001
00000001 00010001

The file in question: dev1.raw (4.0 MB)
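For reference, the dump above was produced with a small script along these lines (a sketch; it prints the bytes in file order, so each row is one 16-bit word as stored on disk):

# Print the first 20 16-bit words of the capture, byte for byte in file
# order, to inspect the bit layout.
with open("dev1.raw", "rb") as f:
    data = f.read(40)  # 20 words of 2 bytes each

for i in range(0, len(data), 2):
    print(f"{data[i]:08b} {data[i+1]:08b}")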

I thought that this format was only 12 bits, so I expected every 16-bit word to be padded with 4 zeroes, but as you can see some words clearly use all 16 bits.

I saw the following answer: “V4l2 returning 14-bit raw instead of 12-bit. What is going on?”, but it doesn’t help me because in my case the words use all 16 bits.

Can you help me understand what is going on here, so I can debayer the image?

Thanks for your help!

Hi,

The VI subsystem stores pixels in a specific data format that you may need to take into account. The NVIDIA BSP provides a tool (nvraw) to strip the additional data from the image buffer and obtain a true RAW buffer. Some documentation: Argus NvRaw Tool — Jetson Linux Developer Guide documentation

There are also related topics that you may want to check out: Questions regarding VI's raw data pixel formatting

Hope this info helps,

Jafet Chaves,
Embedded SW Engineer at RidgeRun
Contact us: support@ridgerun.com
Developers wiki: https://developer.ridgerun.com/
Website: www.ridgerun.com

Hello @jchaves and thanks for your answer.

I would like to bypass Argus entirely and avoid ‘nvargus_nvraw’: the goal is to validate that some faulty cameras are indeed faulty before their data is piped into the Tegra/camera core/Argus pipeline.

Also, the link you provided only says that the RG12 format used by the NX is not documented, not that it is actually the mysterious ‘nvraw’ format that Nvidia uses.

So my question still stands: how do I decode and visualize raw camera data without ISP processing?

Thanks!

Well, the value of the link I provided lies in the fact that information about the data format can be found in the TRM (technical reference manual) for the Xavier series. The link also mentions some samples that you could use as a reference for performing debayering with CUDA.

To display the RAW bayer or debayered frames you could use GStreamer or vooya.
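For a quick look at the buffer, something along these lines may work (an untested sketch; rawvideoparse from gst-plugins-bad treats the padded Bayer data as 16-bit grayscale, which is enough to inspect the frame):

gst-launch-1.0 filesrc location=dev1.raw ! rawvideoparse width=1920 height=1080 format=gray16-le ! videoconvert ! autovideosink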

In case you are looking for alternatives to the NVIDIA Argus software stack, there is currently support for debayering using the GstCUDA element; more information about it can be found at: GstCUDA - cudadebayer - RidgeRun Developer Connection

Hey @jchaves, and thanks for your answer!

In the link you provided, it’s explicitly said that this format is not documented:

To display the RAW bayer or debayered frames you could use GStreamer or vooya.

As far as I know, neither GStreamer nor Vooya supports padded 12-bit Bayer raw images.

In case you are looking for alternatives to the NVIDIA Argus software stack, there is currently support for debayering using the GstCUDA element.

I might be interested if it supports this particular format, but it seems it is not available yet.


Can someone from Nvidia chime in and give a clear description of the bytes I see in this file? @JerryChang @ShaneCCC?

It’s clearly not any of the layouts described in those posts.

It is also not

d11 d10 d09 d08 d07 d06 d05 d04 d03 d02 d01 d00 d11 d10 d09 d08

because of the following pixel:

00001111 11111110

If the four MSBs were replicated into the low bits, the low nibble (1110) would have to match d11–d08 (0000), which it doesn't. Or am I missing something?

Hi,

You are conflating several aspects here. Let me share some comments for your use case:

  1. You are indeed correct that RAW16 does not seem to be documented/supported. But that format is not what you are capturing. As you shared, you are capturing 12-bit data that, due to how NVIDIA's raw memory layout works, is actually stored in 16-bit words. Some references on why this is the case, from past posts:
    V4l2 returning 14-bit raw instead of 12-bit. What is going on?
    Raw Bayer GRBG 12-bit Image Data from CSI-2
    Display bayer CSI camera output without ISP - #7 by RS64

  2. As far as I know there is no specific issue whatsoever with this format in GStreamer or vooya; it is a matter of what you are looking for with these tools. With vooya, sure, the image may look strange (if you do not shift bits and/or eliminate the duplicated bits), but it should be possible to display the RAW frame anyway. The specific information about the RAW frame layout should be in the TRM, as I pointed out. With GStreamer it is a similar story: you could create your own plugin to demosaic the RAW frames, or use GstCUDA for example.

For Xavier and Orin they are programmed to T_R16 instead of T_R16_I.


Hey @jchaves and @ShaneCCC, and thanks for your answers!

Indeed I’m starting to get pretty confused about all that.

Can you confirm that this is my case, then?

Where the first 12 bits of each 16-bit word contain the data and the remaining 4 bits can be discarded (right shift by 4)?

Yes, I would agree with ShaneCCC's information. You could try shifting the data 4 bits to the right for each pixel.
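Something along these lines, for instance (a sketch in Python/NumPy; it assumes 1920x1080 little-endian 16-bit words with the sample in the upper 12 bits):

import numpy as np

# Load the capture as 16-bit little-endian words and drop the 4 LSBs,
# leaving 12-bit Bayer samples in the range 0..4095.
raw16 = np.fromfile("dev1.raw", dtype="<u2").reshape(1080, 1920)
bayer12 = raw16 >> 4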

Just as an additional suggestion: if your image sensor supports it, you could try setting a test pattern, which makes it easier to verify that your data processing is correct.
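For example, many sensor drivers expose a test-pattern control through V4L2 (the control name and values depend on the driver, so check the listing first):

# List the controls the driver exposes, then enable a test pattern if available.
v4l2-ctl -d /dev/video2 -L
v4l2-ctl -d /dev/video2 --set-ctrl test_pattern=1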

Hey @jchaves and @ShaneCCC

Thanks for your help, I managed to get a decoded image. I discarded the 4 LSBs and that worked.

I also realized that the file had its bytes swapped, probably an endianness issue?

I also had to tune the gain and exposure.

In the end I get an image that looks like this:

It’s progress, but the image is very washed out. Do you have any idea what could cause this issue?
I used OpenCV and cv2.COLOR_BayerRG2RGB to debayer the image, as sketched below.
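Roughly, the whole pipeline looks like this (a sketch; the byte swap and the RG Bayer phase are what worked for my sensor and may differ for other setups):

import cv2
import numpy as np

WIDTH, HEIGHT = 1920, 1080

# Read the raw capture, fix the swapped bytes, and keep the 12 significant bits.
raw = np.fromfile("dev1.raw", dtype=np.uint16).reshape(HEIGHT, WIDTH)
raw = raw.byteswap()
bayer12 = raw >> 4

# Debayer; scale the 12-bit data up to the 16-bit range so the PNG is viewable.
rgb = cv2.cvtColor(bayer12 << 4, cv2.COLOR_BayerRG2RGB)
cv2.imwrite("decoded.png", rgb)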

Thanks !


First, excellent job getting the demosaic working on your own. Good stuff! Would you consider sharing the code with the community? Someone may stumble upon a similar need in the future.

Now, for the image result: typically it is the ISP (through appropriate tuning) that takes care of all that postprocessing, so that you can obtain very clean images. Since you are completely bypassing it, you may need to consider several other factors:

  1. The type of lens, and mounted (or missing) filters on your camera.
  2. The need to add white balance, contrast, tone mapping, denoising, etc. to your demosaic processing pipeline (see the sketch after this list).
  3. The color channel gains in your sensor configuration.
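As an illustration of point 2, a minimal gray-world white balance plus gamma step could look like this (a sketch with hypothetical values, just to show where such steps would slot in):

import cv2
import numpy as np

# Start from the debayered image; work in float to avoid clipping.
rgb = cv2.imread("decoded.png", cv2.IMREAD_UNCHANGED).astype(np.float32)

# Gray-world white balance: scale each channel so its mean matches the global mean.
means = rgb.reshape(-1, 3).mean(axis=0)
rgb *= means.mean() / means

# A simple 2.2 gamma to lift the midtones; real tuning would use the sensor's curve.
rgb = np.clip(rgb / rgb.max(), 0.0, 1.0) ** (1.0 / 2.2)
cv2.imwrite("balanced.png", (rgb * 255.0).astype(np.uint8))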
