Strange full-range RGB behavior on the Nvidia Jetson AGX Xavier Developer Kit

I am using an Nvidia Jetson AGX Xavier Developer Kit with JetPack 4.5.1 (L4T 32.5.1) and I want to transmit full-range RGB video over HDMI. I intend to do a special encoding at the bit level, so I need the video to remain untouched, i.e. “bit perfect”. I have set the color space to “RGB” and the color range to “Full” in xorg.conf. However, I cannot transmit the video without some errors in the RGB values.

I have been doing some tests with the following test pattern:


Note that this test pattern contains every color level from 0 to 255 for each color channel.
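
For reference, a pattern with this property can be generated with a few lines of Python. This is only a sketch (not necessarily the exact image attached above), assuming numpy and Pillow are available:

# Sketch: horizontal gradient bars covering every level 0-255 for the
# R, G, B and grey channels (not necessarily the exact pattern attached above).
import numpy as np
from PIL import Image

W, H = 1920, 1080
bar_h = H // 4                                        # four bars: R, G, B, grey

levels = (np.arange(W) * 256 // W).astype(np.uint8)   # map x coordinate -> level 0..255

img = np.zeros((H, W, 3), dtype=np.uint8)
img[0 * bar_h:1 * bar_h, :, 0] = levels               # red gradient bar
img[1 * bar_h:2 * bar_h, :, 1] = levels               # green gradient bar
img[2 * bar_h:3 * bar_h, :, 2] = levels               # blue gradient bar
img[3 * bar_h:, :, :] = levels[None, :, None]         # grey gradient bar

Image.fromarray(img).save("testColor.png")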

Capturing the output with a capture board, I have obtained this result:

The difference is that some specific color levels are received with a “1 unit” error. This happens in all color channels (R, G and B) and in all the color gradient bars of the image. The image below shows the errors (captured value minus expected value) for each color level from 0 to 255 (the behavior is the same for all color channels):

So there are two values in the lower part of the range with an error of 1 unit, and a set of values in the upper part of the range (excluding 255) with an offset of 1 unit.
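
For anyone interested in reproducing this kind of plot, the comparison is essentially the following (a sketch only; it assumes the captured still and the expected pattern have the same resolution and are pixel-aligned, and the file names are placeholders):

# Sketch of the per-level comparison behind the plot: for every expected level
# 0-255, print the drift (captured minus expected) observed in the capture.
# Assumes expected.png and captured.png (placeholder names) have the same
# resolution and are pixel-aligned.
import numpy as np
from PIL import Image

expected = np.asarray(Image.open("expected.png").convert("RGB"), dtype=np.int16)
captured = np.asarray(Image.open("captured.png").convert("RGB"), dtype=np.int16)

for level in range(256):
    mask = expected == level                     # every pixel/channel at this level
    if mask.any():
        drift = np.unique(captured[mask] - level)
        if np.any(drift != 0):
            print(f"level {level:3d}: drift {drift.tolist()}")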

Note that this is not a matter of precision loss caused by the video being compressed/decompressed through the HDMI RGB “limited range”. Limited range has already been tried, and it produces larger errors spread much more widely over the entire range than the ones shown here (which occur with full range).

When connecting the HDMI output of the Xavier to a monitor, it is possible (but not easy) to see the errors.

I have tested the HDMI output with several monitors and capture boards. The results have been similar.

Has anyone had a similar problem, or does anyone have a hint about what may be happening?

Is there any configuration to fix this, or anything else that can be tried?

Hello,

I don’t think anyone has reported such an issue before. Could you check whether my understanding below is correct and answer my questions?

  1. The full-range function is working, but some pixel values show a 1-unit drift.
  2. How did you find this issue? What is your test setup?
  3. What is the reproduction rate of this issue? You said the errors do not seem easy to reproduce.
  4. Do you see the same behavior on other platforms, such as an x86 host machine?

The full-range function is working, but some pixel values show a 1-unit drift.

A: Correct. I can see differences depending on whether full or limited range is set (via xorg.conf or RandR) on the Xavier. With limited range I see the expected precision errors introduced by compressing and decompressing the range. With full range there are some drifts, but far fewer; those errors certainly could NOT be caused by limited range, or there would be many more of them.

How did you find this issue? What is your test setup?

A: I discovered this issue while trying to encode an additional color channel (an alpha, actually) in the LSBs of the R, G and B channels. I transmitted the HDMI output of the Nvidia Jetson AGX Xavier to a capture board while displaying the test pattern in full screen on the Xavier. With the capture board I can grab a video still, analyze its pixel values numerically and generate plots like the one shown in the previous post. With a screen, by visual inspection, I can see the “defects” at the same color levels that appear in the plot. The “defects” are perceived on a screen as irregular steps in the gradient bar (in the same places seen in the plot).
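
For context, the LSB packing I had in mind is roughly like the following sketch (hypothetical helper names, not my actual encoder); it should make clear why even a 1-unit drift on any channel destroys the hidden data:

# Rough sketch of the kind of LSB packing that motivates the "bit perfect"
# requirement (hypothetical, not the actual encoder): 3 alpha bits are hidden
# in the least significant bits of R, G and B, so a +/-1 drift on any channel
# corrupts them.
import numpy as np

def pack_alpha_lsb(rgb: np.ndarray, alpha3: np.ndarray) -> np.ndarray:
    """rgb: HxWx3 uint8 image; alpha3: HxW uint8 carrying 3 useful bits (0..7)."""
    out = rgb & 0xFE                      # clear the LSB of every channel
    out[..., 0] |= (alpha3 >> 2) & 1      # alpha bit 2 -> R LSB
    out[..., 1] |= (alpha3 >> 1) & 1      # alpha bit 1 -> G LSB
    out[..., 2] |= alpha3 & 1             # alpha bit 0 -> B LSB
    return out

def unpack_alpha_lsb(rgb: np.ndarray) -> np.ndarray:
    """Recover the 3 alpha bits; only correct if the RGB values arrive unmodified."""
    return ((rgb[..., 0] & 1) << 2) | ((rgb[..., 1] & 1) << 1) | (rgb[..., 2] & 1)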

What is the reproduction rate of this issue? You said the errors do not seem easy to reproduce.

A: The issue is always reproducible. It is just not easy to see with the naked eye on a screen, because that evaluation relies on visual inspection and may depend on the screen settings: it must be a good screen with a good range. When looking at the numeric pixel values captured with a video capture card, the drift is always there; it is reproducible. The difficulty with capture cards is that many of the ones I’ve tried do not seem to support full range. I was able to capture full range only with a Datapath card in a desktop and with a TX2 mounted on an Auvidea J130 carrier board, which has an HDMI video input. The issue occurs with both capture methods and shows exactly the same value drifts.

Do you see the same behavior on other platforms, such as an x86 host machine?

A: I have also tested the outputs of Nvidia and Intel graphics cards from common laptops and desktops, using full range. With them, no drifts are perceptible on screens, and there are no drifts/errors in any part of the range when checking the numerical values captured with a capture board.

Note: The image below is the capture from the Datapath capture board shown in my previous post, but I have added some yellow lines to highlight the image columns where the errors are visible.

Hello,

I notice you mention an Auvidea board here. Is it possible to try this on the NV devkit?

Thank you.

Note that the Auvidea board and the TX2 were used ONLY as a capture device.
The captured values are exactly equal to the ones captured by a Datapath card mounted on a desktop PC, which (together with the drifts visible on screens) points to an issue in the Xavier HDMI output rather than a capture-input issue.

Oh, sorry. I forgot this is an Xavier issue. I will check this with the internal team.

Hi,

Sorry for one more question here.

Is the location where the error happens always the same? In your picture you labeled the yellow lines so we could locate the error. Is it always the same columns that hit this issue? I mean, will it change if you reboot the device and capture the result again?

Also, how do we reproduce this issue locally? Any specific commands?

Can we use your color bar image here with an image viewer app on our AGX devkit?

Hello,

Is the location where the error happens always the same? In your picture you labeled the yellow lines so we could locate the error. Is it always the same columns that hit this issue? I mean, will it change if you reboot the device and capture the result again?

A: The errors always occur in the same columns of the pattern. Actually, the errors always happen at those specific values, in all color channels, regardless of the position in the image. The errors never change, even after rebooting the system.

Also, how do we reproduce this issue locally? Any specific commands?
Can we use your color bar image here with an image viewer app on our AGX devkit?

A: To reproduce the errors, you must edit /etc/X11/xorg.conf to set the color space to RGB and the color range to Full. See below the changes I have made to the /etc/X11/xorg.conf file:

# Copyright (c) 2017, NVIDIA CORPORATION.  All Rights Reserved. 
# 
# This is the minimal configuration necessary to use the Tegra driver. 
# Please refer to the xorg.conf man page for more configuration 
# options provided by the X server, including display-related options 
# provided by RandR 1.2 and higher. 

# Disable extensions not useful on Tegra. 
Section "Module" 
    Disable     "dri" 
    SubSection  "extmod" 
    Option  "omit xfree86-dga" 
    EndSubSection 
EndSection

Section "Device" 
    Identifier  "Tegra0" 
    Driver      "nvidia" 
    Option      "AllowEmptyInitialConfiguration" "true" 
    Option      "ColorSpace"     "RGB" 
    Option      "ColorRange"     "Full" 
EndSection 

Then you can use the test color bar image I shared in the first post (the first one, “testColor”). You must open it with a viewer that does not alter the test pattern when displaying it, for example gpicview. To open the image, follow the steps below:

  1. Right-click on the image and press “Open With Other Application”.
  2. Select the Image Viewer (gpicview) as shown in the image below:
  3. After opening the image with the viewer, put it in fullscreen mode.

I notice the method you use to enable full range seems different from our previous method.

    Section "Screen"
          Identifier "Default Screen"
          Monitor "HDMI-0"
     Option "ColorRange" "Full"
     EndSection

Can you try this too?
Though this may not resolve the problem, I just want to align on the method we are using here.

Thank you.

I have just changed the method of enabling full range based on your suggestion.
The output shows the same errors; this does not solve the problem.

Hi everybody,

I’ve been working with the OP on this issue, and I’ve come up with a test image that can be used to evaluate the Xavier. I think it may help “visualize” the problem, although I admit it could add more confusion to the thread instead.

One of the isolated errors we are getting is at value 15 (on the 0-255 scale), which is being converted to value 14 by the Xavier full-range HDMI output.
To demonstrate that, I created a picture with a gray background of value 14 (i.e., RGB 14,14,14) and then put a star shape with value 13 on the left and a star with value 15 on the right:

The difference between the foreground shape colors and the background is so faint that the shapes are only visible when viewing the picture in fullscreen (with no other brighter elements on the screen) and in a dark room.
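
For anyone who wants to recreate it, an equivalent image can be produced with something like this sketch (discs instead of stars, for simplicity):

# Sketch of an equivalent test image (discs instead of stars): grey-14
# background, a grey-13 shape on the left and a grey-15 shape on the right.
import numpy as np
from PIL import Image

W, H, r = 1920, 1080, 200
y, x = np.mgrid[0:H, 0:W]

img = np.full((H, W), 14, dtype=np.uint8)
img[(x - W // 4) ** 2 + (y - H // 2) ** 2 < r ** 2] = 13       # left shape (13)
img[(x - 3 * W // 4) ** 2 + (y - H // 2) ** 2 < r ** 2] = 15   # right shape (15)

Image.fromarray(img, mode="L").convert("RGB").save("grey_13_14_15.png")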

An exaggeration of what can be seen is depicted here:

expected_vs_XavierHDMI

While on a standard laptop or desktop (with a good screen) it is possible to perceive both the left and right shapes, on the Xavier the drift of value 15 to 14 makes the right-side shape invisible.

Note: this is only meaningful in full range. When using limited range the compression deviations may happen in different parts of the range and result in different outcomes.

This method seems helpful here. Let me try it with my Xavier.

Thanks for sharing.

Hi @cdsousa and @diogoervazp4eup ,

Sorry, we tried 3 monitors today but it is not very easy to see the star on the right-hand side.

Just to confirm: when the Xavier does not enable full range, we should see both stars on the screen, right? Maybe our test environment is not good enough to check this…

What kind of monitor are you using there to observe the stars? Also, the “expected” image here has the contrast adjusted so that we can easily see them, right?

@WayneWWW , thank you very much for your efforts, we really appreciate it!

As far as I know, from what @diogoervazp4eup told me, with HDMI limited range on the Xavier it is the opposite: one cannot see the left shape, which just means that the gray levels 13 and 14 are “compressed” to the same value, but not gray 15. In that case the color depth loss is expected, but I’m not sure whether this behavior is the same across devices.

To tell whether the Xavier (or another device) is transmitting in limited or full range, we usually use the gradient bar test picture (the first one @diogoervazp4eup posted). In limited range there are irregular steps/bands perceptible across the entire gradient bar (or, on some devices, we get a not fully dark black and a not fully bright white at the gradient extremes).

Also, the “expected” image here has the contrast adjusted so that we can easily see them, right?

Yes

Regarding the monitor, we are able to spot the shapes on almost all screens as long as the room is dark, so our eyes adapt better to the dark range. To first check what is supposed to be visible in the picture, I’d recommend using a laptop screen, as those usually use the full 24-bit color depth and can thus reproduce all 3 grays. Then try an external screen connected to a common laptop/desktop, verify that both shapes are seen, and then try with the Xavier.

Here are photos of what I see on my laptop screen and on an external monitor connected to my laptop.
@diogoervazp4eup , can you please try taking a photo of the picture we are getting on a screen connected to the Xavier (in full range)?

The image below shows what we see on a screen connected to the Xavier (in full range).


with HDMI limited range on the Xavier it is the opposite: one cannot see the left shape, which just means that the gray levels 13 and 14 are “compressed” to the same value, but not gray 15. In that case the color depth loss is expected, but I’m not sure whether this behavior is the same across devices.

Looks like this is different from our case. Thus, it may differ between devices…

I think using a capture card or analyzer may be more reliable here. We will try to see if we can get one.

BTW, does this issue happen at some middle value, not at the edges near 16 or 235? If you make the star at such a value, maybe we can avoid the behavior that varies between devices.

Yes, I agree: with a capture card we can be sure about the numerical values.
Since we were getting the value errors with the capture cards, we tried using screens and test images, and ended up being able to “visualize” those errors on the screens too.

I wouldn’t worry too much about the limited range case, since some modification of color values is expected there and may differ between devices. The real issue is with full range, where I think it is reasonable to expect “bit perfect” matching between the video frames being sent and the video frames being received.

BTW, does this issue happen at some middle value, not at the edges near 16 or 235? If you make the star at such a value, maybe we can avoid the behavior that varies between devices.

The errors happen at, and only at, the following values (as depicted by the plot in the first post):

  • The value 12 is changed to 11
  • The value 15 is changed to 14
  • Any value between 212 and 254 (inclusive) is changed to the next value (for instance, 212 becomes 213, 213 becomes 214, and so on up to 254, which becomes 255)

All other values are transmitted correctly.
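
For completeness, the mapping above can be written as a small helper (this only restates the measurements; it is handy for checking a captured frame automatically):

# The observed Xavier full-range drift, restated as a lookup: 12 -> 11, 15 -> 14,
# 212..254 -> value + 1, everything else unchanged.
def observed_drift(v: int) -> int:
    if v in (12, 15):
        return v - 1
    if 212 <= v <= 254:
        return v + 1
    return v

lut = [observed_drift(v) for v in range(256)]    # full 256-entry lookup table
assert lut[12] == 11 and lut[15] == 14 and lut[254] == 255 and lut[255] == 255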

Hi @WayneWWW, I kind of have good news.

I was messing around with kernel and driver parameters in /sys/ (exploring an unrelated thing, “how to force an EDID”) and I found that toggling these two parameters,
/sys/class/graphics/fb0/device/cmu_enable to OFF and /sys/kernel/debug/tegradc.common/tegra_win.0/degamma/force_user_degamma to ON, completely fixes the issues we presented before.
We tested with a capture card and the values are passed “bit perfect” through the HDMI.

To switch the flags I did:

echo 0 | sudo tee /sys/class/graphics/fb0/device/cmu_enable
echo 1 | sudo tee /sys/kernel/debug/tegradc.common/tegra_win.0/degamma/force_user_degamma

Now the issue is that I have no idea what I’m doing 😅
My guess is that this has something to do with gamma correction, but beyond that I can’t even find out what CMU stands for.

Can you, @WayneWWW, or someone else explain what these flags do, and perhaps offer an idea of why the default behavior shows the problems we faced?

Thanks