Underflow and freezing nvdisplay arise on our board with two HDMI ports

Hi,

Our carrier board has two HDMI ports.
When I connect two displays to the board, underflow and an nvdisplay freeze sometimes occur.
I tried changing displays and editing settings, but couldn’t fix it completely.
How can I fix it, and/or how can I recover from the freeze without a restart?

The conditions and procedures which lead to freezing are:

  • Conditions
  • Carrier board: "HDMI-0" uses SOR and nvdisplay@15200000. "HDMI-1" uses SOR1 and nvdisplay@15210000
  • Displays: 1920x1080 and 3840x2160
  • SampleRootFS, Uboot in BSP rel-28.2
  • Our devicetree based on BSP rel-28.2
  • added Option "TegraReserveDisplayBandwidth" "false" to /etc/X11/xorg.conf
  • $ nvpmodel -m 0
  • $ ./jetson_clocks.sh
  • Procedures
  1. start TX2
  2. unplug the cables from the board
  3. plug the cables into the HDMI ports simultaneously
  4. HDMI-1 sometimes freezes, regardless of whether it is connected to the FHD or the 4K display

The logs captured when nvdisplay freezes are attached (some device names have been modified).

Thanks in advance.
syslog.txt (204 KB)
UART.txt (92.5 KB)
Xorg.0.log.txt (20.4 KB)
trace.txt (18 MB)

Hi okotar,
Our customer board also has two HDMI ports, one connected to DP0 and the other to DP1, but neither of them works.
The system log shows this message:
“tegradc 15210000.nvdisplay: sanitize_flip_args: WIN 3 invalid size:w=0,h=0,out_w=0,out_h=0”

According to the dmesg output, it looks like the two of you are hitting different errors.

okotar,

I noticed that your HDMI-1 monitor shows the error messages below.

[  162.039591] tegradc 15210000.nvdisplay: blank - powerdown
[  162.045007] tegradc 15210000.nvdisplay: unblank
[  162.045019] PD DISP2 index4 UP
[  162.045550] Parent Clock set for DC plld2
[  162.051062] tegradc 15210000.nvdisplay: hdmi: pclk:594000K, set prod-setting:prod_c_600M
[  163.101179] tegradc 15210000.nvdisplay: unblank
[  163.177441] tegradc 15210000.nvdisplay: dc_poll_register 0x41: timeout
[  163.183977] tegradc 15210000.nvdisplay: dc timeout waiting for DC to stop
[  164.241443] tegradc 15210000.nvdisplay: dc_poll_register 0x41: timeout
[  164.247978] tegradc 15210000.nvdisplay: dc timeout waiting for DC to stop
[  165.401445] tegradc 15210000.nvdisplay: dc_poll_register 0x41: timeout
[  165.407979] tegradc 15210000.nvdisplay: dc timeout waiting for DC to stop
[  166.529441] tegradc 15210000.nvdisplay: dc_poll_register 0x41: timeout
[  166.535979] tegradc 15210000.nvdisplay: dc timeout waiting for DC to stop
[  167.102068] tegradc 15210000.nvdisplay: hdmi: scdc scrambling status is reset, trying to reconfigure.
[  167.681441] tegradc 15210000.nvdisplay: dc_poll_register 0x41: timeout
[  167.687980] tegradc 15210000.nvdisplay: dc timeout waiting for DC to stop
[  168.761441] tegradc 15210000.nvdisplay: dc_poll_register 0x41: timeout
[  168.767979] tegradc 15210000.nvdisplay: dc timeout waiting for DC to stop
[  169.845443] tegradc 15210000.nvdisplay: dc_poll_register 0x41: timeout
[  169.851980] tegradc 15210000.nvdisplay: dc timeout waiting for DC to stop
[  170.929442] tegradc 15210000.nvdisplay: dc_poll_register 0x41: timeout
[  170.935978] tegradc 15210000.nvdisplay: dc timeout waiting for DC to stop
[  172.013439] tegradc 15210000.nvdisplay: dc_poll_register 0x41: timeout
[  172.019979] tegradc 15210000.nvdisplay: dc timeout waiting for DC to stop
[  172.158058] tegradc 15210000.nvdisplay: hdmi: scdc scrambling status is reset, trying to reconfigure.
[  173.097441] tegradc 15210000.nvdisplay: dc_poll_register 0x41: timeout
[  173.103975] tegradc 15210000.nvdisplay: dc timeout waiting for DC to stop
[  177.214043] tegradc 15210000.nvdisplay: hdmi: scdc scrambling status is reset, trying to reconfigure.
[  182.270060] tegradc 15210000.nvdisplay: hdmi: scdc scrambling status is reset, trying to reconfigure.
[  187.326055] tegradc 15210000.nvdisplay: hdmi: scdc scrambling status is reset, trying to reconfigure.

Is your HDMI-1 successfully detected?

We have seen this kind of issue before. It is caused by a mode that some monitors do not support. Could you try replacing the current HDMI-1 monitor with another HDMI monitor as a test? To be more precise, please try to find a monitor whose default mode is not 4k.
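
As a rough aid for comparing logs, the recurring messages in an excerpt like the one above can be tallied per display controller. The following Python sketch is only an illustration: the regular expression and the signature names (`dc_poll_timeout`, `dc_stop_timeout`, `scdc_reset`) are assumptions derived from the lines quoted in this thread, not part of any NVIDIA tool.

```python
import re
from collections import Counter

# Line shape assumed from the excerpt in this thread, e.g.
# "[  163.177441] tegradc 15210000.nvdisplay: dc_poll_register 0x41: timeout"
LINE_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+tegradc\s+(\S+?):\s+(.*)$")

# Hypothetical labels for the three messages seen in the posted log.
SIGNATURES = {
    "dc_poll_timeout": "dc_poll_register 0x41: timeout",
    "dc_stop_timeout": "dc timeout waiting for DC to stop",
    "scdc_reset": "scdc scrambling status is reset",
}


def summarize(lines):
    """Count each known message per tegradc device.

    Returns a Counter keyed by (device, label).
    """
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line.strip())
        if not match:
            continue  # not a tegradc line
        device, message = match.group(2), match.group(3)
        for label, needle in SIGNATURES.items():
            if needle in message:
                counts[(device, label)] += 1
    return counts
```

For example, passing the lines of a saved kernel log to `summarize()` gives one count per device and message, which makes it easier to see whether only 15210000.nvdisplay is affected.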

WayneWWW,

> Is your HDMI-1 successfully detected?
In the logs I posted previously, HDMI-0 and HDMI-1 were both detected successfully the first time (from the beginning of boot to 149 s in the logs).
After I unplugged the cables and plugged them in again, HDMI-1 wasn’t detected (after 150 s).

> It is caused by a mode that some monitors do not support.
I’m not sure about that, but I sometimes see xrandr turn off the outputs to both HDMI ports (you can see this in the logs attached to this post).
I can recover from it with xrandr commands, and I get a GUI message on Ubuntu like the one below:

Could not switch the monitor Configuration
could not set the configuration for CRTC 395
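
The exact xrandr commands used for the recovery are not given in this post. As a hedged illustration only, the Python sketch below re-enables every output that `xrandr --query` reports as connected by passing `--auto` for each of them. The helper names are hypothetical, and whether this recovers a display that has hit the kernel-side timeout is not established in this thread.

```python
import subprocess


def connected_outputs(xrandr_query_text):
    """Return the names of outputs that `xrandr --query` lists as connected."""
    names = []
    for line in xrandr_query_text.splitlines():
        parts = line.split()
        # Output lines look like "HDMI-0 connected ..." or "HDMI-1 disconnected ..."
        if len(parts) >= 2 and parts[1] == "connected":
            names.append(parts[0])
    return names


def build_reenable_command(outputs):
    """Build one xrandr call that turns each output on in its preferred mode."""
    command = ["xrandr"]
    for name in outputs:
        command += ["--output", name, "--auto"]
    return command


def reenable_all():
    """Query xrandr and re-enable all connected outputs (needs a running X session)."""
    query = subprocess.run(
        ["xrandr", "--query"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    ).stdout
    outputs = connected_outputs(query)
    if outputs:
        subprocess.run(build_reenable_command(outputs), check=True)
    return outputs
```

Calling `reenable_all()` from within the X session would place all outputs at the default position; an option such as `--right-of` can be appended per output if an extended desktop is wanted.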

> Could you try replacing the current HDMI-1 monitor with another HDMI monitor as a test? To be more precise, please try to find a monitor whose default mode is not 4k.
I tried two FHD monitors, but it still occurs. I have attached the logs.
Xorg.0.log.txt (51.6 KB)
UART.txt (165 KB)
syslog.txt (1.63 MB)

Hi okotar,

It looks like the error only happens on tegradc 15210000.nvdisplay, which is HDMI-1.

I would like to know whether a single display (only HDMI-0 or only HDMI-1) hits this error. Could you test that?

Let’s focus on the easier case first.

Hi WayneWWW,

I tested with a single display, and the freezing error didn’t occur.

  • HDMI-0 with FHD, plug/unplugging x100 times
  • HDMI-1 with FHD, plug/unplugging x100 times
  • HDMI-0 with 4k, plug/unplugging x100 times
  • HDMI-1 with 4k, plug/unplugging x100 times

Additionally, I tested plugging/unplugging either display under dual-display conditions.

  - HDMI-0 with FHD and HDMI-1 with FHD
    • plug/unplugging HDMI-0 x100 times
    • plug/unplugging HDMI-1 x100 times
  - HDMI-0 with FHD and HDMI-1 with 4k
    • plug/unplugging HDMI-0 x100 times
    • plug/unplugging HDMI-1 x100 times
  - HDMI-0 with 4k and HDMI-1 with FHD
    • plug/unplugging HDMI-0 x100 times
    • plug/unplugging HDMI-1 x100 times
  - HDMI-0 with 4k and HDMI-1 with 4k
    • plug/unplugging HDMI-0 x100 times
    • plug/unplugging HDMI-1 x100 times

None of these patterns hit the error, but I found another set of conditions and procedures that leads to freezing:

  • Conditions
  • Carrier board: "HDMI-0" uses SOR and nvdisplay@15200000. "HDMI-1" uses SOR1 and nvdisplay@15210000
  • Displays: "HDMI-0 with FHD and HDMI-1 with 4k" or "HDMI-0 with 4k and HDMI-1 with 4k"
  • SampleRootFS, Uboot in BSP rel-28.2
  • Our devicetree based on BSP rel-28.2
  • added Option "TegraReserveDisplayBandwidth" "false" to /etc/X11/xorg.conf
  • $ nvpmodel -m 0
  • $ ./jetson_clocks.sh
  • Procedures
  1. start TX2
  2. unplug the cable of HDMI-1 from the board
  3. wait until the HDMI-1 display moves into power saving mode
  4. plug the cable back into the HDMI-1 port
  5. HDMI-1 sometimes freezes

okotar,

May I ask what HDMI-1 refers to in your previous comment? It sounds like a physical HDMI monitor rather than an HDMI port on Tegra. Is that right?

If it turns out that HDMI-1 may have some problem, could you try HDMI-1 on the NVIDIA devkit first?

Since we cannot guarantee that your hardware design is totally correct, please verify your monitor on the NVIDIA devkit first so that a hardware issue does not affect our debugging.

If both monitors work well on the NVIDIA devkit, we can then move back to your carrier board.

BTW, when you talk about HDMI-1 with 4k and HDMI-1 with FHD, are you talking about changing the mode on the same monitor?

WayneWWW,

> May I ask what HDMI-1 refers to in your previous comment? It sounds like a physical HDMI monitor rather than an HDMI port on Tegra. Is that right?
No, by HDMI-1 I meant an HDMI port on our carrier board. Sorry for the unclear explanation.
The carrier board has two HDMI ports, and I call them the HDMI-0 port and the HDMI-1 port.
Let me summarize and clarify my previous comments:
=====
The error only occurs on tegradc 15210000.nvdisplay, which is the HDMI-1 port, in the cases below.

I reported the following case in #1 and #2:

[Case A]

  • Conditions
  • SampleRootFS, Uboot in BSP rel-28.2
  • Our devicetree based on BSP rel-28.2
  • added Option "TegraReserveDisplayBandwidth" "false" to /etc/X11/xorg.conf
  • $ nvpmodel -m 0
  • $ ./jetson_clocks.sh
  • Connecting ports and displays
  • HDMI-0 port connected to an FHD display and HDMI-1 port connected to a 4k display
  • or
  • HDMI-0 port connected to an FHD display and HDMI-1 port connected to an FHD display
  • Procedures
  1. start TX2
  2. unplug the cables from the board
  3. plug the cables to the HDMI ports simultaneously
  4. HDMI-1 port sometimes freezes, regardless of whether it is connected to an FHD display or a 4k display

and another case was found later in #6:

[Case B]

  • Conditions
  • SampleRootFS, Uboot in BSP rel-28.2
  • Our devicetree based on BSP rel-28.2
  • added Option "TegraReserveDisplayBandwidth" "false" to /etc/X11/xorg.conf
  • $ nvpmodel -m 0
  • $ ./jetson_clocks.sh
  • Connecting ports and displays
  • HDMI-0 port connected to an FHD display and HDMI-1 port connected to a 4k display
  • or
  • HDMI-0 port connected to a 4k display and HDMI-1 port connected to a 4k display
  • Procedures
  1. start TX2
  2. unplug the cable connected to the HDMI-1 port from the board
  3. wait until the 4k display unplugged from the HDMI-1 port moves into power saving mode
  4. plug the cable back into the HDMI-1 port
  5. HDMI-1 port sometimes freezes

=====

> If it turns out that HDMI-1 may have some problem, could you try HDMI-1 on the NVIDIA devkit first?
As I mentioned above, HDMI-1 is the port on our carrier board.
I also tested three different 4k displays with our carrier board, and the error still occurred, so the displays are unlikely to be the cause.

> BTW, when you talk about HDMI-1 with 4k and HDMI-1 with FHD, are you talking about changing the mode on the same monitor?
No, I use native FHD monitors and native 4k monitors.

OK, that is much clearer now.

It sounds like it is always the HDMI-1 port that has the error. How about the single-port use case (HDMI-1 only)? Would that hit the error?
If the single port also has the error, could you make sure it is not a hardware design problem?

If you believe it definitely matches the OEM product design guide, please share your board schematics.

However, if the single port has no problem, please let me know.

okotar,

Any update?

WayneWWW,

> How about the single-port use case (HDMI-1 only)? Would that hit the error?
I tried the HDMI-1-only case.
When I quickly and repeatedly plugged and unplugged the 4k monitor, it hit the error.

So we’re investigating our carrier board design.
Additionally, we’re trying to verify monitors on nvidia devkit again.

Thanks for your patience.

okotar,

Any update for this issue?

WayneWWW,

I found that the devkit also hits the error when I quickly and repeatedly plug/unplug HDMI.

  • Conditions
  • BSP R28.2
  • Devkit
  • added Option "TegraReserveDisplayBandwidth" "false" to /etc/X11/xorg.conf
  • $ nvpmodel -m 0
  • $ ./jetson_clocks.sh
  • Monitors
    Three different 4k-monitors and one FHD monitor.
    All monitors hit the error.

  • Procedures

  1. start TX2
  2. repeat plugging/unplugging at intervals of a few seconds
  3. HDMI sometimes freezes

I attached logs.
c_4k-monitor.txt (97.5 KB)
b_4k-monitor.txt (104 KB)
d_fhd-monitor.txt (93.4 KB)

okotar,

Thanks for the update.

1. Could you share which monitors you are using for the above test?
In fact, the error “hdmi: scdc scrambling status is reset, trying to reconfigure.” was once reported by some customers on rel-28.1. It turned out that the cause was monitors that are incompatible with Tegra. The workaround is listed in our document “L4T document → Kernel Customization → Display Configuration and bringup → Hard-coding kernel display boot mode for HDMI”. Could you take a look at that section?

2. Is quick hotplug a necessary step for reproducing this issue?

I mean, if there is no quick hotplug, would the 4k monitor still hit underflow and the “dc_poll_register 0x41: timeout” error?

WayneWWW,

> 1. Could you share which monitors you are using for the above test?

Two of the three 4k monitors are listed below:

  • acer et322qk
  • eizo ev2785

Neither of them is listed in “Hard-coding kernel display boot mode for HDMI”, and the third one isn’t listed there either.
As for the error “hdmi: scdc scrambling status is reset, trying to reconfigure”, it didn’t appear with the FHD monitor (see the log “d_fhd-monitor.txt” in my previous post).

> 2. Is quick hotplug a necessary step for reproducing this issue?

Since my previous post, I have found that quick hotplug isn’t a necessary step.
Unplugging HDMI at a certain timing triggers the error.
I made a reproducing script. Could you try it on your devkit?

  • Conditions
  • TX2, BSP R28.2, Devkit
  • added Option "TegraReserveDisplayBandwidth" "false" to /etc/X11/xorg.conf
  • sudo nvpmodel -m 0
  • sudo ./jetson_clocks.sh
  • Connected a monitor via HDMI
  • installed "sleepenh" package for the script
  • Procedures
sudo su
bash hdmi-toggling-tester_devkit.sh 1.10 1
  • Tested Monitors
  • acer et322qk
  • eizo ev2785
  • another 4k monitor
  • FHD monitor
  • Notes
  • The script uses the "sleepenh" command, and Linux can't execute it with precise timing, so you might need to try several times
  • You might need to change "1.10" to "1.09", "1.11", etc.
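
The attached script itself is not reproduced here. Purely as an assumption about what a timing-sensitive toggle loop could look like, the Python sketch below switches one output off and on through xrandr at a fixed interval, scheduling each step against an absolute deadline so that delays do not accumulate (the same idea as `sleepenh`). The output name, the use of xrandr, and the parameters `interval_s` and `cycles` are all hypothetical and may not match the attached script or trigger the same failure.

```python
import subprocess
import time


def toggle_commands(output):
    """Return the xrandr argument lists that switch one output off, then on."""
    off_cmd = ["xrandr", "--output", output, "--off"]
    on_cmd = ["xrandr", "--output", output, "--auto"]
    return off_cmd, on_cmd


def toggle_loop(output, interval_s, cycles,
                run=subprocess.check_call,
                clock=time.monotonic,
                sleep=time.sleep):
    """Toggle `output` off/on `cycles` times, one step every `interval_s` seconds.

    Each step is scheduled against an absolute deadline, so the time a
    command takes to run is absorbed instead of adding drift.
    `run`, `clock`, and `sleep` can be replaced for a dry run.
    """
    off_cmd, on_cmd = toggle_commands(output)
    deadline = clock()
    for _ in range(cycles):
        for command in (off_cmd, on_cmd):
            run(command)
            deadline += interval_s
            remaining = deadline - clock()
            if remaining > 0:
                sleep(remaining)
```

A call such as `toggle_loop("HDMI-0", 1.10, 10)` would need a running X session and an output that is really named `HDMI-0`; passing `run=print` instead shows the commands without executing them.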

I also attached a log with FHD monitor.

FHD_log.txt (94 KB)
hdmi-toggling-tester_devkit.sh.txt (1.02 KB)

Hi okotar,

Are you saying that this is not related to a specific monitor? What this script does is also similar to frequent hotplugging. I will try this script with my monitors.

Please note that the list in the L4T document is just a list of monitors that “nvidia have” or “customer reported”, so a monitor not being on the list does not mean it will work.

The “hdmi: scdc scrambling status is reset, trying to reconfigure” message will only show for a 4k monitor. Some monitors on the list have also hit this error before, which is why I guess your monitor suffers from a similar problem.

okotar,

We can reproduce the issue with your script now. We are still investigating.

okotar,

I just tried a larger interval in your script and found that the issue cannot be reproduced.

We need to check whether this request is reasonable. Does your use case need to turn the display on/off or hotplug it this frequently?

This thread has become very long and the synopsis has changed many times, so I just want to double-check your actual use case.

WayneWWW,

No, our use case doesn’t need frequent hotplugging.

Our carrier board with two HDMI ports hits the same error even in ordinary use cases.
One of those use cases is posted in #8, and I suspect that other ordinary use cases can also hit the error.
I think the cause of the issue on the devkit and the cause of the issue on our board are the same.

So I’d like to ask you:

  • how to prevent the error
  • how to recover from the error without TX2 restart (if the error is unpreventable)
  • how to detect the error and restart TX2 automatically (if the error is unpreventable and unrecoverable)
  • what the mechanism of the error is. I have to explain to our team why we can/can’t prevent/recover from the error.
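
No detection method has been confirmed in this thread. As one possible starting point for the third question, the Python sketch below flags a display controller as frozen once the `dc timeout waiting for DC to stop` message has appeared a given number of times within a short window of kernel timestamps. The class name, the threshold of 3, and the 5 second window are assumptions chosen from the roughly one-second spacing seen in the posted log, not values recommended by NVIDIA.

```python
import re

# Message text taken from the kernel log posted earlier in this thread.
TIMEOUT_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+tegradc\s+(\S+?):\s+dc timeout waiting for DC to stop"
)


class FreezeDetector:
    """Hypothetical helper: report a device after repeated DC stop timeouts."""

    def __init__(self, threshold=3, window_s=5.0):
        self.threshold = threshold  # timeouts needed to report a freeze
        self.window_s = window_s    # kernel-time window in seconds
        self._hits = {}             # device name -> recent timestamps

    def feed(self, line):
        """Process one kernel log line.

        Returns the device name when it reaches the threshold, else None.
        """
        match = TIMEOUT_RE.search(line)
        if not match:
            return None
        timestamp = float(match.group(1))
        device = match.group(2)
        # Keep only the hits that are still inside the window.
        recent = [t for t in self._hits.get(device, [])
                  if timestamp - t <= self.window_s]
        recent.append(timestamp)
        self._hits[device] = recent
        if len(recent) >= self.threshold:
            return device
        return None
```

Lines could be fed from a followed kernel log (for example the output of `dmesg --follow`), and the caller decides what to do when `feed()` returns a device name, such as logging the event or requesting a reboot.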

The script is just for reproducing the error on the devkit regardless of the monitor.
Ideally, I hope the issue triggered by the script will be fixed completely, but I also understand that it’s an unusual use case.

okotar,

The problem here is that the way the issue is reproduced on the devkit is not the same as on your carrier board.

In #8, it looks like just a hotplug. Could you reproduce this issue on the devkit using the same method as in #8?