HDMI and DisplayPort (dual display mode) bandwidth issue on JetPack 4.2 (kernel 4.9.140-tegra)

Both happen: the display crashes, and the CPU dies.

When there is a log, the CPU is still alive and only the display is broken.
In that case there is a kernel error log like the one we sent.

When the CPU is dead, nothing responds, as if everything has stopped (and there is no log).
(The screen looks like a still photo …)

I think it is a scheduling problem (is it called the memory arbiter?) in how the bus masters, the CPU and the GPU, share memory bandwidth.
(It is similar to an issue I had when I was developing an SoC in the past.)

bk1472,

I think we should narrow down the use case to find the exact way to reproduce the issue.

Are these two issues the same? Please note that in the first comment you said you didn’t do anything.
However, now you say there was a camera application plus YouTube running, and it seems the UART log shows some errors. This is not the behavior you described in the beginning. Are you still using two displays in this case?

Are you still able to interact with the system through the UART console when the display is out, in both cases?

Sorry for confusing you at first.
I also tried to reproduce it yesterday, but it did not reproduce reliably.
I’m sorry, I did not remember correctly.

When I used audio and video resources on the TX2, I hit the two issues (CPU or GPU hang) relatively frequently.

When I left the system idle, it took much longer to happen or did not happen at all.


It suddenly occurred to me: could the problem be related to power consumption?


Thank you.
Regards
BK

We really cannot help you if there is no (full) error log. Please confirm what the actual error is and check whether you can capture any log.

I think you could try more tests with different combinations.

For example, single monitor versus dual monitor, or with an application running versus no application.
Check whether the error symptoms are the same.

OK, I will try.

I think it will take a long time to reproduce the symptom.

Hello Mr. WayneWWW

I have now started trying to reproduce the problem.
The three test cases are shown below.

Common setup:
  1) nvpmodel -m 0
  2) run /usr/bin/jetson_clocks

(1) Connect the HDMI monitor and the DP monitor, and leave the system idle (no app running)

(2) Connect the HDMI monitor and the DP monitor, launch YouTube (Chromium web browser), and run the camera app

(3) Connect only the HDMI monitor, launch YouTube (Chromium web browser), and run the camera app

If there are any meaningful results after the test, we will let you know.

Also, did you analyze the log (message #19) that I posted before?
(The log from when the HDMI output blacked out.)

Here is the log again.
[42671.334857] hrtimer: interrupt took 126944 ns
[67487.615996] nvgpu: 17000000.gp10b __nvgpu_timeout_expired_msg_cpu: 94 [ERR] Timeout detected @ nvgpu_0
[67487.631760] nvgpu: 17000000.gp10b _nvgpu_timeout_expired_msg_cpu: 94 [ERR] Timeout detected @ nvgpu
[67487.783563] nvgpu: 17000000.gp10b __nvgpu_timeout_expired_msg_cpu: 94 [ERR] Timeout detected @ nvgpu_0
[67487.799198] nvgpu: 17000000.gp10b _nvgpu_timeout_expired_msg_cpu: 94 [ERR] Timeout detected @ nvgpu
[67487.949303] nvgpu: 17000000.gp10b __nvgpu_timeout_expired_msg_cpu: 94 [ERR] Timeout detected @ nvgpu_0
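
For anyone comparing logs between test runs, a small script can pull the nvgpu timeout lines out of a saved dmesg and report when they started. This is only a sketch: the function names `nvgpu_timeouts` and `summarize` are made up for illustration, and it assumes the log uses the default `[seconds.microseconds]` dmesg timestamp format seen above.

```python
import re

# Matches dmesg lines of the form "[67487.615996] message text".
LINE_RE = re.compile(r"^\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<msg>.*)$")


def nvgpu_timeouts(lines):
    """Return timestamps (seconds since boot) of nvgpu timeout errors."""
    stamps = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        msg = m.group("msg")
        if msg.startswith("nvgpu:") and "Timeout detected" in msg:
            stamps.append(float(m.group("ts")))
    return stamps


def summarize(lines):
    """Summarize how many timeouts were logged and when they began."""
    stamps = nvgpu_timeouts(lines)
    if not stamps:
        return None
    return {
        "count": len(stamps),
        "first_s": stamps[0],
        "last_s": stamps[-1],
        "uptime_hours_at_first": stamps[0] / 3600.0,
    }
```

Run against the six lines above, this counts five timeout messages within about a third of a second, the first one roughly 18.7 hours after boot. The hrtimer line is ignored because it does not come from nvgpu.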

It is common for any issue in the GPU driver (gk20a/gp10b) to cause HDMI to blank. To analyze this issue, we need to know how to reproduce it, so please test that first.

How to reproduce the problem is described in message #19.

It does not make sense to keep asking the same questions.

bk1472,

Are there any extra findings from #26?

(1) Connect the HDMI monitor and the DP monitor, and leave the system idle (no app running)
→ Does this one hit the GPU timeout error, or any other error?

(2) Connect the HDMI monitor and the DP monitor, launch YouTube (Chromium web browser), and run the camera app
→ It sounds like this one would reproduce the issue from #19. If so, could you share which camera app you are running?

(3) Connect only the HDMI monitor, launch YouTube (Chromium web browser), and run the camera app
→ How about this one?

Also, do you have extra TX2 modules to run the test on?

I couldn’t find any other messages.

The error message is displayed only once, and the system could not recover until reboot.

I used a D435 (Intel RealSense depth camera).
You can download the software (GitHub - IntelRealSense/librealsense: Intel® RealSense™ SDK).

I use realsense-viewer (connected via USB 3.0).

The results for each case number are:
[1] - The issue occurred a week after I started testing.
[2] - The issue occurred a day after I started testing.
[3] - The issue did not occur.

Thanks
Regards
BK.

bk1472,

Do both cases (1 and 2) show the same issue (GPU timeout)? Could you share the full dmesg?

I just got a CPU error message!

Please see the message below.

[ 1255.033412] CPU4: SError detected, daif=140, spsr=0x60000045, mpidr=80000102, esr=bf40c000
[ 1255.042276] CPU3: SError detected, daif=140, spsr=0x60000045, mpidr=80000101, esr=bf00c002
[ 1255.042298] CPU0: SError detected, daif=140, spsr=0x60000045, mpidr=80000100, esr=bf40c000
[ 1255.042330] CPU5: SError detected, daif=140, spsr=0x60000045, mpidr=80000103, esr=bf40c000
[ 1255.042404] ROC:IOB Machine Check Error:
[ 1255.042793] CPU2: SError detected, daif=140, spsr=0x40000145, mpidr=80000001, esr=be000000
[ 1255.042882] CPU1: SError detected, daif=140, spsr=0x40000145, mpidr=80000000, esr=be000000
[0000.250] I> Welcome to MB2(TBoot-BPMP)(version: 01.00.160913-t186-M-00.00-mobile-1b47d5d9)
[0000.258] I> bit @ 0xd480000
[0000.261] I> Boot-device: eMMC
[0000.265] I> sdmmc bdev is already initialized
[0000.270] I> pmic: reset reason (nverc) : 0x0
[0000.276] I> Found 16 partitions in SDMMC_BOOT (instance 3)
[0000.284] I> Found 31 partitions in SDMMC_USER (instance 3)
[0000.290] I> A/B: bin_type (16) slot 0
[0000.293] I> Loading partition bpmp-fw at 0xd7800000
[0000.298] I> Reading two headers - addr:0xd7800000 blocks:1
[0000.304] I> Addr: 0xd7800000, start-block: 58777608, num_blocks: 1
[0000.319] I> Binary(16) of size 532656 is loaded @ 0xd7800000
[0000.324] I> A/B: bin_type (17) slot 0
[0000.328] I> Loading partition bpmp-fw-dtb at 0xd79f0000
[0000.333] I> Reading two headers - addr:0xd79f0000 blocks:1
[0000.339] I> Addr: 0xd79f0000, start-block: 58780024, num_blocks: 1
[0000.353] I> Binary(17) of size 466112 is loaded @ 0xd798e000
[0000.531] I> Loading SCE-FW …
[0000.534] I> A/B: bin_type (12) slot 0
[0000.538] I> Loading partition sce-fw at 0xd7300000
[0000.542] I> Reading two headers - addr:0xd7300000 blocks:1
[0000.548] I> Addr: 0xd7300000, start-block: 58782024, num_blocks: 1
[0000.557] I> Binary(12) of size 125168 is loaded @ 0xd7300000
[0000.563] I> Init SCE
[0000.565] I> Loading APE-FW …
[0000.568] I> A/B: bin_type (11) slot 0
[0000.572] I> Loading partition adsp-fw at 0xd7400000
[0000.577] I> Reading two headers - addr:0xd7400000 blocks:1
[0000.582] I> Addr: 0xd7400000, start-block: 58761224, num_blocks: 1
[0000.591] I> Binary(11) of size 106240 is loaded @ 0xd7400000
[0000.597] I> Copy BTCM section
[0000.600] I> A/B: bin_type (13) slot 0
[0000.604] I> Loading partition cpu-bootloader at 0x96000000
[0000.609] I> Reading two headers - addr:0x96000000 blocks:1
[0000.615] I> Addr: 0x96000000, start-block: 58740744, num_blocks: 1
[0000.626] I> Binary(13) of size 275920 is loaded @ 0x96000000
[0000.632] I> A/B: bin_type (20) slot 0
[0000.636] I> Loading partition bootloader-dtb at 0x8520f400
[0000.641] I> Reading two headers - addr:0x8520f400 blocks:1
[0000.646] I> Addr: 0x8520f400, start-block: 58742792, num_blocks: 1
[0000.659] I> Binary(20) of size 344128 is loaded @ 0x8520f400
[0000.665] I> A/B: bin_type (14) slot 0
[0000.668] I> Loading partition secure-os at 0x8530f600
[0000.673] I> Reading two headers - addr:0x8530f600 blocks:1
[0000.679] I> Addr: 0x8530f600, start-block: 58744840, num_blocks: 1
[0000.688] I> Binary(14) of size 83360 is loaded @ 0x8530f600
[0000.695] I> boot profiler @ 0x275844000
[0000.700] I> Unhalting SCE
[0000.702] I> Primary Memory Start:80000000 Size:70000000
[0000.707] I> Extended Memory Start:f0110000 Size:1856f0000
[0000.714] I> MB2(TBoot-BPMP) done
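
The `esr=` values in the SError lines can be split into their architectural fields, which makes it easier to compare cores. The sketch below assumes only the standard ARMv8-A ESR_ELx layout (EC in bits [31:26], IL in bit 25, ISS in bits [24:0]) and the MPIDR affinity fields (Aff0 in bits [7:0], Aff1 in bits [15:8]). The function names are hypothetical, and it does not try to interpret the implementation-defined part of the syndrome.

```python
import re

# ARMv8-A ESR_ELx: EC = bits [31:26], IL = bit 25, ISS = bits [24:0].
EC_SERROR = 0x2F  # exception class value for an SError interrupt

SERROR_RE = re.compile(
    r"CPU(?P<cpu>\d+): SError detected, .*?"
    r"mpidr=(?P<mpidr>[0-9a-fA-F]+), esr=(?P<esr>[0-9a-fA-F]+)"
)


def decode_esr(esr):
    """Split a 32-bit ESR value into EC, IL, ISS and the SError IDS bit."""
    iss = esr & 0x1FFFFFF
    return {
        "ec": (esr >> 26) & 0x3F,
        "il": (esr >> 25) & 0x1,
        "iss": iss,
        # For an SError, ISS bit 24 (IDS) set means the remaining ISS bits
        # are implementation defined.
        "ids": (iss >> 24) & 0x1,
    }


def parse_serrors(lines):
    """Decode every 'SError detected' line found in an iterable of log lines."""
    decoded = []
    for line in lines:
        m = SERROR_RE.search(line)
        if not m:
            continue
        fields = decode_esr(int(m.group("esr"), 16))
        mpidr = int(m.group("mpidr"), 16)
        fields["cpu"] = int(m.group("cpu"))
        fields["aff0"] = mpidr & 0xFF          # core number within the cluster
        fields["aff1"] = (mpidr >> 8) & 0xFF   # cluster number
        decoded.append(fields)
    return decoded
```

For the log above, every line decodes to EC 0x2F (SError interrupt). The cores with Aff1 = 1 report an implementation-defined syndrome (IDS = 1), while the two cores with Aff1 = 0 report ISS = 0, so the meaning of the nonzero values would have to come from NVIDIA.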

Hi bk1472,

Thanks for sharing. Do both displays go into power-save mode during the week-long test?

Sorry, nothing yet!

I’ll try more!

One more question.
Could it be related to power consumption? (Overheating?)
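
One way to check the overheating idea during a long run is to record the temperatures the kernel exposes through the generic Linux thermal sysfs interface, so the last reading before a hang is on disk. This is a rough sketch with hypothetical function names. It assumes the zones appear under `/sys/class/thermal/thermal_zone*` with `type` and `temp` files (temperature in millidegrees Celsius), and it does not assume any particular zone names on the TX2.

```python
import glob
import os


def read_thermal_zones(root="/sys/class/thermal"):
    """Return {zone_type: temperature_in_celsius} for every readable zone."""
    temps = {}
    for zone in sorted(glob.glob(os.path.join(root, "thermal_zone*"))):
        try:
            with open(os.path.join(zone, "type")) as f:
                name = f.read().strip()
            with open(os.path.join(zone, "temp")) as f:
                millideg = int(f.read().strip())
        except (OSError, ValueError):
            continue  # skip zones that cannot be read right now
        temps[name] = millideg / 1000.0
    return temps


def format_sample(uptime_s, temps):
    """Format one log line, e.g. '[12.0] zone-a=45.5C zone-b=50.0C'."""
    parts = ["%s=%.1fC" % (name, value) for name, value in sorted(temps.items())]
    return "[%.1f] %s" % (uptime_s, " ".join(parts))
```

Calling `read_thermal_zones()` every few seconds and appending `format_sample(...)` to a file (flushing after each write) would show whether the temperature was climbing just before the display or the CPU stopped.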

thanks
regards

BK

Hi,

Please contact NVIDIA sales to escalate your issue if it is still not resolved. This issue seems to require a long-running test on a custom carrier board, so I cannot share much on the forum.