R35.2.1 Xavier NX tegra194_cbb_err_isr

We are using a Xavier NX with L4T R35.2.1, and the kernel occasionally Oopses during boot. The relevant boot log is below. Can you tell us what might be causing this?

[    6.744468] hub 2-0:1.0: 4 ports detected
[    6.799133] CPU:0, Error: cbb-noc@2300000, irq=15
[    6.809259] **************************************
[    6.809449] CPU:0, Error:cbb-noc
[    6.809574]  Error Logger            : 0
[    6.809713]  ErrLog0                 : 0x80030000
[    6.809846]    Transaction Type      : RD  - Read, Incrementing
[    6.810012]    Error Code            : SLV
[    6.810122]    Error Source          : Target
[    6.810252]    Error Description     : Target error detected by CBB slave
[    6.810483]    AXI2APB_5 bridge error: RDFIFOF - Read Response FIFO Full interrupt
[    6.810777]    Packet header Lock    : 0
[    6.810915]    Packet header Len1    : 3
[    6.811033]    NOC protocol version  : version >= 2.7
[    6.811179]  ErrLog1                 : 0x35162e
[    6.811344]  ErrLog2                 : 0x0
[    6.812313]    RouteId               : 0x35162e
[    6.815815]    InitFlow              : ccroc_p2ps/I/ccroc_p2ps
[    6.820800]    Targflow              : host1x_p2pm/T/host1x_p2pm
[    6.825354]    TargSubRange          : 11
[    6.828846]    SeqId                 : 0
[    6.831393]  ErrLog3                 : 0x30124
[    6.834885]  ErrLog4                 : 0x0
[    6.837529]    Address accessed      : 0x155f0124
[    6.842061]  ErrLog5                 : 0xb89f851
[    6.845218]    Non-Modify            : 0x1
[    6.848627]    AXI ID                : 0x17
[    6.851780]    Master ID             : CCPLEX
[    6.855358]    Security Group(GRPSEC): 0x7e
[    6.859305]    Cache                 : 0x1 -- Bufferable
[    6.863505]    Protection            : 0x2 -- Unprivileged, Non-Secure, Data Access
[    6.870326]    FALCONSEC             : 0x0
[    6.873734]    Virtual Queuing Channel(VQC): 0x0
[    6.878373]  **************************************
[    6.883205] ------------[ cut here ]------------
[    6.887657] kernel BUG at drivers/soc/tegra/cbb/tegra194-cbb.c:2057!
[    6.894227] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[    6.899738] Modules linked in:
[    6.902715] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.10.104-tegra #2
[    6.909348] Hardware name: Unknown NVIDIA Jetson Xavier NX Developer Kit/NVIDIA Jetson Xavier NX Developer Kit, BIOS r35.0-716592d-dirty 04/17/2023
[    6.922391] pstate: 60400089 (nZCv daIf +PAN -UAO -TCO BTYPE=--)
[    6.928470] pc : tegra194_cbb_err_isr+0x19c/0x1b0
[    6.932908] lr : tegra194_cbb_err_isr+0x11c/0x1b0
[    6.937640] sp : ffff800010003df0
[    6.941296] x29: ffff800010003df0 x28: 0000000000000001
[    6.946452] x27: 0000000000000080 x26: ffffb03266d50698
[    6.952135] x25: ffffb032676d0f40 x24: 0000000000000001
[    6.957576] x23: ffffb03267036000 x22: ffffb032674de920
[    6.962641] x21: 000000000000000f x20: 0000000000000005
[    6.968409] x19: ffffb032674de910 x18: 0000000000000010
[    6.973832] x17: 0000000000000007 x16: 000000000000000e
[    6.978915] x15: ffffb03267352bf0 x14: 0720072007200720
[    6.984684] x13: 0720072007200720 x12: 0720072007200720
[    6.989851] x11: 0720072007200720 x10: 0720072007200720
[    6.995624] x9 : 0720072007200720 x8 : 07200720072a072a
[    7.000906] x7 : 072a072a072a072a x6 : c0000000ffffefff
[    7.006645] x5 : 0000000000057fa8 x4 : ffffb03267367968
[    7.011813] x3 : 00000000ffffffff x2 : ffffb032657de170
[    7.017407] x1 : ffffb03267352680 x0 : 0000000100010001
[    7.022496] Call trace:
[    7.024945]  tegra194_cbb_err_isr+0x19c/0x1b0
[    7.029506]  __handle_irq_event_percpu+0x68/0x2a0
[    7.034035]  handle_irq_event_percpu+0x40/0xa0
[    7.038242]  handle_irq_event+0x50/0xf0
[    7.042009]  handle_fasteoi_irq+0xc0/0x170
[    7.046035]  generic_handle_irq+0x40/0x60
[    7.050055]  __handle_domain_irq+0x70/0xd0
[    7.054260]  efi_header_end+0xb0/0xf0
[    7.058013]  el1_irq+0xd0/0x180
[    7.061004]  cpuidle_enter_state+0xb8/0x410
[    7.065185]  cpuidle_enter+0x40/0x60
[    7.068690]  call_cpuidle+0x44/0x80
[    7.072185]  do_idle+0x208/0x270
[    7.075341]  cpu_startup_entry+0x2c/0x70
[    7.079380]  rest_init+0xdc/0xe8
[    7.082619]  arch_call_rest_init+0x18/0x20
[    7.086543]  start_kernel+0x514/0x54c
[    7.090148] Code: a9446bf9 a94573fb a8c67bfd d65f03c0 (d4210000)
[    7.096118] ---[ end trace 41aa00d45e1301e8 ]---
[    7.100810] Kernel panic - not syncing: Oops - BUG: Fatal exception in interrupt
[    7.108588] SMP: stopping secondary CPUs
[    7.112279] Kernel Offset: 0x303255620000 from 0xffff800010000000
[    7.118559] PHYS_OFFSET: 0xffffd30ac0000000
[    7.122678] CPU features: 0x8240002,03802a30
[    7.126879] Memory Limit: none
[    7.130300] ---[ end Kernel panic - not syncing: Oops - BUG: Fatal exception in interrupt ]---

Please upgrade to r35.3.1. This is a known issue with display.

We upgraded L4T to R35.3.1, but the kernel still occasionally Oopses during boot. Can you tell us what might be the cause?

[    6.651130] CPU:0, Error: cbb-noc@2300000, irq=15
[    6.651322] **************************************
[    6.651506] CPU:0, Error:cbb-noc
[    6.651633]  Error Logger            : 0
[    6.651786]  ErrLog0                 : 0x80030000
[    6.651906]    Transaction Type      : RD  - Read, Incrementing
[    6.652068]    Error Code            : SLV
[    6.652222]    Error Source          : Target
[    6.652325]    Error Description     : Target error detected by CBB slave
[    6.652572]    AXI2APB_5 bridge error: RDFIFOF - Read Response FIFO Full interrupt
[    6.652808]    Packet header Lock    : 0
[    6.652933]    Packet header Len1    : 3
[    6.653081]    NOC protocol version  : version >= 2.7
[    6.653281]  ErrLog1                 : 0x35162a
[    6.656355]  ErrLog2                 : 0x0
[    6.658807]    RouteId               : 0x35162a
[    6.662301]    InitFlow              : ccroc_p2ps/I/ccroc_p2ps
[    6.667032]    Targflow              : host1x_p2pm/T/host1x_p2pm
[    6.672099]    TargSubRange          : 11
[    6.675079]    SeqId                 : 0
[    6.677879]  ErrLog3                 : 0x30124
[    6.681373]  ErrLog4                 : 0x0
[    6.684040]    Address accessed      : 0x155f0124
[    6.688465]  ErrLog5                 : 0xa89f851
[    6.691732]    Non-Modify            : 0x1
[    6.695372]    AXI ID                : 0x15
[    6.698525]    Master ID             : CCPLEX
[    6.701595]    Security Group(GRPSEC): 0x7e
[    6.706051]    Cache                 : 0x1 -- Bufferable
[    6.710251]    Protection            : 0x2 -- Unprivileged, Non-Secure, Data Access
[    6.717075]    FALCONSEC             : 0x0
[    6.720225]    Virtual Queuing Channel(VQC): 0x0
[    6.724862]  **************************************
[    6.729694] ------------[ cut here ]------------
[    6.734410] kernel BUG at drivers/soc/tegra/cbb/tegra194-cbb.c:2057!
[    6.740713] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[    6.745967] Modules linked in:
[    6.749225] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.10.104-tegra #1
[    6.755840] Hardware name: Unknown NVIDIA Jetson Xavier NX Developer Kit/NVIDIA Jetson Xavier NX Developer Kit, BIOS 3.1-32827747 03/19/2023
[    6.768104] pstate: 60400089 (nZCv daIf +PAN -UAO -TCO BTYPE=--)
[    6.774172] pc : tegra194_cbb_err_isr+0x19c/0x1b0
[    6.778858] lr : tegra194_cbb_err_isr+0x11c/0x1b0
[    6.783611] sp : ffff800010003df0
[    6.787275] x29: ffff800010003df0 x28: 0000000000000001
[    6.792681] x27: 0000000000000080 x26: ffffd497cc39e240
[    6.798106] x25: ffffd497cccfaf68 x24: 0000000000000001
[    6.803541] x23: ffffd497cc687000 x22: ffffd497ccb1ec50
[    6.808885] x21: 000000000000000f x20: 0000000000000005
[    6.814381] x19: ffffd497ccb1ec40 x18: 0000000000000010
[    6.819551] x17: 0000000000000007 x16: 000000000000000e
[    6.824891] x15: ffffd497cc992bf0 x14: 0720072007200720
[    6.830714] x13: 0720072007200720 x12: 0720072007200720
[    6.835750] x11: 0720072007200720 x10: 0720072007200720
[    6.841507] x9 : 0720072007200720 x8 : 07200720072a072a
[    6.846755] x7 : 072a072a072a072a x6 : c0000000ffffefff
[    6.852267] x5 : 0000000000057fa8 x4 : ffffd497cc9a7968
[    6.857949] x3 : 00000000ffffffff x2 : ffffd497cae0e170
[    6.863286] x1 : ffffd497cc992680 x0 : 0000000100010001
[    6.868627] Call trace:
[    6.871093]  tegra194_cbb_err_isr+0x19c/0x1b0
[    6.875134]  __handle_irq_event_percpu+0x68/0x2a0
[    6.879832]  handle_irq_event_percpu+0x40/0xa0
[    6.884377]  handle_irq_event+0x50/0xf0
[    6.887913]  handle_fasteoi_irq+0xc0/0x170
[    6.891908]  generic_handle_irq+0x40/0x60
[    6.895930]  __handle_domain_irq+0x70/0xd0
[    6.900396]  efi_header_end+0xb0/0xf0
[    6.903890]  el1_irq+0xd0/0x180
[    6.906624]  cpuidle_enter_state+0xb8/0x410
[    6.910806]  cpuidle_enter+0x40/0x60
[    6.914312]  call_cpuidle+0x44/0x80
[    6.917807]  do_idle+0x208/0x270
[    6.921475]  cpu_startup_entry+0x30/0x70
[    6.925270]  rest_init+0xdc/0xe8
[    6.928238]  arch_call_rest_init+0x18/0x20
[    6.932419]  start_kernel+0x514/0x54c
[    6.935773] Code: a9446bf9 a94573fb a8c67bfd d65f03c0 (d4210000)
[    6.942012] ---[ end trace c6f72bf3142fa518 ]---
[    6.946684] Kernel panic - not syncing: Oops - BUG: Fatal exception in interrupt
[    6.954207] SMP: stopping secondary CPUs
[    6.958415] Kernel Offset: 0x5497bac50000 from 0xffff800010000000
[    6.964437] PHYS_OFFSET: 0xffffee5900000000
[    6.968298] CPU features: 0x8240002,03802a30
[    6.973012] Memory Limit: none
[    6.976199] ---[ end Kernel panic - not syncing: Oops - BUG: Fatal exception in interrupt ]---

We designed our own carrier board. Could that affect the boot of the Xavier NX core module?
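To compare the two CBB error dumps above field by field, something like the following sketch could help. It assumes the dump keeps the "[timestamp] key : value" layout seen in this thread; the function names are made up for illustration and are not part of any NVIDIA tool. The sample lines are copied from the two logs.

```python
import re

# Matches lines such as "[    6.837529]    Address accessed      : 0x155f0124"
FIELD_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(?P<key>[^:]+?)\s*:\s*(?P<value>.+?)\s*$"
)

def parse_cbb_fields(log_text):
    """Collect 'key : value' pairs from a kernel log excerpt (first hit wins)."""
    fields = {}
    for line in log_text.splitlines():
        m = FIELD_RE.match(line)
        if m:
            fields.setdefault(m.group("key"), m.group("value"))
    return fields

def diff_fields(a, b):
    """Return the fields present in both dumps whose values differ."""
    return {k: (a[k], b[k]) for k in a if k in b and a[k] != b[k]}

# Excerpts copied from the R35.2.1 and R35.3.1 logs in this thread.
R35_2_1 = """\
[    6.811179]  ErrLog1                 : 0x35162e
[    6.837529]    Address accessed      : 0x155f0124
[    6.845218]  ErrLog5                 : 0xb89f851
[    6.851780]    AXI ID                : 0x17
"""
R35_3_1 = """\
[    6.653281]  ErrLog1                 : 0x35162a
[    6.684040]    Address accessed      : 0x155f0124
[    6.688465]  ErrLog5                 : 0xa89f851
[    6.695372]    AXI ID                : 0x15
"""

first = parse_cbb_fields(R35_2_1)
second = parse_cbb_fields(R35_3_1)
diff = diff_fields(first, second)
for key in sorted(diff):
    print(key, diff[key])
```

In these excerpts the accessed address is identical in both releases, which suggests the upgrade did not change which register access is failing.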

Please refer to the discussion in this issue up to comment #13. Everything after #13 is about a different problem.

We added bootloader-status= disabled, but the problem still occurs!

Do you still see the boot logo when the device boots?

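I suggest you post the complete log. In cases like this it is usually one of the following:

  1. Your patch was not applied correctly.
  2. You are hitting a different problem.

Either way, we need to check the log.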

success.log (191.8 KB)
failed.log (84.5 KB)

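Hello,
The boot logo is no longer visible. The boot still occasionally Oopses. Logs of a successful boot and a failed boot are in the two attachments above.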

Hi @newbie.lei

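Could you convert the complete device tree from dtb back to dts and post it?
The register where this error occurs is a bit strange. 0x155f0124 belongs to dpaux3, which the default BSP should be using as a regular I2C.
So this problem should be unrelated to display.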
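The mapping from the faulting address to dpaux3 can be checked with a few lines of arithmetic. This is only a sketch: the base and size are taken from the reg property of the dpaux@155F0000 node in the decompiled dts posted later in this thread (reg = <0x0 0x155f0000 0x0 0x10000>), and the helper name is invented for illustration.

```python
# Register window of the dpaux3 node, copied from its reg property:
#   reg = <0x0 0x155f0000 0x0 0x10000>;
DPAUX3_BASE = 0x155F0000
DPAUX3_SIZE = 0x10000

def offset_in_window(addr, base, size):
    """Return addr's offset inside [base, base + size), or None if outside."""
    if base <= addr < base + size:
        return addr - base
    return None

# "Address accessed" from the CBB error dump.
faulting_addr = 0x155F0124

offset = offset_in_window(faulting_addr, DPAUX3_BASE, DPAUX3_SIZE)
print(hex(offset) if offset is not None else "outside dpaux3")
```

The address falls at offset 0x124 inside the dpaux3 window. The same value, 0x124, appears as the second cell of both prod entries in that node; whether that cell is a register offset is an assumption here, not something confirmed in this thread.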

I would also like to ask how the I2C pins are used on your board.

dt (388.4 KB)

The dts decompiled from the dtb is in the attachment above.
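For anyone who needs to do the same conversion, a minimal sketch using the standard dtc tool is shown below. The file names are hypothetical placeholders, and the thread does not say which exact command was used to produce the attachment.

```python
import shutil
import subprocess

def dtc_decompile_cmd(dtb_path, dts_path):
    """Build the dtc command line that converts a binary dtb into dts source."""
    return ["dtc", "-I", "dtb", "-O", "dts", "-o", dts_path, dtb_path]

def decompile(dtb_path, dts_path):
    """Run dtc if it is installed; raise if it is missing or the conversion fails."""
    if shutil.which("dtc") is None:
        raise FileNotFoundError("dtc not found; install device-tree-compiler")
    subprocess.run(dtc_decompile_cmd(dtb_path, dts_path), check=True)

# Hypothetical file names, for illustration only.
print(" ".join(dtc_decompile_cmd("kernel.dtb", "extracted.dts")))
```

Calling decompile("kernel.dtb", "extracted.dts") runs the printed command and writes the dts next to the dtb.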

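The board uses two I2C buses:

One is used as the channel for reading the HDMI EDID, and the other is used as the SMBus channel for the PCIe network card.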

Could you try disabling this node and test again?

dpaux@155F0000 {
			status = "okay";

Hello, after disabling the node as suggested above:

dpaux@155F0000 {
    status = "disabled";
    compatible = "nvidia,tegra194-dpaux3-padctl";
    reg = <0x0 0x155f0000 0x0 0x10000>;
    interrupts = <0x0 0xf6 0x4>;
    nvidia,dpaux-ctrlnum = <0x3>;
    clocks = <0x4 0xba>;
    clock-names = "dpaux3";
    resets = <0x4 0xb>;
    reset-names = "dpaux3";
    phandle = <0x264>;

    prod-settings {
        #prod-cells = <0x4>;

        prod_c_dpaux_hdmi {
            prod = <0x0 0x124 0x37fc 0x700>;
        };

        prod_c_dpaux_dp {
            prod = <0x0 0x124 0x37fe 0x24b2>;
        };
    };

    pinmux@0 {
        phandle = <0x28>;

        dpaux3_pins {
            pins = "dpaux3-3";
            function = "i2c";
        };
    };
};

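The board then booted normally more than 10 times in a row. As I understand it, dpaux3 is used for DP AUX CH3, but we do not use sor3, so why would it sometimes cause an Oops during boot?
Why would this I2C channel trigger a kernel Oops? Could you briefly explain the cause?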
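To confirm that the change really reached the running kernel, and not only the dts file, the live device tree can be inspected on the target. The sketch below walks the standard Linux path /sys/firmware/devicetree/base and reports the status of every node named dpaux@155F0000; where the node sits in the tree is not assumed, and the function name is made up for illustration.

```python
import os

def find_node_status(dt_root, node_name):
    """Return (path, status) for every node directory called node_name under dt_root."""
    results = []
    for dirpath, dirnames, filenames in os.walk(dt_root):
        if os.path.basename(dirpath).lower() == node_name.lower():
            # A node without a status property counts as enabled.
            status = "okay"
            if "status" in filenames:
                with open(os.path.join(dirpath, "status"), "rb") as f:
                    # Device tree strings are NUL-terminated.
                    status = f.read().rstrip(b"\0").decode()
            results.append((dirpath, status))
    return results

# On a machine without this node (or without a device tree) nothing is printed.
for path, status in find_node_status("/sys/firmware/devicetree/base", "dpaux@155F0000"):
    print(path, status)
```

If the node still reports "okay" after flashing, the modified dtb was probably not the one loaded at boot.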

This is actually the first time a user has reported this problem, so I am also curious about what is going on.

Can this problem be reproduced on the Xavier NX devkit? If so, roughly how many reboots does it take to see it once?

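So far it cannot be reproduced on the devkit carrier board, but both of the carrier boards we designed show this problem. Could it be caused by the power-on sequence or by electromagnetic interference?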

There has been no update from you for a while, so we assume this is no longer an issue.
Hence we are closing this topic. If you need further support, please open a new one.
Thanks

Is this still an issue that needs support? Are there any results you can share? Thanks