PCIe C7 kernel error problem

NVIDIA engineers:
Hello, I have run into a PCIe problem.

PCIe C7 is wired to an M.2 SSD slot.
MGBE with XFI-5G is used to drive a switch chip.
According to the NVIDIA manual, these two appear to be incompatible?

p3701.conf.common:
ODMDATA="gbe-uphy-config-22,hsstp-lane-map-3,nvhs-uphy-config-0,hsio-uphy-config-0";
I am using config-22 here.
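To make it easier to see which UPHY options a given ODMDATA line selects, here is a minimal Python sketch that splits the string into name/number pairs. The helper name `parse_odmdata` is made up for illustration and is not part of any NVIDIA tool; it only parses the text and assumes every option ends in a numeric suffix, as the four options above do.

```python
import re


def parse_odmdata(line: str) -> dict[str, int]:
    """Split an ODMDATA assignment into {option-name: numeric-suffix}."""
    # Accept straight or typographic quotes, since forum software often rewrites them.
    match = re.search(r'ODMDATA\s*=\s*["“”]([^"“”]*)["“”]', line)
    if match is None:
        raise ValueError("no quoted ODMDATA value found")
    options = {}
    for item in match.group(1).split(","):
        # "gbe-uphy-config-22" -> name "gbe-uphy-config", number "22"
        name, _, number = item.strip().rpartition("-")
        options[name] = int(number)
    return options


line = 'ODMDATA="gbe-uphy-config-22,hsstp-lane-map-3,nvhs-uphy-config-0,hsio-uphy-config-0";'
config = parse_odmdata(line)
print(config["gbe-uphy-config"])  # prints 22
```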

Device tree configuration:
pcie@141e0000 {
	status = "okay";
	num-lanes = <8>;

	phys = <&p2u_gbe_0>, <&p2u_gbe_1>, <&p2u_gbe_2>,
			<&p2u_gbe_3>, <&p2u_gbe_4>, <&p2u_gbe_5>,
			<&p2u_gbe_6>, <&p2u_gbe_7>;

	phy-names = "p2u-0", "p2u-1", "p2u-2", "p2u-3",
			"p2u-4", "p2u-5", "p2u-6", "p2u-7";
};
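When editing a node like this by hand, it is easy to leave `num-lanes`, `phys` and `phy-names` out of step. The following Python sketch counts the entries in each property so they can be compared. The function name `check_lane_consistency` is hypothetical, the regular expressions only handle a simple fragment like the one above, and passing this check says nothing about whether the UPHY configuration itself is supported.

```python
import re


def check_lane_consistency(node_text: str) -> tuple[int, int, int]:
    """Return (num_lanes, phys_count, phy_names_count) for one node's text."""
    num_lanes = int(re.search(r"num-lanes\s*=\s*<(\d+)>", node_text).group(1))
    # Everything between "phys =" and the terminating semicolon, across lines.
    phys_body = re.search(r"\bphys\s*=\s*([^;]*);", node_text).group(1)
    names_body = re.search(r"phy-names\s*=\s*([^;]*);", node_text).group(1)
    phys_count = len(re.findall(r"<&\w+>", phys_body))
    names_count = len(re.findall(r'"[^"]*"', names_body))
    return num_lanes, phys_count, names_count


node = """
pcie@141e0000 {
	status = "okay";
	num-lanes = <8>;
	phys = <&p2u_gbe_0>, <&p2u_gbe_1>, <&p2u_gbe_2>,
			<&p2u_gbe_3>, <&p2u_gbe_4>, <&p2u_gbe_5>,
			<&p2u_gbe_6>, <&p2u_gbe_7>;
	phy-names = "p2u-0", "p2u-1", "p2u-2", "p2u-3",
			"p2u-4", "p2u-5", "p2u-6", "p2u-7";
};
"""
lanes, phys, names = check_lane_consistency(node)
print(lanes == phys == names)  # prints True
```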

[Screenshot: 2024-02-29 09-58-55]

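The following error messages are printed during kernel boot.
Even after these errors appear, I can still mount the M.2 SSD and read and write it normally.

Do you have any suggestions for this problem?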
[ 10.720405] tegra194-pcie 141e0000.pcie: Using GICv2m MSI allocator
[ 10.794065] CPU:0, Error: cbb-fabric@0x13a00000, irq=34
[ 10.996001] **************************************
[ 11.001640] CPU:0, Error:cbb-fabric, Errmon:2
[ 11.006830] Error Code : TIMEOUT_ERR
[ 11.011569] Overflow : Multiple TIMEOUT_ERR
[ 11.016936]
[ 11.019161] Error Code : TIMEOUT_ERR
[ 11.023883] MASTER_ID : CCPLEX
[ 11.028070] Address : 0x3f60078
[ 11.037320] Cache : 0x1 -- Bufferable
[ 11.044986] Protection : 0x2 -- Unprivileged, Non-Secure, Data Access
[ 11.044986] Access_Type : Read
[ 11.049154] Access_ID : 0x14
[ 11.049155] Fabric : cbb-fabric
[ 11.057387] Slave_Id : 0x33
[ 11.061268] Burst_length : 0x0
[ 11.065413] Burst_type : 0x1
[ 11.069375] Beat_size : 0x2
[ 11.073246] VQC : 0x0
[ 11.076662] GRPSEC : 0x7e
[ 11.080345] FALCONSEC : 0x0
[ 11.084200] **************************************
[ 11.089854] ------------[ cut here ]------------
[ 11.095232] WARNING: CPU: 0 PID: 0 at drivers/soc/tegra/cbb/tegra234-cbb.c:577 tegra234_cbb_isr+0x130/0x170
[ 11.105877] Modules linked in:
[ 11.109660] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.10.120-tegra #65
[ 11.117198] Hardware name: Unknown Jetson AGX Orin Developer Kit/Jetson AGX Orin Developer Kit, BIOS 4.1-33958178 08/01/2023
[ 11.129398] pstate: 60400089 (nZCv daIf +PAN -UAO -TCO BTYPE=--)
[ 11.136247] pc : tegra234_cbb_isr+0x130/0x170
[ 11.141384] lr : tegra234_cbb_isr+0x10c/0x170
[ 11.146510] sp : ffff800010003e10
[ 11.150571] x29: ffff800010003e10 x28: ffffc34477612700
[ 11.156664] x27: 0000000000000001 x26: 0000000000000080
[ 11.162748] x25: ffffc3447702e9c8 x24: ffffc3447797ae40
[ 11.168826] x23: ffffc34477317008 x22: 0000000000000022
[ 11.174884] x21: ffffc3447779f548 x20: 0000000000000002
[ 11.180926] x19: ffffc3447779f538 x18: 0000000000000000
[ 11.186945] x17: 0000000000aaaaaa x16: 0000000000000000
[ 11.192953] x15: 0000000000002000 x14: 00000000ffffffff
[ 11.198956] x13: 0000000000000001 x12: 0000000000000001
[ 11.204953] x11: ffffc34476beb7c8 x10: 0000000000000000
[ 11.210947] x9 : 0000000000000008 x8 : ffff80002a860020
[ 11.216949] x7 : 0000000000000001 x6 : c0000000ffffefff
[ 11.222953] x5 : 0000000000057fa8 x4 : ffffc34477627ac8
[ 11.228945] x3 : 0000000000000000 x2 : ffffc34475aa0d70
[ 11.234931] x1 : ffffc34477612700 x0 : 0000000100010001
[ 11.240916] Call trace:
[ 11.243937] tegra234_cbb_isr+0x130/0x170
[ 11.248590] __handle_irq_event_percpu+0x68/0x2a0
[ 11.253937] handle_irq_event_percpu+0x40/0xa0
[ 11.259029] handle_irq_event+0x50/0xf0
[ 11.263485] handle_fasteoi_irq+0xc0/0x170
[ 11.268205] generic_handle_irq+0x40/0x60
[ 11.272828] __handle_domain_irq+0x70/0xd0
[ 11.277537] gic_handle_irq+0x68/0x134
[ 11.281896] el1_irq+0xd0/0x180
[ 11.285604] cpuidle_enter_state+0xb8/0x410
[ 11.290397] cpuidle_enter+0x40/0x60
[ 11.294567] call_cpuidle+0x44/0x80
[ 11.298642] do_idle+0x208/0x270
[ 11.302445] cpu_startup_entry+0x2c/0x70
[ 11.306957] rest_init+0xdc/0xe8
[ 11.310758] arch_call_rest_init+0x18/0x20
[ 11.315442] start_kernel+0x500/0x538
[ 11.319679] ---[ end trace 44a0e18278d320aa ]---
[ 11.324924] tegra194-pcie 141e0000.pcie: host bridge /pcie@141e0000 ranges:

Here is the detailed log:
errlog.txt (91.1 KB)
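For anyone comparing several boots, a small Python sketch like the one below can pull the key/value fields out of the first CBB error block in a dmesg capture. The helper `parse_cbb_fields` is hypothetical and assumes the layout seen in the log above: a timestamp, a field name, a colon surrounded by whitespace, then the value, all between two lines of asterisks. It extracts the reported values and does not interpret them.

```python
import re

# Timestamp, then "name : value"; the colon must have whitespace on both sides,
# which skips summary lines such as "CPU:0, Error:cbb-fabric, Errmon:2".
FIELD = re.compile(r"^\[\s*\d+\.\d+\]\s+(?P<key>[A-Za-z_ ]+?)\s+:\s+(?P<value>.+?)\s*$")


def parse_cbb_fields(log_text: str) -> dict[str, str]:
    """Collect fields from the first block delimited by lines of asterisks."""
    fields = {}
    inside = False
    for line in log_text.splitlines():
        if "*****" in line:
            if inside:
                break  # closing delimiter of the first block
            inside = True
            continue
        if inside:
            match = FIELD.match(line)
            if match:
                fields[match.group("key")] = match.group("value")
    return fields


sample = """
[ 10.996001] **************************************
[ 11.001640] CPU:0, Error:cbb-fabric, Errmon:2
[ 11.006830] Error Code : TIMEOUT_ERR
[ 11.023883] MASTER_ID : CCPLEX
[ 11.028070] Address : 0x3f60078
[ 11.049154] Access_Type : Read
[ 11.084200] **************************************
"""
fields = parse_cbb_fields(sample)
print(fields["Error Code"], fields["Address"])  # prints TIMEOUT_ERR 0x3f60078
```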

If you want to use PCIe, you should use gbe-uphy-config-0.

But with gbe-uphy-config-0 I cannot use MGBE.

You can only pick one of the two for this part to begin with.

Based on actual test results:
With gbe-uphy-config-22 set, the errors above are printed, but PCIe can still access the M.2 SSD. These errors do not seem to affect PCIe C7? My M.2 only uses lane0-lane3.
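One way to confirm how many lanes the M.2 link actually negotiated is to read the standard PCI sysfs attributes `current_link_width` and `max_link_width` for the device. The Python sketch below does that; the helper name `read_link_status` is made up, and the device address in the example call is a hypothetical placeholder that has to be replaced with the one `lspci -D` reports on the board.

```python
from pathlib import Path


def read_link_status(device_dir: str) -> dict[str, str]:
    """Read negotiated and maximum PCIe link parameters for one device."""
    names = ("current_link_width", "max_link_width",
             "current_link_speed", "max_link_speed")
    status = {}
    for name in names:
        path = Path(device_dir) / name
        if path.exists():  # attributes are absent if the path is not a PCIe device
            status[name] = path.read_text().strip()
    return status


# Hypothetical address: substitute the domain:bus:device.function of the SSD.
print(read_link_status("/sys/bus/pci/devices/0007:01:00.0"))
```

A `current_link_width` of 4 would match the statement that only lane0-lane3 are in use, though it does not by itself explain the CBB timeout.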

I also saw this item in the manual. Would changing PCIe to x4 eliminate these error messages?
Can this be set? If so, where?

No. The only configs we can support are the two above.

Config 22 is exactly PCIe x4 C7 + MGBE.

This configuration meets my needs, but it produces the kernel error messages.

Changing the device tree to the following did not solve the problem.
pcie@141e0000 {
	status = "okay";
	num-lanes = <4>;

	phys = <&p2u_gbe_0>, <&p2u_gbe_1>, <&p2u_gbe_2>,
			<&p2u_gbe_3>;

	phy-names = "p2u-0", "p2u-1", "p2u-2", "p2u-3";
};

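Let me explain...

Here is the current situation: all we can support is the x8 or the MGBE option in this table.

The SoC itself has other configurations that can work, but our software does not support them.
You may be able to put together some setting that happens to work, but we cannot help you with the errors you run into.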
