Kernel BUG at net/core/skbuff.c:1871!

[ 1003.211755] kernel BUG at net/core/skbuff.c:1871!
[ 1003.216619] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[ 1003.222265] Modules linked in: fuse(E) nvidia_modeset(OE) xt_conntrack(E) xt_MASQUERADE(E) nf_conntrack_netlink(E) nfnetlink(E) xt_addrtype(E) iptable_filter(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) libcrc32c(E) br_netfilter(E) lzo_rle(E) lzo_compress(E) zram(E) overlay(E) bnep(E) ramoops(E) reed_solomon(E) iwlmvm(E) mac80211(E) tag_ksz(E) snd_soc_tegra186_dspk(E) snd_soc_tegra186_asrc(E) ksz9477_spi(E) snd_soc_tegra210_iqc(E) snd_soc_tegra210_ope(E) snd_soc_tegra186_arad(E) snd_soc_tegra210_mvc(E) snd_soc_tegra210_afc(E) ksz9477(E) aes_ce_blk(E) snd_soc_tegra210_dmic(E) crypto_simd(E) ksz_common(E) snd_soc_tegra210_adx(E) cryptd(E) mcp251xfd dsa_core(E) ftdi_sio(E) aes_ce_cipher(E) iwlwifi(E) snd_soc_tegra210_amx(E) btusb(E) ghash_ce(E) usbserial(E) btrtl(E) sha2_ce(E) snd_soc_tegra210_adsp(E) snd_soc_tegra210_admaif(E) snd_soc_tegra210_i2s(E) can_dev(E) snd_soc_tegra210_sfc(E) snd_soc_tegra210_mixer(E) sha256_arm64(E) btbcm(E) snd_soc_tegra_pcm(E)
[ 1003.222386] snd_soc_tegra_machine_driver(E) sha1_ce(E) snd_hda_codec_hdmi(E) btintel(E) cfg80211(E) snd_soc_tegra_utils(E) nvadsp(E) snd_soc_tegra210_ahub(E) snd_hda_tegra(E) ucsi_ccg(E) nct1008(E) snd_soc_simple_card_utils(E) tegra_bpmp_thermal(E) snd_soc_spdif_tx(E) tegra210_adma(E) snd_hda_codec(E) typec_ucsi(E) userspace_alert(E) typec(E) snd_hda_core(E) snd_soc_rt5640(E) snd_soc_rl6231(E) nvidia(OE) loop(E) spi_tegra114(E) binfmt_misc(E) ina3221(E) pwm_fan(E) nvgpu(E) nvmap(E) ip_tables(E) x_tables(E) [last unloaded: mtd]
[ 1003.362282] CPU: 4 PID: 3022 Comm: sshd Tainted: G W OE 5.10.104-tegra #14
[ 1003.370419] Hardware name: Unknown Jetson AGX Orin/Jetson AGX Orin, BIOS 2.1-32413640 01/24/2023
[ 1003.379462] pstate: 60400009 (nZCv daif +PAN -UAO -TCO BTYPE=--)
[ 1003.385657] pc : skb_put+0x6c/0xc0
[ 1003.389157] lr : skb_put+0x28/0xc0
[ 1003.392654] sp : ffff8000228cb4f0
[ 1003.396062] x29: ffff8000228cb4f0 x28: 0000000000000000
[ 1003.401522] x27: ffff27b548eb16e0 x26: ffff27b5088ae000
[ 1003.406986] x25: ffffadaee0b98998 x24: ffff27b5003e6e80
[ 1003.412448] x23: ffffadaee08bcd40 x22: 0000000000000000
[ 1003.417909] x21: ffffadae94a3c1b0 x20: 0000000000000001
[ 1003.423372] x19: ffff27b548eb16e0 x18: ffff27b5466432c0
[ 1003.428836] x17: 0000000000000000 x16: ffffadaedfdadf20
[ 1003.434296] x15: 00000000fd1b40d3 x14: 0000000000000029
[ 1003.439756] x13: fffffe9ed502a800 x12: 0000000000000000
[ 1003.445220] x11: 0000000000010002 x10: 0000000000000001
[ 1003.450683] x9 : 0000000000000000 x8 : 0000000000000000
[ 1003.456141] x7 : 0001000000010000 x6 : ffff27b53fcda700
[ 1003.461601] x5 : 0000000000000001 x4 : 000000000000ffff
[ 1003.467068] x3 : 0000000000000001 x2 : 0000000000000140
[ 1003.472530] x1 : 0000000000000029 x0 : ffff27b53fcda140
[ 1003.477992] Call trace:
[ 1003.480511] skb_put+0x6c/0xc0
[ 1003.483667] ksz9893_xmit+0x30/0xa0 [tag_ksz]
[ 1003.488165] dsa_slave_xmit+0x114/0x240 [dsa_core]
[ 1003.493097] dev_hard_start_xmit+0x10c/0x330
[ 1003.497487] __dev_queue_xmit+0x83c/0xab0
[ 1003.501606] dev_queue_xmit+0x28/0x40
[ 1003.505369] neigh_resolve_output+0x108/0x1a0
[ 1003.509862] ip_finish_output2+0x15c/0x590
[ 1003.514075] __ip_finish_output+0xf0/0x260
[ 1003.518280] ip_output+0x104/0x1c0
[ 1003.521777] ip_local_out+0x5c/0x70
[ 1003.525372] __ip_queue_xmit+0x144/0x3b0
[ 1003.529402] ip_queue_xmit+0x3c/0x50
[ 1003.533078] __tcp_transmit_skb+0x474/0xb10
[ 1003.537382] tcp_write_xmit+0x38c/0x10d0
[ 1003.541415] __tcp_push_pending_frames+0x5c/0xf0
[ 1003.546164] tcp_push+0x118/0x1c0
[ 1003.549574] tcp_sendmsg_locked+0xb84/0xcc0
[ 1003.553879] tcp_sendmsg+0x44/0x70
[ 1003.557381] inet_sendmsg+0x50/0x80
[ 1003.560983] sock_sendmsg+0x58/0x70
[ 1003.564574] sock_write_iter+0x98/0x100
[ 1003.568524] new_sync_write+0x190/0x1a0
[ 1003.572463] vfs_write+0x25c/0x390
[ 1003.575963] ksys_write+0xf0/0x110
[ 1003.579462] __arm64_sys_write+0x28/0x40
[ 1003.583501] el0_svc_common.constprop.0+0x80/0x1d0
[ 1003.588431] do_el0_svc+0x38/0xb0
[ 1003.591845] el0_svc+0x1c/0x30
[ 1003.594987] el0_sync_handler+0xa8/0xb0
[ 1003.598944] el0_sync+0x16c/0x180
[ 1003.602355] Code: a94153f3 f94013f5 a8c37bfd d65f03c0 (d4210000)
[ 1003.608653] ---[ end trace a9061de13ba93950 ]---
[ 1003.618926] Kernel panic - not syncing: Oops - BUG: Fatal exception in interrupt
[ 1003.626534] SMP: stopping secondary CPUs
[ 1003.630858] Kernel Offset: 0x2daecee80000 from 0xffff800010000000
[ 1003.637119] PHYS_OFFSET: 0xffffd84c00000000
[ 1003.641424] CPU features: 0x0040006,4a80aa38
[ 1003.645822] Memory Limit: none
[ 1003.654421] ---[ end Kernel panic - not syncing: Oops - BUG: Fatal exception in interrupt ]---

eth0: ethernet@2310000 {
	nvidia,pause_frames = <0>;
	phy_mode = "rgmii";
	status = "okay";
	nvidia,max-platform-mtu = <16383>;
	fixed-link {                                                                              
		speed = <1000>;
		full-duplex;
	};
};

[ 15.408916] CPU:0, Error: cbb-fabric@0x13a00000, irq=33
[ 15.414365] **************************************
[ 15.419293] CPU:0, Error:cbb-fabric, Errmon:2
[ 15.423770] Error Code : TIMEOUT_ERR
[ 15.427801] Overflow : Multiple TIMEOUT_ERR
[ 15.432467]
[ 15.433995] Error Code : TIMEOUT_ERR
[ 15.438024] MASTER_ID : CCPLEX
[ 15.441505] Address : 0x23de008
[ 15.445093] Cache : 0x1 -- Bufferable
[ 15.449385] Protection : 0x2 -- Unprivileged, Non-Secure, Data Access
[ 15.456366] Access_Type : Read
[ 15.459861] Access_ID : 0x15
[ 15.459862] Fabric : cbb-fabric
[ 15.466770] Slave_Id : 0x35
[ 15.469999] Burst_length : 0x0
[ 15.473494] Burst_type : 0x1
[ 15.476816] Beat_size : 0x2
[ 15.480045] VQC : 0x0
[ 15.482834] GRPSEC : 0x7e
[ 15.485890] FALCONSEC : 0x0
[ 15.489119] **************************************
[ 15.494151] WARNING: CPU: 0 PID: 0 at drivers/soc/tegra/cbb/tegra234-cbb.c:577 tegra234_cbb_isr+0x130/0x170
[ 15.504343] ---[ end trace a9061de13ba9394f ]---
[ 15.629913] using random self ethernet address
[ 15.634597] using random host ethernet address

Please provide the following information every time you post a question:

  1. Devkit or custom board?

  2. Which JetPack version?

  3. How to reproduce the issue

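1. A custom board of our own design, using the KSZ9893 switch chip.
2. Jetson_Linux_R35.2.1_aarch64
3. It happens every time.
4. The relevant device tree configuration is as follows: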
chosen {
	nvidia,ether-mac0 = "48:B0:2D:5D:16:18";
	nvidia,ether-mac = "48:B0:2D:5D:16:18";
};

eth0: ethernet@2310000 {
	nvidia,pause_frames = <0>;
	phy_mode = "rgmii";
	status = "okay";
	nvidia,max-platform-mtu = <16383>;
	fixed-link {                                                                              
		speed = <1000>;
		full-duplex;
	};
};
spi1: spi@c260000 {
	status = "okay";
	spi-max-frequency = <25000000>;
	ksz9893: ksz9893@0 {
		compatible = "microchip,ksz9893";
		spi-max-frequency = <10000000>;
		//spi-max-frequency = <44000000>;
		reg = <0x0>;
		phy_mode = "rgmii-txid";
		interrupt-parent = <&tegra_main_gpio>;
		interrupts = <TEGRA234_MAIN_GPIO(G, 4) IRQ_TYPE_LEVEL_LOW>;
		status = "okay";
		ports {
			#address-cells = <1>;
			#size-cells = <0>;				
			port@0 {
				reg = <0>;
				label = "lan1";
			};
			port@1 {
				reg = <1>;
				label = "lan2";
			};
			port@2 {
				reg = <2>;
				lable = "cpu";
				ethernet = <&eth0>;
				fixed-link {
					speed = <1000>;
					full-duplex;
				};
			};
		};
	};	
};

Please provide the full log, uploaded as an attachment.

log.log (430.7 KB)


Some of your device tree changes look incorrect and may not get parsed.
For example, the documentation uses phy-mode = "rgmii-id":

https://docs.nvidia.com/jetson/archives/r35.3.1/DeveloperGuide/text/HR/JetsonModuleAdaptationAndBringUp/JetsonAgxOrinSeries.html?highlight=rgmii#for-rgmii

How should this mode be changed?

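What I mean is that your property strings do not match the documentation. It looks like the driver has never actually been picking up the rgmii mode.

For example, you wrote "phy_mode", but the driver reads "phy-mode".

Of course, I can't guarantee this is the cause, but obvious mistakes like this should be fixed first.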
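
To make the naming point concrete, here is a minimal sketch of the posted eth0 node with only the property name changed to the hyphenated form. The value "rgmii-id" is taken from the linked guide and is an assumption for this custom board: which RGMII delay variant is correct depends on where the board and the switch add the clock delays. The ksz9893 node in the posted tree uses the same underscore spelling (phy_mode = "rgmii-txid") and would need the same check.

```dts
eth0: ethernet@2310000 {
	nvidia,pause_frames = <0>;
	/* Hyphenated property name; the posted tree had phy_mode = "rgmii". */
	/* "rgmii-id" follows the linked guide and is assumed, not verified, for this board. */
	phy-mode = "rgmii-id";
	status = "okay";
	nvidia,max-platform-mtu = <16383>;
	fixed-link {
		speed = <1000>;
		full-duplex;
	};
};
```

After rebuilding and flashing the DTB, the running tree can be inspected under /proc/device-tree to confirm that the renamed property is actually present.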

OK, I'll make the change as you suggested and test it.

I have made the change, but the test result is still the same.

Hello, what is causing the kernel crash, and is there a fix?

Hello, how can this problem be solved? I need your help analyzing it. Thank you.

Hello, how should I go about solving this?

We don't have experience with the KSZ9893. Have you contacted the vendor for support?

I have, but the code where the kernel is crashing is not KSZ9893 driver code.
Also, your chip prints the following during boot:
560] CPU:0, Error: cbb-fabric@0x13a00000, irq=32
[ 16.876986] **************************************
[ 16.881910] CPU:0, Error:cbb-fabric, Errmon:2
[ 16.886386] Error Code : TIMEOUT_ERR
[ 16.890408] Overflow : Multiple TIMEOUT_ERR

[ 16.896595] Error Code : TIMEOUT_ERR
[ 16.900621] MASTER_ID : CCPLEX
[ 16.904113] Address : 0x23de008
[ 16.907698] Cache : 0x1 -- Bufferable
[ 16.911998] Protection : 0x2 -- Unprivileged, Non-Secure, Data Access
[ 16.918975] Access_Type : Read
[ 16.922469] Access_ID : 0x17
[ 16.922470] Fabric : cbb-fabric
[ 16.929372] Slave_Id : 0x35
[ 16.932600] Burst_length : 0x0
[ 16.936092] Burst_type : 0x1
[ 16.939410] Beat_size : 0x2
[ 16.942640] VQC : 0x0
[ 16.945413] GRPSEC : 0x7e
[ 16.948470] FALCONSEC : 0x0
[ 16.951686] **************************************
[ 16.956704] ------------[ cut here ]------------
[ 16.956716] WARNING: CPU: 0 PID: 321 at drivers/soc/tegra/cbb/tegra234-cbb.c:577 tegra234_cbb_isr+0x130/0x170
I don't know whether this is related to the kernel crash.

What we see so far is that the crash is a kernel BUG at net/core/skbuff.c, but it is not clear what triggers it.

Hello, is there any direction we can take for the analysis?

Hello, do you have any suggestions on where to look for the problem?

Hello, I need your help. What is causing your kernel to crash?