NETDEV WATCHDOG: eth1 (r8169): transmit queue 0 timed out

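Hello NVIDIA team, we have run into a network problem while using your product and would like some guidance on how to approach it.
Module: Orin Nano
System version: Jetson Linux 35.3.1 (JetPack 5.1.1)
Carrier board: custom-built
Release source: Jetson Linux 35.3.1 | NVIDIA Developer
Network interface in use: eth1 (its details were posted as a screenshot)
Network connection: direct cable link to the Ethernet port of another board, so external factors should have little effect
Problem: our own service runs continuously with ping running alongside it. Downloads occasionally fail (the Orin Nano downloads images from the other board over the network), and ping occasionally drops packets.
The download failures and ping packet loss on the Orin Nano module were posted as screenshots.
Judging by the timestamps, ping drops packets at the same moments the downloads fail.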



Checking dmesg on the Orin Nano shows the following error:

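Additional notes: this setup previously used a Xavier NX module with JetPack 5.1.1 and the same service, and showed no packet loss. The only change was swapping in an Orin Nano (also on JetPack 5.1.1), after which ping started dropping packets.
Question: the carrier board had no problems with the Xavier NX, but now shows packet loss with the Orin Nano. Are the Xavier NX and Orin Nano compatible? On a carrier board originally built for the Xavier NX, what needs to be changed for the Orin Nano to resolve this packet loss?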

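We have updated the r8169 driver for this issue, but the patch is too large to share, so we can only suggest upgrading directly to the latest JetPack 5 release.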

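Thanks for the support! To keep changes to a minimum we stayed on JetPack 5.1.1, but deleted the r8169 driver so that the r8168 driver is used instead. After 19 hours of testing so far there have been no download failures. Rough steps:

  1. Delete the r8169 driver module
    cd /lib/modules/5.10.104-tegra/kernel/drivers/net/ethernet/realtek
    rm r8169.ko
  2. Reboot the board and check with ethtool that the r8168 driver is in use (screenshot)
  3. Test results (screenshot)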
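
As a rough illustration of the check in step 2, here is a minimal Python sketch that reads the bound driver from sysfs instead of a screenshot. It assumes the standard Linux layout in which `/sys/class/net/<iface>/device/driver` is a symlink to the driver directory. The function name `bound_driver` and its `sysfs_root` parameter are hypothetical and are not part of any NVIDIA or Realtek tooling.

```python
import os


def bound_driver(iface, sysfs_root="/sys/class/net"):
    """Return the name of the kernel driver bound to `iface`, or None.

    Assumption: <sysfs_root>/<iface>/device/driver is a symlink whose
    last path component is the driver name (e.g. r8168 or r8169).
    """
    link = os.path.join(sysfs_root, iface, "device", "driver")
    try:
        return os.path.basename(os.readlink(link))
    except OSError:
        # Interface missing, virtual device, or no driver bound.
        return None


# eth1 is the interface named in the original post.
print(bound_driver("eth1"))
```

After the reboot this would be expected to report r8168 rather than r8169 if the swap took effect; `ethtool -i eth1` gives the same information along with the driver version.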

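We have run into the same problem. Our module is the Orin NX 8GB on a custom carrier board. The system is Jetson Linux (Ubuntu) with kernel 5.10.120-tegra. The NIC driver is r8168, version 8.051.02-NAPI. The system log when the network disconnects is as follows: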
Sep 23 17:59:19 ubuntu kernel: [26331.835328] ------------[ cut here ]------------
Sep 23 17:59:19 ubuntu kernel: [26331.840115] NETDEV WATCHDOG: eth0 (r8168): transmit queue 0 timed out
Sep 23 17:59:19 ubuntu kernel: [26331.840144] WARNING: CPU: 2 PID: 0 at net/sched/sch_generic.c:467 dev_watchdog+0x3b4/0x3c0
Sep 23 17:59:19 ubuntu kernel: [26331.848650] Modules linked in: veth nvidia_modeset(O) fuse xt_nat xt_tcpudp xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c xt_addrtype iptable_filter br_netfilter mttcan can_raw can_dev can lzo_rle lzo_compress zram overlay ramoops reed_solomon loop snd_soc_tegra186_dspk snd_soc_tegra186_asrc snd_soc_tegra210_ope snd_soc_tegra186_arad snd_soc_tegra210_iqc snd_soc_tegra210_mvc snd_soc_tegra210_afc snd_soc_tegra210_dmic snd_soc_tegra210_mixer snd_soc_tegra210_i2s snd_soc_tegra210_amx snd_soc_tegra210_adx snd_soc_tegra210_sfc snd_soc_tegra210_admaif snd_soc_tegra_pcm aes_ce_blk crypto_simd cryptd aes_ce_cipher ghash_ce sha2_ce sha256_arm64 sha1_ce snd_soc_spdif_tx snd_soc_tegra_machine_driver userspace_alert snd_soc_tegra210_adsp snd_soc_tegra_utils snd_soc_tegra210_ahub tegra_bpmp_thermal fusb301 snd_soc_simple_card_utils snd_hda_codec_hdmi nvadsp tegra210_adma snd_hda_tegra r8168 snd_hda_codec snd_hda_core
Sep 23 17:59:19 ubuntu kernel: [26331.848730] nv_imx219 spi_tegra114 nvidia(O) binfmt_misc ina3221 pwm_fan nvgpu nvmap ip_tables x_tables [last unloaded: mtd]
Sep 23 17:59:19 ubuntu kernel: [26331.848747] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G O 5.10.120-tegra #1
Sep 23 17:59:19 ubuntu kernel: [26331.848749] Hardware name: Unknown NVIDIA Orin NX Developer Kit/NVIDIA Orin NX Developer Kit, BIOS 4.1-33958178 08/01/2023
Sep 23 17:59:19 ubuntu kernel: [26331.848752] pstate: 60400009 (nZCv daif +PAN -UAO -TCO BTYPE=--)
Sep 23 17:59:19 ubuntu kernel: [26331.848754] pc : dev_watchdog+0x3b4/0x3c0
Sep 23 17:59:19 ubuntu kernel: [26331.848755] lr : dev_watchdog+0x3b4/0x3c0
Sep 23 17:59:19 ubuntu kernel: [26331.848756] sp : ffff800010013d50
Sep 23 17:59:19 ubuntu kernel: [26331.848757] x29: ffff800010013d50 x28: 0000000000000001
Sep 23 17:59:19 ubuntu kernel: [26331.848759] x27: 0000000000000004 x26: 0000000000000140
Sep 23 17:59:19 ubuntu kernel: [26331.848761] x25: ffff00104ba2c280 x24: 00000000ffffffff
Sep 23 17:59:19 ubuntu kernel: [26331.848764] x23: ffff0010508583dc x22: ffffa06092736000
Sep 23 17:59:19 ubuntu kernel: [26331.848766] x21: ffff001050858000 x20: ffff001050858480
Sep 23 17:59:19 ubuntu kernel: [26331.848768] x19: 0000000000000000 x18: 0000000000000000
Sep 23 17:59:19 ubuntu kernel: [26331.848770] x17: 0000000000000000 x16: ffffa06090a35220
Sep 23 17:59:19 ubuntu kernel: [26331.848772] x15: ffff0010401d4070 x14: ffffffffffffffff
Sep 23 17:59:19 ubuntu kernel: [26331.848774] x13: ffffa06092a58de8 x12: ffffa06092a58a38
Sep 23 17:59:19 ubuntu kernel: [26331.848776] x11: 000100000000005c x10: 0000000000000001
Sep 23 17:59:19 ubuntu kernel: [26331.848778] x9 : 00000000fffffffe x8 : 2030206575657571
Sep 23 17:59:19 ubuntu kernel: [26331.848781] x7 : 2074696d736e6172 x6 : c0000000ffffefff
Sep 23 17:59:19 ubuntu kernel: [26331.848783] x5 : ffff0011ae7f6958 x4 : ffffa06092757a48
Sep 23 17:59:19 ubuntu kernel: [26331.848785] x3 : 0000000000000001 x2 : ffff0011ae7f6960
Sep 23 17:59:19 ubuntu kernel: [26331.848787] x1 : 0000000000000000 x0 : 0000000000000000
Sep 23 17:59:19 ubuntu kernel: [26331.848789] Call trace:
Sep 23 17:59:19 ubuntu kernel: [26331.848792] dev_watchdog+0x3b4/0x3c0
Sep 23 17:59:19 ubuntu kernel: [26331.848796] call_timer_fn+0x3c/0x200
Sep 23 17:59:19 ubuntu kernel: [26331.848798] run_timer_softirq+0x50c/0x5e0
Sep 23 17:59:19 ubuntu kernel: [26331.848801] __do_softirq+0x140/0x3e8
Sep 23 17:59:19 ubuntu kernel: [26331.848807] irq_exit+0xc0/0xe0
Sep 23 17:59:19 ubuntu kernel: [26331.848812] __handle_domain_irq+0x74/0xd0
Sep 23 17:59:19 ubuntu kernel: [26331.848813] gic_handle_irq+0x68/0x134
Sep 23 17:59:19 ubuntu kernel: [26331.848815] el1_irq+0xd0/0x180
Sep 23 17:59:19 ubuntu kernel: [26331.848820] cpuidle_enter_state+0xb8/0x410
Sep 23 17:59:19 ubuntu kernel: [26331.848822] cpuidle_enter+0x40/0x60
Sep 23 17:59:19 ubuntu kernel: [26331.848825] call_cpuidle+0x44/0x80
Sep 23 17:59:19 ubuntu kernel: [26331.848826] do_idle+0x208/0x270
Sep 23 17:59:19 ubuntu kernel: [26331.848827] cpu_startup_entry+0x30/0x70
Sep 23 17:59:19 ubuntu kernel: [26331.848830] secondary_start_kernel+0x14c/0x170
Sep 23 17:59:19 ubuntu kernel: [26331.848832] ---[ end trace ec14e26d89578865 ]---
Sep 23 17:59:23 ubuntu kernel: [26335.654138] r8168: eth0: link up
Sep 23 17:59:23 ubuntu NetworkManager[596]: [1727085563.2699] device (eth0): carrier: link connected
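We tried updating the r8168 driver to the latest version, 8.053.00-NAPI, but that did not solve the problem. The fault also occurs frequently, anywhere from a few times to nearly a hundred times per day, and the heavier the network traffic, the more often it occurs.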
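
To put numbers on the failure frequency, here is a minimal Python sketch that counts watchdog timeouts per day, interface, and driver from syslog-style lines like the ones in the log above. It assumes the line format shown in that log (syslog timestamp followed by the `NETDEV WATCHDOG` message). The function name `count_watchdog_timeouts` is hypothetical.

```python
import re
from collections import Counter

# Matches lines such as:
#   Sep 23 17:59:19 ubuntu kernel: [...] NETDEV WATCHDOG: eth0 (r8168): transmit queue 0 timed out
WATCHDOG_RE = re.compile(
    r"^(?P<month>\w{3})\s+(?P<day>\d{1,2})\s.*"
    r"NETDEV WATCHDOG: (?P<iface>\S+) \((?P<driver>[^)]+)\): "
    r"transmit queue (?P<queue>\d+) timed out"
)


def count_watchdog_timeouts(lines):
    """Count watchdog timeouts keyed by (date, interface, driver)."""
    counts = Counter()
    for line in lines:
        m = WATCHDOG_RE.search(line)
        if m:
            date = f"{m['month']} {int(m['day'])}"
            counts[(date, m["iface"], m["driver"])] += 1
    return counts


sample = [
    "Sep 23 17:59:19 ubuntu kernel: [26331.840115] NETDEV WATCHDOG: eth0 (r8168): transmit queue 0 timed out",
    "Sep 23 17:59:23 ubuntu kernel: [26335.654138] r8168: eth0: link up",
]
for key, n in count_watchdog_timeouts(sample).items():
    print(key, n)
```

Feeding it the lines of the system log (for example an open file handle on the syslog file) gives a per-day count that can be compared across driver versions or traffic levels.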
