Jetson Orin NX 1 hour boot time

Hello. Yesterday my device appeared to take a full hour to boot. It looks like the NETDEV watchdog timed out, and the system took a long time to recover from this. I depend on the hardware being accessible via SSH, but it did not come online for a whole hour. Here are my systemd journal logs.

Dec 31 18:00:55 guardian6-desktop kernel: eth0: renamed from veth311616d
Dec 31 18:00:55 guardian6-desktop kernel: IPv6: ADDRCONF(NETDEV_CHANGE): veth2bef8bc: link becomes ready
Dec 31 18:01:12 guardian6-desktop kernel: r8168: eth0: link up
Dec 31 18:01:12 guardian6-desktop kernel: IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Mar 13 16:14:09 guardian6-desktop kernel: ------------[ cut here ]------------
Mar 13 16:14:09 guardian6-desktop kernel: NETDEV WATCHDOG: eth0 (r8168): transmit queue 0 timed out
Mar 13 16:14:09 guardian6-desktop kernel: WARNING: CPU: 2 PID: 0 at net/sched/sch_generic.c:467 dev_watchdog+0x3d4/0x3e0
Mar 13 16:14:09 guardian6-desktop kernel: Modules linked in: veth xt_nat xt_tcpudp nvidia_modeset(O) xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat nf_nat nf_conntra>
Mar 13 16:14:09 guardian6-desktop kernel:  nvidia(O) ina3221 pwm_fan nvgpu nvmap ip_tables x_tables [last unloaded: mtd]
Mar 13 16:14:09 guardian6-desktop kernel: CPU: 2 PID: 0 Comm: swapper/2 Tainted: G           O      5.10.192-tegra #1
Mar 13 16:14:09 guardian6-desktop kernel: Hardware name: NVIDIA Orin NX Developer Kit (DT)
Mar 13 16:14:09 guardian6-desktop kernel: pstate: 60400009 (nZCv daif +PAN -UAO -TCO BTYPE=--)
Mar 13 16:14:09 guardian6-desktop kernel: pc : dev_watchdog+0x3d4/0x3e0
Mar 13 16:14:09 guardian6-desktop kernel: lr : dev_watchdog+0x3d4/0x3e0
Mar 13 16:14:09 guardian6-desktop kernel: sp : ffff800010013d50
Mar 13 16:14:09 guardian6-desktop kernel: x29: ffff800010013d50 x28: 0000000000000001 
Mar 13 16:14:09 guardian6-desktop kernel: x27: 0000000000000004 x26: 0000000000000140 
Mar 13 16:14:09 guardian6-desktop kernel: x25: ffff2d3d8af40680 x24: 00000000ffffffff 
Mar 13 16:14:09 guardian6-desktop kernel: x23: ffff2d3d8d7f83dc x22: ffffbdbdefb86000 
Mar 13 16:14:09 guardian6-desktop kernel: x21: ffff2d3d8d7f8000 x20: ffff2d3d8d7f8480 
Mar 13 16:14:09 guardian6-desktop kernel: x19: 0000000000000000 x18: 0000000000000000 
Mar 13 16:14:09 guardian6-desktop kernel: x17: 0000000000000000 x16: ffffbdbdede850f0 
Mar 13 16:14:09 guardian6-desktop kernel: x15: ffff2d3d80220570 x14: ffffffffffffffff 
Mar 13 16:14:09 guardian6-desktop kernel: x13: ffffbdbdefea9e28 x12: ffffbdbdefea9a78 
Mar 13 16:14:09 guardian6-desktop kernel: x11: 0480000000002000 x10: 0000000000000001 
Mar 13 16:14:09 guardian6-desktop kernel: x9 : 00000000fffffffe x8 : 756f2064656d6974 
Mar 13 16:14:09 guardian6-desktop kernel: x7 : 2030206575657571 x6 : 0000000000000003 
Mar 13 16:14:09 guardian6-desktop kernel: x5 : 0000000000000000 x4 : 0000000000000000 
Mar 13 16:14:09 guardian6-desktop kernel: x3 : 0000000000000100 x2 : 0000000000000100 
Mar 13 16:14:09 guardian6-desktop kernel: x1 : 0000000000000000 x0 : 0000000000000000 
Mar 13 16:14:09 guardian6-desktop kernel: Call trace:
Mar 13 16:14:09 guardian6-desktop kernel:  dev_watchdog+0x3d4/0x3e0
Mar 13 16:14:09 guardian6-desktop kernel:  call_timer_fn+0x3c/0x200
Mar 13 16:14:09 guardian6-desktop kernel:  run_timer_softirq+0x50c/0x5e0
Mar 13 16:14:09 guardian6-desktop kernel:  __do_softirq+0x140/0x3e8
Mar 13 16:14:09 guardian6-desktop kernel:  irq_exit+0xc0/0xe0
Mar 13 16:14:09 guardian6-desktop kernel:  __handle_domain_irq+0x74/0xd0
Mar 13 16:14:09 guardian6-desktop kernel:  gic_handle_irq+0x68/0x134
Mar 13 16:14:09 guardian6-desktop kernel:  el1_irq+0xd0/0x180
Mar 13 16:14:09 guardian6-desktop kernel:  cpuidle_enter_state+0xb8/0x410
Mar 13 16:14:09 guardian6-desktop kernel:  cpuidle_enter+0x40/0x60
Mar 13 16:14:09 guardian6-desktop kernel:  call_cpuidle+0x44/0x80
Mar 13 16:14:09 guardian6-desktop kernel:  do_idle+0x208/0x270
Mar 13 16:14:09 guardian6-desktop kernel:  cpu_startup_entry+0x30/0x60
Mar 13 16:14:09 guardian6-desktop kernel:  secondary_start_kernel+0x15c/0x180
Mar 13 16:14:09 guardian6-desktop kernel: ---[ end trace 42b0a1bedc73f617 ]---
Mar 13 16:14:12 guardian6-desktop kernel: r8168: eth0: link up
Mar 13 16:47:10 guardian6-desktop kernel: r8168: eth0: link down
Mar 13 16:47:13 guardian6-desktop kernel: r8168: eth0: link up
Mar 13 17:00:51 guardian6-desktop kernel: br-dc2aa158fdc6: port 1(veth2bef8bc) entered disabled state
Mar 13 17:00:51 guardian6-desktop kernel: veth311616d: renamed from eth0
Mar 13 17:00:52 guardian6-desktop kernel: br-dc2aa158fdc6: port 1(veth2bef8bc) entered disabled state
Mar 13 17:00:52 guardian6-desktop kernel: device veth2bef8bc left promiscuous mode
Mar 13 17:00:52 guardian6-desktop kernel: br-dc2aa158fdc6: port 1(veth2bef8bc) entered disabled state
Mar 13 17:00:52 guardian6-desktop kernel: br-dc2aa158fdc6: port 1(vethb1f2f61) entered blocking state
Mar 13 17:00:52 guardian6-desktop kernel: br-dc2aa158fdc6: port 1(vethb1f2f61) entered disabled state
Mar 13 17:00:52 guardian6-desktop kernel: device vethb1f2f61 entered promiscuous mode
Mar 13 17:00:52 guardian6-desktop kernel: br-dc2aa158fdc6: port 1(vethb1f2f61) entered blocking state
Mar 13 17:00:52 guardian6-desktop kernel: br-dc2aa158fdc6: port 1(vethb1f2f61) entered forwarding state
Mar 13 17:00:52 guardian6-desktop kernel: eth0: renamed from vethc574009
Mar 13 17:00:52 guardian6-desktop kernel: IPv6: ADDRCONF(NETDEV_CHANGE): vethb1f2f61: link becomes ready
Mar 13 17:01:06 guardian6-desktop kernel: br-dc2aa158fdc6: port 1(vethb1f2f61) entered disabled state
Mar 13 17:01:06 guardian6-desktop kernel: vethc574009: renamed from eth0
Mar 13 17:01:06 guardian6-desktop kernel: br-dc2aa158fdc6: port 1(vethb1f2f61) entered disabled state
Mar 13 17:01:06 guardian6-desktop kernel: device vethb1f2f61 left promiscuous mode
Mar 13 17:01:06 guardian6-desktop kernel: br-dc2aa158fdc6: port 1(vethb1f2f61) entered disabled state
Mar 13 17:01:06 guardian6-desktop kernel: br-dc2aa158fdc6: port 1(veth7a26c7d) entered blocking state
Mar 13 17:01:06 guardian6-desktop kernel: br-dc2aa158fdc6: port 1(veth7a26c7d) entered disabled state
Mar 13 17:01:06 guardian6-desktop kernel: device veth7a26c7d entered promiscuous mode
Mar 13 17:01:06 guardian6-desktop kernel: br-dc2aa158fdc6: port 1(veth7a26c7d) entered blocking state
Mar 13 17:01:06 guardian6-desktop kernel: br-dc2aa158fdc6: port 1(veth7a26c7d) entered forwarding state
Mar 13 17:01:07 guardian6-desktop kernel: eth0: renamed from veth481d153
Mar 13 17:01:07 guardian6-desktop kernel: IPv6: ADDRCONF(NETDEV_CHANGE): veth7a26c7d: link becomes ready
Mar 13 17:58:29 guardian6-desktop kernel: r8168: eth0: link up
Mar 13 17:59:53 guardian6-desktop kernel: r8168: eth0: link down
Mar 13 17:59:56 guardian6-desktop kernel: r8168: eth0: link up

I would like help understanding what caused this and how to reduce this downtime. I suspect the following line points to the cause of the long boot time:
kernel: NETDEV WATCHDOG: eth0 (r8168): transmit queue 0 timed out
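To make the timeline easier to read, the driver-related lines can be separated from the Docker veth/bridge noise. Below is a minimal Python sketch, assuming the log has been saved as plain text in the same "Mon DD HH:MM:SS host kernel: message" layout as above. The function name `extract_nic_events` and the `driver` parameter are made up for illustration. Note that the timestamps jump from Dec 31 to Mar 13 just before the watchdog warning; this might mean the system clock was corrected at that point rather than that the time actually elapsed, but the logs alone do not confirm that.

```python
import re

# Matches journal lines of the form "Mon DD HH:MM:SS host kernel: message".
LINE_RE = re.compile(r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+(.*)$")


def extract_nic_events(lines, driver="r8168"):
    """Return (timestamp, message) pairs for driver and watchdog lines only."""
    events = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # not a kernel journal line in the expected layout
        stamp, msg = match.groups()
        # Keep link up/down lines from the driver and the TX watchdog warning;
        # veth/bridge lines from Docker do not start with the driver name.
        if msg.startswith(f"{driver}:") or "NETDEV WATCHDOG" in msg:
            events.append((stamp, msg.strip()))
    return events


# A few lines copied from the log above.
sample = [
    "Dec 31 18:01:12 guardian6-desktop kernel: r8168: eth0: link up",
    "Mar 13 16:14:09 guardian6-desktop kernel: NETDEV WATCHDOG: eth0 (r8168): transmit queue 0 timed out",
    "Mar 13 16:14:12 guardian6-desktop kernel: r8168: eth0: link up",
    "Mar 13 16:47:10 guardian6-desktop kernel: r8168: eth0: link down",
    "Mar 13 17:00:52 guardian6-desktop kernel: eth0: renamed from vethc574009",
]

for stamp, msg in extract_nic_events(sample):
    print(stamp, "|", msg)
```

Run against the full log, this should leave only the link up/down transitions and the watchdog warning, which makes it easier to see when the physical interface was actually flapping.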

How to reproduce this issue?

Hello. Help me help you reproduce this issue. All I know right now is what is in the logs I showed, and that the device was powered off by unplugging the power cord.

What can I run that might give other insightful information to help debug the error?

Hello. Are there any commands I could run that would help debug this issue?

No, the only thing you need to find out now is how to reproduce this issue reliably, and then share that method with us.

OK. I'm working on it.

In the meantime, do you think it is worth looking into the kernel logs or specific services? I got the impression that this was related to the network card. If so, I would like to dig deeper into the logs to find the cause, which would allow me to reproduce it. What module is this? And does anything come to mind regarding corrupted data after a forced power-off?

You could use the serial console to capture logs.

And the card in use is r8168/8169 series from Realtek.
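Alongside the serial console, some information can also be saved from the running system after a slow boot. The following is a rough Python sketch, not an official procedure. It assumes that `journalctl`, `systemd-analyze` and `ethtool` are installed, that the interface is named `eth0` as in the logs, and that journald keeps logs from earlier boots (otherwise `-b -1` returns nothing). The helper names `build_commands` and `collect` are hypothetical.

```python
import subprocess


def build_commands(boot_offset=-1, iface="eth0"):
    """Build the diagnostic commands; boot_offset=-1 means the previous boot."""
    return {
        # Kernel messages from the selected boot (needs a persistent journal).
        "kernel_log": ["journalctl", "-k", "-b", str(boot_offset), "--no-pager"],
        # Per-unit startup times; this covers the current boot only.
        "boot_blame": ["systemd-analyze", "--no-pager", "blame"],
        # Driver name and version bound to the interface.
        "nic_driver": ["ethtool", "-i", iface],
    }


def collect(commands, timeout=30):
    """Run each command and return its output, or a note if it could not run."""
    results = {}
    for name, cmd in commands.items():
        try:
            proc = subprocess.run(
                cmd, capture_output=True, text=True, timeout=timeout
            )
            results[name] = proc.stdout or proc.stderr
        except (FileNotFoundError, subprocess.TimeoutExpired) as exc:
            results[name] = f"could not run {cmd[0]}: {exc}"
    return results
```

Calling `collect(build_commands())` on the device returns a dictionary of text outputs that can be written to files and attached to a report. Reading the journal may require root or membership in the `systemd-journal` group.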
