I’m using an AGX Xavier with L4T 32.6.1 and am hitting a kernel panic that involves the eqos Ethernet driver (eth0). A GigE camera is connected to this interface and configured to send jumbo packets with a 4000-byte packet size. The panic occurs after a random amount of time, and I can only sometimes capture the dmesg output before the system reboots (kern.log and syslog don’t contain the error, probably because the system reboots before the log writes are flushed to disk). The output I was able to capture is below.
[ 182.428773] ------------[ cut here ]------------
[ 182.428786] kernel BUG at /root/trunk_t186_t194_32.6.1/Linux_for_Tegra/sources/kernel/kernel-4.9/mm/slub.c:3919!
[ 182.429008] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[ 182.429132] Modules linked in: xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink xt_addrtype iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack br_netfilter can_raw can mttcan can_dev overlay zram userspace_alert nvgpu nfsd nfs_acl ip_tables x_tables
[ 182.429895] CPU: 2 PID: 8660 Comm: arv_gv_stream Not tainted 4.9.253-tegra #1
[ 182.430061] Hardware name: jetson-xavier (DT)
[ 182.430116] ------------[ cut here ]------------
[ 182.430134] WARNING: CPU: 0 PID: 3 at /root/trunk_t186_t194_32.6.1/Linux_for_Tegra/sources/kernel/nvidia/drivers/net/ethernet/nvidia/eqos/desc.c:387 desc_alloc_skb.isra.6+0x13c/0x1c8
[ 182.430182] Modules linked in: xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink xt_addrtype iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack br_netfilter can_raw can mttcan can_dev overlay zram userspace_alert nvgpu nfsd nfs_acl ip_tables x_tables
[ 182.430197] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 4.9.253-tegra #1
[ 182.430199] Hardware name: jetson-xavier (DT)
[ 182.430208] task: ffffffc7dc771c00 task.stack: ffffffc7dbc14000
[ 182.430219] PC is at desc_alloc_skb.isra.6+0x13c/0x1c8
[ 182.430222] LR is at eqos_re_alloc_skb+0x68/0x108
[ 182.430225] pc : [<ffffff800891e4e4>] lr : [<ffffff800891e938>] pstate: 20c00045
[ 182.430227] sp : ffffffc7dbc17b60
[ 182.430242] x29: ffffffc7dbc17b60 x28: ffffffc7d7b08900
[ 182.430247] x27: ffffffc7d7b0c000 x26: ffffffc7c7fb4900
[ 182.430252] x25: 0000000002080020 x24: 0000000000000000
[ 182.430257] x23: 000000005dd02042 x22: ffffffc7c7fb4848
[ 182.430261] x21: ffffffc7c7fb4840 x20: ffffffc7d7b08900
[ 182.430266] x19: ffffffc7ccb89c00 x18: 0000000000000400
[ 182.430270] x17: 0000000000000002 x16: 0000000000000003
[ 182.430275] x15: ffffffc7db23f028 x14: 0000000000000001
[ 182.430279] x13: 0000000000000000 x12: 0000000000544846
[ 182.430284] x11: ffffff80091b45f0 x10: ffffff8009873118
[ 182.430294] x9 : 000000005df94000 x8 : 0000000000000001
[ 182.430298] x7 : 0000000000757dbc x6 : 0000000000000000
[ 182.430302] x5 : 0000000000000000 x4 : 0000000000000000
[ 182.430307] x3 : 0000000002080020 x2 : ffffffc7c7fb4848
[ 182.430311] x1 : ffffffc7c7fb4840 x0 : 0000000000000f84
[ 182.430314] ---[ end trace c0853cce0ae8af66 ]---
[ 182.430317] Call trace:
[ 182.430325] [<ffffff800891e4e4>] desc_alloc_skb.isra.6+0x13c/0x1c8
[ 182.430330] [<ffffff800891e938>] eqos_re_alloc_skb+0x68/0x108
[ 182.430334] [<ffffff8008919974>] eqos_napi_poll_rx+0x2dc/0x4f8
[ 182.430351] [<ffffff8008d989e4>] net_rx_action+0xf4/0x358
[ 182.430360] [<ffffff8008081054>] __do_softirq+0x13c/0x3b0
[ 182.430365] [<ffffff80080b9db0>] run_ksoftirqd+0x48/0x58
[ 182.430370] [<ffffff80080dfa38>] smpboot_thread_fn+0x160/0x248
[ 182.430374] [<ffffff80080db09c>] kthread+0xec/0xf0
[ 182.430377] [<ffffff80080838a0>] ret_from_fork+0x10/0x30
[ 182.437481] ------------[ cut here ]------------
[ 182.437488] kernel BUG at /root/trunk_t186_t194_32.6.1/Linux_for_Tegra/sources/kernel/kernel-4.9/net/core/skbuff.c:1444!
[ 182.637287] task: ffffffc78ddeb800 task.stack: ffffffc7cb6b0000
[ 182.643149] PC is at kfree+0x254/0x2a8
[ 182.646994] LR is at skb_free_head+0x28/0x48
[ 182.651197] pc : [<ffffff8008232e4c>] lr : [<ffffff8008d7e6b0>] pstate: 40400145
[ 182.658633] sp : ffffffc7cb6b3b70
[ 182.661958] x29: ffffffc7cb6b3b70 x28: ffffffc634925f00
[ 182.667903] x27: 0000000000000f84 x26: 0000000000000f84
[ 182.673501] x25: 0000000000000000 x24: 0000000000000000
[ 182.679103] x23: 0000000000000040 x22: ffffffc6122aa000
[ 182.684701] x21: ffffffc634925f00 x20: ffffff8008d7e6b0
[ 182.690049] x19: ffffffbf1848aa80 x18: 000000000000032d
[ 182.695990] x17: 0000007f48c2b030 x16: ffffff8008d775d8
[ 182.701764] x15: 00001106f0000000 x14: 60ee826b07835afb
[ 182.707452] x13: 8065e9946475746f x12: 8185712f775f6d80
[ 182.712964] x11: 59be9165b1786680 x10: 8169327c681d8069
[ 182.718565] x9 : a18c67d27c682878 x8 : 68ea7c65c47e6a62
[ 182.724340] x7 : 000000000007d653 x6 : 0000007e60001aa4
[ 182.729853] x5 : 0000007e60001aa4 x4 : 0000000000000004
[ 182.734943] x3 : ffffffc6122aa000 x2 : 0000000000001ec0
[ 182.740276] x1 : 0000000000000000 x0 : 0000000000000000
[ 182.747023] Process arv_gv_stream (pid: 8660, stack limit = 0xffffffc7cb6b0000)
[ 182.754179] Call trace:
[ 182.756547] [<ffffff8008232e4c>] kfree+0x254/0x2a8
[ 182.761097] [<ffffff8008d7e6b0>] skb_free_head+0x28/0x48
[ 182.765911] [<ffffff8008d7efc8>] skb_release_data+0x100/0x130
[ 182.771247] [<ffffff8008d7f028>] skb_release_all+0x30/0x40
[ 182.776322] [<ffffff8008d7f058>] __kfree_skb+0x20/0x38
[ 182.781395] [<ffffff8008d874c0>] __skb_free_datagram_locked+0x90/0x118
[ 182.787260] [<ffffff8008e18314>] udp_recvmsg+0x354/0x630
[ 182.792337] [<ffffff8008e25724>] inet_recvmsg+0xb4/0xd8
[ 182.797407] [<ffffff8008d74a30>] sock_recvmsg+0x58/0x68
[ 182.802223] [<ffffff8008d77680>] SyS_recvfrom+0xa8/0x120
[ 182.807300] [<ffffff8008083900>] el0_svc_naked+0x34/0x38
[ 182.812465] ---[ end trace c0853cce0ae8af67 ]---
[ 182.825242] Internal error: Oops - BUG: 0 [#2] PREEMPT SMP
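Since kern.log and syslog lose the panic output, one option is to stream kernel messages to a file and force every line to disk as it arrives. The following is only a minimal sketch, assuming Python 3 is available on the device; the function name, script name, and output path are placeholders of my own, and there is no guarantee the last lines before a hard panic make it to storage.

```python
import os
import sys


def persist_lines(lines, out_path):
    """Append each line to out_path and sync it to disk before reading the next.

    Returns the number of lines written.
    """
    count = 0
    # Append mode, so captures from earlier boots are kept.
    with open(out_path, "a", encoding="utf-8", errors="replace") as out:
        for line in lines:
            out.write(line if line.endswith("\n") else line + "\n")
            out.flush()             # push Python's buffer to the kernel
            os.fsync(out.fileno())  # ask the kernel to write it to storage
            count += 1
    return count


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage (script name and log path are placeholders):
    #   dmesg --follow | sudo python3 persist_kmsg.py /var/log/kmsg_persist.log
    persist_lines(sys.stdin, sys.argv[1])
```

Calling fsync on every line is slow, but kernel log volume is normally low enough that this should not matter; a serial console or netconsole would likely be more reliable if one is available.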
Interestingly, reducing the camera's packet size to a non-jumbo value (I tried 1250 bytes) makes the kernel panic occur more consistently, usually within a minute or two. The output captured with that setting is below.
[ 241.266669] ------------[ cut here ]------------
[ 241.266678] kernel BUG at /root/trunk_t186_t194_32.6.1/Linux_for_Tegra/sources/kernel/kernel-4.9/mm/slub.c:3919!
[ 241.266878] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[ 241.266978] Modules linked in: can_raw can mttcan can_dev xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink xt_addrtype iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack br_netfilter zram overlay userspace_alert nvgpu nfsd nfs_acl ip_tables x_tables
[ 241.267611] CPU: 3 PID: 10337 Comm: arv_gv_stream Not tainted 4.9.253-tegra #1
[ 241.267730] Hardware name: jetson-xavier (DT)
[ 241.267809] task: ffffffc79a47f000 task.stack: ffffffc79a554000
[ 241.267914] PC is at kfree+0x254/0x2a8
[ 241.267985] LR is at skb_free_head+0x28/0x48
[ 241.268195] pc : [<ffffff8008232e4c>] lr : [<ffffff8008d7e6b0>] pstate: 40400145
[ 241.268758] sp : ffffffc79a557b70
[ 241.269021] x29: ffffffc79a557b70 x28: ffffffc7c0647c00
[ 241.269458] x27: 00000000000004c6 x26: 00000000000004c6
[ 241.269889] x25: 0000000000000000 x24: 0000000000000000
[ 241.274719] x23: 0000000000000040
[ 241.275661] cache_from_obj: Wrong slab cache. kmalloc-256 but object is from UDP
[ 241.275678] ------------[ cut here ]------------
[ 241.275702] WARNING: CPU: 1 PID: 10323 at /root/trunk_t186_t194_32.6.1/Linux_for_Tegra/sources/kernel/kernel-4.9/mm/slab.h:354 kmem_cache_free+0x1cc/0x2e0
[ 241.275704] Modules linked in:
[ 241.275713] can_raw
[ 241.275717] can
[ 241.275719] mttcan
[ 241.275725] can_dev
[ 241.275727] xt_conntrack
[ 241.275728] ipt_MASQUERADE
[ 241.275731] nf_nat_masquerade_ipv4
[ 241.275732] nf_conntrack_netlink
[ 241.275734] nfnetlink
[ 241.275736] xt_addrtype
[ 241.275737] iptable_filter
[ 241.275739] iptable_nat
[ 241.275740] nf_conntrack_ipv4
[ 241.275742] nf_defrag_ipv4
[ 241.275743] nf_nat_ipv4
[ 241.275744] nf_nat
[ 241.275746] nf_conntrack
[ 241.275748] br_netfilter
[ 241.275749] zram
[ 241.275751] overlay
[ 241.275753] userspace_alert
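One detail that may help when reading the two kfree traces: x26/x27 hold 0xf84 in the 4000-byte run and 0x4c6 in the 1250-byte run. The sketch below checks those values against the configured packet sizes, assuming (this is a guess, not something the logs state) that the registers hold a UDP payload length and that the camera's packet size counts a 20-byte IPv4 header and an 8-byte UDP header.

```python
IPV4_HEADER_BYTES = 20  # assumption: IPv4 header without options
UDP_HEADER_BYTES = 8


def udp_payload_bytes(ip_packet_bytes):
    """UDP payload size for a given IP packet size, under the assumptions above."""
    return ip_packet_bytes - IPV4_HEADER_BYTES - UDP_HEADER_BYTES


# Configured camera packet size -> value seen in x26/x27 of the kfree trace.
observed = {4000: 0xF84, 1250: 0x4C6}

for packet_size, register_value in observed.items():
    matches = udp_payload_bytes(packet_size) == register_value
    print(packet_size, register_value, matches)
# prints:
# 4000 3972 True
# 1250 1222 True
```

If that reading is right, the skb being freed in each trace is one of the camera's stream packets, which fits the arv_gv_stream process and the udp_recvmsg path in the call trace.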
Any tips or help would be appreciated.