As part of our product’s release, we need to test the crash(8) utility on NVIDIA products. I flashed a TX2 with R32.1 (JetPack 4.2) and built the vmlinux and System.map files from the kernel source.
When I install crash(8) and run it against those vmlinux and System.map files, I get the BUG below and crash(8) exits. This occurs only on the TX2 (and maybe the TX2i, which I didn’t test); the Xavier and Nano that I tested worked as expected.
Please let me know what could be the reason for this bug.
root@matrix2:~# uname -a
Linux matrix2 4.9.140-tegra #1 SMP PREEMPT Wed Mar 13 00:30:11 PDT 2019 aarch64 aarch64 aarch64 GNU/Linux
root@matrix2:~#
root@matrix2:~# /usr/bin/crash /boot/vmlinux-4.9.140-tegra /boot/System.map-4.9.140-tegra
crash 7.2.1
Copyright (C) 2002-2017 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "aarch64-unknown-linux-gnu"...
please wait... (patching 130894 gdb minimal_symbol values) [ 1368.350058] usercopy: kernel memory exposure attempt detected from ffffffc000f2c158 (<linear kernel text>) (3752 bytes)
[ 1368.360926] ------------[ cut here ]------------
[ 1368.365539] kernel BUG at /dvs/git/dirty/git-master_linux/kernel/kernel-4.9/mm/usercopy.c:75!
[ 1368.374072] Internal error: Oops - BUG: 0 [#2] PREEMPT SMP
[ 1368.379552] Modules linked in: overlay nvs_bmi160 nvs bcmdhd cfg80211 binfmt_misc nvgpu nfsd nfs_acl bluedroid_pm ip_tables x_tables
[ 1368.391620] CPU: 2 PID: 8263 Comm: crash Tainted: G D 4.9.140-tegra #1
[ 1368.399265] Hardware name: quill (DT)
[ 1368.402923] task: ffffffc1b9c3d400 task.stack: ffffffc1ea7bc000
[ 1368.408858] PC is at __check_object_size+0x124/0x244
[ 1368.413829] LR is at __check_object_size+0x124/0x244
[ 1368.418787] pc : [<ffffff8008257934>] lr : [<ffffff8008257934>] pstate: 40400145
[ 1368.426169] sp : ffffffc1ea7bfd10
[ 1368.429480] x29: ffffffc1ea7bfd10 x28: ffffffc1b9c3d400
[ 1368.434809] x27: ffffff8009512000 x26: 0000000000001000
[ 1368.440136] x25: 0000000000000000 x24: 0000005556b0cc50
[ 1368.445460] x23: 0000000080f2c158 x22: 0000000000000001
[ 1368.450786] x21: ffffffc000f2d000 x20: 0000000000000ea8
[ 1368.456114] x19: ffffffc000f2c158 x18: 0000000000000010
[ 1368.461439] x17: 0000000000000000 x16: 0000000000000000
[ 1368.466766] x15: ffffffffffffffff x14: 6465746365746564
[ 1368.472091] x13: 2074706d65747461 x12: 20657275736f7078
[ 1368.477415] x11: 652079726f6d656d x10: 00000000000003d8
[ 1368.482741] x9 : 3a79706f63726573 x8 : ffffff80083d2fe8
[ 1368.488066] x7 : ffffff8009e740d8 x6 : ffffffc1f67a8bf0
[ 1368.493391] x5 : ffffffc1f67a8bf0 x4 : 0000000000000000
[ 1368.498717] x3 : ffffffc1f67ac7f8 x2 : ffffffc1f67a8bf0
[ 1368.504042] x1 : ffffffc1b9c3d400 x0 : 000000000000006b
[ 1368.509367]
[ 1368.510858] Process crash (pid: 8263, stack limit = 0xffffffc1ea7bc000)
[ 1368.517461] Call trace:
[ 1368.519922] [<ffffff8008257934>] __check_object_size+0x124/0x244
[ 1368.525936] [<ffffff800870380c>] read_mem+0x8c/0x158
[ 1368.530908] [<ffffff800825aa48>] __vfs_read+0x48/0x110
[ 1368.536053] [<ffffff800825b9e4>] vfs_read+0x94/0x150
[ 1368.541019] [<ffffff800825d0d4>] SyS_read+0x54/0xb0
[ 1368.545892] [<ffffff80080838c0>] el0_svc_naked+0x34/0x38
[ 1368.551198] ---[ end trace 46120143800e1a3f ]---
Segmentation fault
root@matrix2:~#
I built one of our kernels without CONFIG_HARDENED_USERCOPY, the option that is causing this error. After running the crash(8) command with the ‘--minimal’ option:
This shows that crash(8) blows up when it tries to read read-only kernel text memory, an access the kernel thinks should not be allowed. I think the linker script arch/arm64/kernel/vmlinux.ld.S needs to be adjusted slightly so that utilities like crash(8) can work with CONFIG_HARDENED_USERCOPY enabled.
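As a quick sanity check on the numbers in the two oopses, the reported lengths can be reproduced from the faulting addresses alone. This is a minimal sketch assuming 4 KiB pages. Reading x23 as the physical offset handed to /dev/mem is my inference from the register dumps, not something the log states.

```python
PAGE_SIZE = 0x1000  # assumption: 4 KiB pages


def bytes_to_page_end(addr: int) -> int:
    """Bytes from addr up to the end of the page that contains it."""
    return PAGE_SIZE - (addr % PAGE_SIZE)


# (address reported by usercopy, length reported by usercopy, x23 from the dump)
oopses = [
    (0xFFFFFFC000F2C158, 3752, 0x80F2C158),  # first oops (stock kernel)
    (0xFFFFFFC000F5C168, 3736, 0x80F5C168),  # second oops (rebuilt kernel)
]

for virt, reported, x23 in oopses:
    chunk = bytes_to_page_end(virt)
    # A constant difference across both oopses suggests a fixed
    # linear-map offset between x23 and the reported address.
    delta = virt - x23
    print(f"{virt:#x}: to page end={chunk}, reported={reported}, virt-x23={delta:#x}")
```

In both cases the reported length equals the distance to the end of the page, which fits a read that is being served one page-bounded chunk at a time out of the linear mapping of kernel text.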
I am trying to find more info on this but this is what I have for now.
Currently, I am not compiling modules. Is that required?
I will try to compile the Image with LOCALVERSION=-tegra tomorrow and get back to you.
On a side note, is it really necessary to compile kernel with “-tegra” as part of the version name? We compile kernels without “-tegra” in the version name, but didn’t see any obvious issues either with module loading or general functionality. Do you have a list of any such modules which look for “-tegra” in the version name?
I have built the NVIDIA kernel with LOCALVERSION=“-tegra”, but crash(8) still doesn’t work. Below is the output.
root@matrix2:~# /usr/bin/crash
crash 7.2.1
Copyright (C) 2002-2017 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "aarch64-unknown-linux-gnu"...
[ 153.865597] usercopy: kernel memory exposure attempt detected from ffffffc000f5c168 (<linear kernel text>) (3736 bytes)
[ 153.876951] ------------[ cut here ]------------
[ 153.881625] kernel BUG at mm/usercopy.c:75!
[ 153.885876] Internal error: Oops - BUG: 0 [#2] PREEMPT SMP
[ 153.891418] Modules linked in: overlay bcmdhd nvs_bmi160 nvs cfg80211 binfmt_misc nvgpu bluedroid_pm nfsd nfs_acl ip_tables x_tables
[ 153.903990] CPU: 2 PID: 7416 Comm: crash Tainted: G D 4.9.140-tegra #1
[ 153.911683] Hardware name: quill (DT)
[ 153.915449] task: ffffffc1aceaaa00 task.stack: ffffffc1d0028000
[ 153.921474] PC is at __check_object_size+0x124/0x244
[ 153.926481] LR is at __check_object_size+0x124/0x244
[ 153.931499] pc : [<ffffff8008257934>] lr : [<ffffff8008257934>] pstate: 40400045
[ 153.938923] sp : ffffffc1d002bd10
[ 153.942282] x29: ffffffc1d002bd10 x28: ffffffc1aceaaa00
[ 153.947716] x27: ffffff8009503000 x26: 0000000000001000
[ 153.953138] x25: 0000000000000000 x24: 0000005557253d00
[ 153.958548] x23: 0000000080f5c168 x22: 0000000000000001
[ 153.963964] x21: ffffffc000f5d000 x20: 0000000000000e98
[ 153.969379] x19: ffffffc000f5c168 x18: 0000000000000010
[ 153.974787] x17: 0000000000000000 x16: 0000000000000000
[ 153.980199] x15: ffffffffffffffff x14: 6465746365746564
[ 153.985610] x13: 2074706d65747461 x12: 20657275736f7078
[ 153.991036] x11: 652079726f6d656d x10: 0000000000000a20
[ 153.996449] x9 : ffffffc1d002ba00 x8 : ffffffc1aceab480
[ 154.001864] x7 : 0000000000000000 x6 : ffffffc1f67a8bf0
[ 154.007279] x5 : ffffffc1f67a8bf0 x4 : 0000000000000000
[ 154.012691] x3 : ffffffc1f67ac7f8 x2 : ffffffc1f67a8bf0
[ 154.018101] x1 : ffffffc1aceaaa00 x0 : 000000000000006b
[ 154.023515]
[ 154.025058] Process crash (pid: 7416, stack limit = 0xffffffc1d0028000)
[ 154.031709] Call trace:
[ 154.034257] [<ffffff8008257934>] __check_object_size+0x124/0x244
[ 154.040368] [<ffffff800870378c>] read_mem+0x8c/0x158
[ 154.045417] [<ffffff800825aa48>] __vfs_read+0x48/0x110
[ 154.050627] [<ffffff800825b9e4>] vfs_read+0x94/0x150
[ 154.055679] [<ffffff800825d0d4>] SyS_read+0x54/0xb0
[ 154.060652] [<ffffff80080838c0>] el0_svc_naked+0x34/0x38
[ 154.066017] ---[ end trace 0c1aed55ec9fd608 ]---
Segmentation fault
root@matrix2:~#
root@matrix2:~# uname -a
Linux matrix2 4.9.140-tegra #1 SMP PREEMPT Tue Jul 16 08:25:50 EDT 2019 aarch64 aarch64 aarch64 GNU/Linux
root@matrix2:~#
The cross compiler used is gcc-7.3.1; however, I should clarify that the issue also exists when we build the kernel natively. I cross-built this kernel only to make testing easier. The native compiler on a system flashed with R32.1 is 7.4.0 (gcc version 7.4.0 (Ubuntu/Linaro 7.4.0-1ubuntu1~18.04.1)).
Can you please share a list of modules that may be affected by not having “-tegra” in the kernel version? Are these just kernel modules, or were you talking about subsystems or user-space applications as well? If so, it would be great to have a list of those.
All boards should fail if access to the kernel text region is not allowed. Even on Xavier, commands like swap and irq crash the same way as on TX2.
Somehow on TX2 the launch itself hits that issue. This still needs to be checked, though.
By the way, what is the benefit of running crash on the target? Do you want to use the Jetson to debug dumps from other systems?
Got you. We will update the kernel version going forward.
Thank you for the response.
This is the same question that’s bothering me: why does it fail only on a TX2 running the R32.1 JetPack? The Xavier and Nano work as expected regardless of how I set CONFIG_HARDENED_USERCOPY (which I assume is the cause).
We have released RedHawk on TX2s before with R28.2.x JetPacks, and crash(8) worked as expected.
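To compare boards on equal footing, it may help to dump the relevant options from each running kernel instead of relying on the build tree. This is a minimal sketch: it assumes the kernel exposes /proc/config.gz (which requires CONFIG_IKCONFIG_PROC), and the list of options to compare is my own pick, not something confirmed in this thread.

```python
import gzip
from pathlib import Path

# Options that seem relevant to /dev/mem reads; this selection is an assumption.
OPTIONS = ("CONFIG_HARDENED_USERCOPY", "CONFIG_STRICT_DEVMEM", "CONFIG_DEVMEM")


def parse_kconfig(text: str) -> dict:
    """Map option name -> value ("y", "m", "n", or a literal) from .config text."""
    values = {}
    suffix = " is not set"
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            # "# CONFIG_FOO is not set" means the option is disabled.
            values[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, _, value = line.partition("=")
            values[name] = value
    return values


def running_kernel_config(path: str = "/proc/config.gz") -> dict:
    """Read and parse the running kernel's configuration."""
    raw = gzip.decompress(Path(path).read_bytes())
    return parse_kconfig(raw.decode())
```

Running `{opt: running_kernel_config().get(opt, "absent") for opt in OPTIONS}` on the TX2, Xavier, and Nano would show whether the three kernels really differ in these settings.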
Well, since we release kernels built on top of NVIDIA’s L4T kernel, we ship a slightly extended version of the crash(8) utility, both to analyze kernel or process crashes and to debug the live kernel itself on a Jetson. Since we have traditionally shipped crash(8) with our kernels, we expect to ship it with our next release on TX2/TX2i as well.
Hi,
Please follow the steps below to get the crash tool working on TX2 and Xavier.
1. Apply the upstream patch. This fixes the crash seen on both SoCs.
[PATCH] /dev/mem: Add bounce buffer for copy-out: https://lkml.org/lkml/2017/12/1/792
The following line from the patch causes a compilation error and can be ignored:
“+ imply STRICT_DEVMEM”
2. Additional step for TX2. This resolves the “could not find MAGIC_START!” error.
In the bootloader (cboot), disable the CONFIG_DYNAMIC_LOAD_ADDRESS feature, which enables a dynamic kernel image load address.
File: cboot/platform/t186/l4t.mk
Change:
- CONFIG_DYNAMIC_LOAD_ADDRESS=1 (delete this line, or set the macro to zero, to disable the feature)
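For readers wondering why the upstream patch helps, here is a toy model of the idea as I understand it: the hardened usercopy check rejects a copy to user space whose source lies in kernel text, so the read path first copies the data into an ordinary buffer and copies out from there. Everything in this sketch (address ranges, function names, the exception) is hypothetical and only illustrates the concept; it is not the kernel code.

```python
# Toy model only: all addresses and names below are illustrative.
TEXT_START, TEXT_END = 0x1000, 0x9000  # hypothetical kernel-text range
BOUNCE = 0xA000                        # hypothetical heap page outside that range

memory = bytearray(0x10000)            # pretend kernel address space
memory[0x2158:0x2160] = b"ktext..."    # 8 bytes of "kernel text"


class UsercopyViolation(Exception):
    """Stands in for the BUG raised by the hardened usercopy check."""


def check_object_size(addr: int, n: int) -> None:
    """Reject copies whose source overlaps the protected text range."""
    if addr < TEXT_END and addr + n > TEXT_START:
        raise UsercopyViolation(f"exposure attempt from {addr:#x} ({n} bytes)")


def copy_to_user(addr: int, n: int) -> bytes:
    """Copy out to 'user space', subject to the hardened check."""
    check_object_size(addr, n)
    return bytes(memory[addr:addr + n])


def read_mem_direct(addr: int, n: int) -> bytes:
    """Copy straight from the requested address: trips the check on text."""
    return copy_to_user(addr, n)


def read_mem_bounced(addr: int, n: int) -> bytes:
    """Copy into the bounce page first, then copy out from the bounce page."""
    memory[BOUNCE:BOUNCE + n] = memory[addr:addr + n]  # kernel-internal copy
    return copy_to_user(BOUNCE, n)
```

In this model `read_mem_direct(0x2158, 8)` raises `UsercopyViolation`, while `read_mem_bounced(0x2158, 8)` returns the same bytes without tripping the check, which mirrors the difference between the oops above and the behavior after patching.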
I applied the patch from Step 1; it applied cleanly and the kernel compiled successfully.
However, I can’t find cboot/platform/t186/l4t.mk in the cboot_src_t19x.tbz2 tarball. This is the only cboot_src tarball in the public sources on the download center: https://developer.nvidia.com/embedded/dlc/public_sources_AGX
The l4t.mk file in this cboot_src tarball is for t194 (Xavier). Can you please share the cboot source directory for TX2?
Sorry for such a late reply, but I finally got a chance to look at this issue again. It looks like your first suggestion is already part of L4T R32.3.1.
I made the second change in the cboot code, and now the crash(8) utility works as expected.
However, I am wondering whether building cboot without “CONFIG_DYNAMIC_LOAD_ADDRESS=1” will have any side effects.