Can't shut down system via power button after several power cycles

Hello,

We are currently facing a problem: after several power cycles on our carrier board, we can’t shut down the Ubuntu system via the power button. Our steps to reproduce are below.

1. Create a script that records the number of power-on cycles, and launch a terminal on login to show that number, using the steps below:

(1) Press the Super key (windows key).
(2) Type "Startup Applications"
(3) Click on the Startup Applications option
(4) Click "Add"
(5) In the "name" field, type "on off test"
(6) In the "command" field, type gnome-terminal --geometry=80x24+100+100 -e "<ON_OFF_SCRIPT_PATH>/on_off.sh"
(7) Click "Add"

2. Enable automatic login.
3. Issue the command below to change the power button behavior from ‘suspend’ to ‘shutdown’:

$ gsettings set org.gnome.settings-daemon.plugins.power button-power 'shutdown'

4. Shut down the device.
=> Start the on/off test here.
5. Press the power button to power on the device. After logging in to the Ubuntu desktop, you’ll see that gnome-terminal is launched and shows the ON count.
6. Wait around 1 minute, then press the power button to shut down the system.
7. Repeat the power on/off cycle via the power button.
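
The on_off.sh script from step 1 is not shown in this post. A minimal sketch of what such a counter could look like, assuming the count is kept in a plain text file (the COUNT_FILE path, the function name record_power_on, and the output wording are hypothetical, not taken from our actual script):

```shell
#!/bin/bash
# Hypothetical sketch of on_off.sh (the real script is not shown in this post).
# It counts how many times the desktop session has started and prints the number.

# Assumption: the count is kept in a plain text file under $HOME.
COUNT_FILE="${COUNT_FILE:-$HOME/on_off_count.txt}"

record_power_on() {
    local prev count=0
    # Reuse the stored value only if the file holds a plain integer.
    if [[ -f "$COUNT_FILE" ]]; then
        prev=$(<"$COUNT_FILE")
        if [[ "$prev" =~ ^[0-9]+$ ]]; then
            count=$((10#$prev))
        fi
    fi
    count=$((count + 1))
    echo "$count" > "$COUNT_FILE"
    sync    # flush to disk before the next power-off
    echo "ON times: $count"
}

# Run only when executed as a script, not when sourced.
if [[ "${BASH_SOURCE[0]}" == "$0" ]]; then
    record_power_on
    # Keep the gnome-terminal window open so the number stays visible.
    if [[ -t 0 ]]; then
        read -r -p "Press Enter to close..."
    fi
fi
```

The sync call is there because the test cuts power shortly after login, so the new count should reach the disk before the next cycle.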

After several repetitions, we can no longer shut down the Ubuntu system via the power button, even though systemd-logind still detects that the power key is pressed.

[Note]: Even if we skip step 3 and keep the power button’s default behavior (i.e. “suspend”), the issue still occurs.

[Read from journal log]

...
systemd-logind[4694]: Power key pressed.
...
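
To check how many presses logind actually saw, the journal can be filtered for the message above. A minimal sketch, where the helper name count_power_key_events is made up for illustration; it reads log text from stdin, so it works with journalctl output or with a saved log file:

```shell
#!/bin/bash
# Count "Power key pressed." messages from systemd-logind in log text on stdin.
count_power_key_events() {
    # grep -c prints 0 and returns non-zero when nothing matches,
    # so "|| true" keeps the exit status clean in that case.
    grep -c 'systemd-logind\[[0-9]*\]: Power key pressed\.' || true
}

# Typical use on the target (current boot only):
#   journalctl -b -u systemd-logind --no-pager | count_power_key_events
```

Comparing this count with the number of presses the fixture generated would show whether any key events were missed before they reached logind.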

Below is the procedure I used to build the image based on JetPack 4.5.1:

  1. Download “Tegra186_Linux_R32.5.1_aarch64.tbz2” & “Tegra_Linux_Sample-Root-Filesystem_R32.5.1_aarch64.tbz2” and decompress them
$ sudo tar xvf Tegra186_Linux_R32.5.1_aarch64.tbz2 Linux_for_Tegra/
$ sudo tar xvf Tegra_Linux_Sample-Root-Filesystem_R32.5.1_aarch64.tbz2 -C ./Linux_for_Tegra/rootfs
  2. Change directory to Linux_for_Tegra and issue the command below
$ sudo ./apply_binaries.sh
  3. Build our customized kernel, then copy the kernel image and modules to the target rootfs. Our changes are below:

(1) [Add “INSTALL_MOD_STRIP=1” when installing modules]

make ARCH=arm64 O=${TEGRA_KERNEL_OUT} modules_install INSTALL_MOD_STRIP=1 INSTALL_MOD_PATH=${TEGRA_MODULES_OUT}

(2) [Makefile change]: Change from “--strip-debug” to “--strip-unneeded”

ifeq ($(INSTALL_MOD_STRIP),1)
mod_strip_cmd = $(STRIP) --strip-unneeded
else
mod_strip_cmd = $(STRIP) $(INSTALL_MOD_STRIP)
endif # INSTALL_MOD_STRIP=1

  4. Flash image

Currently I have no idea how to solve this. Is there any problem with how I built the image? Thanks.

hello nelly.wu,

those steps look right. did you download the r32.5.1 L4T Driver Package (BSP) Sources to build your customized kernel?
how about using software commands? are you able to power off the system with $ sudo shutdown -h now

Hello JerryChang,

1. Yes, our customized kernel is based on it. However, I noticed that the size of “nvgpu.ko” differs between “nvidia-l4t-kernel_4.9.201-tegra-32.5.1-20210219084708_arm64.deb” and our build, and I’m not sure whether that is expected.

[Extract from nvidia-l4t-kernel_4.9.201-tegra-32.5.1-20210219084708_arm64.deb]
-rw-r--r-- 1 root root 2411672 Feb 20  2021 nvgpu.ko

[Our build]
-rw-r--r-- 1 root root 2416280 Oct 13 10:58 nvgpu.ko

2. Since we developed a fixture that triggers the power button press event for the on/off test, we can’t use a software command instead.

3. When the issue occurs, an internal error dialog pops up (see the crash files below).
nvidia@nvidia-desktop:~$ ls -al /var/crash/
total 5468
drwxrwsrwt  2 root   whoopsie    4096 Oct 13 09:49 .
drwxr-xr-x 15 root   root        4096 Oct 12 10:32 ..
-rw-r-----  1 nvidia whoopsie 2323239 Oct 13 09:53 _usr_bin_compiz.1000.crash
-rw-r-----  1 nvidia whoopsie 2922826 Oct 13 09:56 _usr_lib_unity-settings-daemon_unity-settings-daemon.1000.crash
-rw-r-----  1 root   whoopsie  336022 Oct 13 10:01 _usr_sbin_nvphsd.0.crash

Thanks.

hello nelly.wu,

could you please upload those log files as attachments for reference?

Hi JerryChang,

Please refer to the attachments, thanks.

_usr_bin_compiz.1000.crash (2.2 MB)
_usr_lib_unity-settings-daemon_unity-settings-daemon.1000.crash (2.8 MB)
_usr_sbin_nvphsd.0.crash (328.1 KB)

hello nelly.wu,

they don’t show many clues about the root cause.
could you please run $ dmesg --follow to capture the kernel logs when the issue happens?
thanks

Hi JerryChang,

Yesterday I set up a Xavier NX Developer Kit (with JetPack 4.5.1, flashed using SDK Manager) alongside our board to reproduce the issue again. So far our board keeps running, but the developer kit stopped after 356 cycles. Anyway, I have attached the dmesg log (with --follow enabled), crash log, and syslog from the developer kit for your reference first. Thanks.

dbg_log.tar.gz (24.0 MB)

hello nelly.wu,

please try to force-stop the nvphs service when the issue happens, i.e. $ sudo service nvphs stop.
please check whether this works around your issue.

Hi JerryChang,

I stopped it just now, but the issue is still there. Thanks.

hello nelly.wu,

may I also know what the kernel reports if you issue software commands to shut down the system?

Hi JerryChang,

I issued the command “sudo shutdown -h now” to shut down the system; however, it seems that nothing is reported by the kernel. Below is what I checked. Is it what you need? Thanks.

[check last shutdown time]

nvidia@nvidia-desktop:~$ last -x shutdown
shutdown system down  4.9.201-tegra    Fri Oct 15 13:22 - 13:23  (00:00)
shutdown system down  4.9.201-tegra    Fri Oct 15 13:19 - 13:20  (00:00)
...
...

[check /var/log/kern.log]

Oct 15 13:20:20 nvidia-desktop kernel: [   13.140643] tegradc 15200000.nvdisplay: unblank
Oct 15 13:20:20 nvidia-desktop kernel: [   13.140658] tegradc 15210000.nvdisplay: blank - powerdown
Oct 15 13:20:41 nvidia-desktop kernel: [   34.134208] vdd-sdmmc3-sw: disabling
Oct 15 13:20:41 nvidia-desktop kernel: [   34.134221] vdd-1v8-sd: disabling
Oct 15 13:20:41 nvidia-desktop kernel: [   34.134226] vdd-1v8-cvb: disabling
Oct 15 13:20:41 nvidia-desktop kernel: [   34.134231] vdd-epb-1v0: disabling
Oct 15 13:20:41 nvidia-desktop kernel: [   34.134235] avdd-cam-2v8: disabling
Oct 15 13:20:41 nvidia-desktop kernel: [   34.134240] vdd-fan: disabling
Oct 15 13:20:41 nvidia-desktop kernel: [   34.134245] vdd_sys_en: disabling
Oct 15 13:23:09 nvidia-desktop kernel: [    0.000000] Booting Linux on physical CPU 0x0000
...
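
The point where each boot starts can be picked out of kern.log by looking for the first kernel line of a boot, which carries timestamp 0.000000 as in the excerpt above. A minimal sketch, where the helper name list_boots is hypothetical; it reads log text from stdin and assumes the syslog-style "Mon DD HH:MM:SS" prefix shown here:

```shell
#!/bin/bash
# Print the wall-clock time of every kernel boot found in log text on stdin.
list_boots() {
    # Fixed-string match on the first line the kernel prints at boot,
    # then keep only the syslog date and time fields.
    grep -F '0.000000] Booting Linux on physical CPU' | awk '{print $1, $2, $3}'
}

# Typical use on the target:
#   list_boots < /var/log/kern.log
```

Putting these times next to the output of last -x shutdown makes it easier to see whether a given boot followed a recorded shutdown.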

There is no update from you for a period, assuming this is not an issue any more.
Hence we are closing this topic. If need further support, please open a new one.
Thanks

hello nelly.wu,

doesn’t the log here show that the system has rebooted?