Failed to Load (Boot Hangs) GUI after Apt-Get Upgrade

Hi,

I encountered a GUI loading issue after running:

sudo apt-get upgrade
sudo reboot

I went through most of the related posts but couldn’t solve the issue. I can still SSH into the device.

The display shows that the boot hangs at “Started Update UTMP about System Runlevel Changes”.

I looked at Xorg.0.log and the output of systemctl status gdm3.service. Here are the outputs:

Xorg.0.log:

[    22.909] (--) Log file renamed from "/var/log/Xorg.pid-8510.log" to "/var/log/Xorg.0.log"
[    22.911]
X.Org X Server 1.19.6
Release Date: 2017-12-20
[    22.911] X Protocol Version 11, Revision 0
[    22.911] Build Operating System: Linux 4.15.0-163-generic aarch64 Ubuntu
[    22.911] Current Operating System: Linux 00-Raiser 4.9.140-tegra #1 SMP PREEMPT Wed Apr 8 18:15:20 PDT 2020 aarch64
[    22.911] Kernel command line: root=/dev/mmcblk0p1 rw rootwait rootfstype=ext4 console=ttyTCU0,115200n8 console=tty0 fbcon=map:0 net.ifnames=0 rootfstype=ext4 video=tegrafb no_console_suspend=1 earlycon=tegra_comb_uart,mmio32,0x0c168000 gpt usbcore.old_scheme_first=1 tegraid=19.1.2.0.0 maxcpus=8 boot.slot_suffix= boot.ratchetvalues=0.4.2 vpr_resize sdhci_tegra.en_boot_part_access=1
[    22.911] Build Date: 14 December 2021  04:14:51PM
[    22.911] xorg-server 2:1.19.6-1ubuntu4.10 (For technical support please see http://www.ubuntu.com/support)
[    22.911] Current version of pixman: 0.34.0
[    22.911]    Before reporting problems, check http://wiki.x.org
        to make sure that you have the latest version.
[    22.911] Markers: (--) probed, (**) from config file, (==) default setting,
        (++) from command line, (!!) notice, (II) informational,
        (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
[    22.912] (==) Log file: "/var/log/Xorg.0.log", Time: Tue Feb  8 14:43:48 2022
[    22.912] (==) Using config file: "/etc/X11/xorg.conf"
[    22.912] (==) Using system config directory "/usr/share/X11/xorg.conf.d"
[    22.913] (==) No Layout section.  Using the first Screen section.
[    22.913] (**) |-->Screen "Default Screen" (0)
[    22.913] (**) |   |-->Monitor "<default monitor>"
[    22.916] (**) |   |-->Device "Tegra0"
[    22.916] (==) No monitor specified for screen "Default Screen".
        Using a default monitor configuration.
[    22.916] (==) Automatically adding devices
[    22.916] (==) Automatically enabling devices
[    22.916] (==) Automatically adding GPU devices
[    22.916] (==) Automatically binding GPU devices
[    22.916] (==) Max clients allowed: 256, resource mask: 0x1fffff
[    22.916] (WW) The directory "/usr/share/fonts/X11/100dpi/" does not exist.
[    22.916]    Entry deleted from font path.
[    22.916] (WW) The directory "/usr/share/fonts/X11/75dpi/" does not exist.
[    22.916]    Entry deleted from font path.
[    22.916] (WW) The directory "/usr/share/fonts/X11/100dpi" does not exist.
[    22.916]    Entry deleted from font path.
[    22.916] (WW) The directory "/usr/share/fonts/X11/75dpi" does not exist.
[    22.916]    Entry deleted from font path.
[    22.917] (==) FontPath set to:
        /usr/share/fonts/X11/misc,
        /usr/share/fonts/X11/cyrillic,
        /usr/share/fonts/X11/Type1,
        built-ins
[    22.917] (==) ModulePath set to "/usr/lib/xorg/modules"
[    22.917] (II) The server relies on udev to provide the list of input devices.
        If no devices become available, reconfigure udev or disable AutoAddDevices.
[    22.917] (II) Loader magic: 0x5558562010
[    22.917] (II) Module ABI versions:
[    22.917]    X.Org ANSI C Emulation: 0.4
[    22.917]    X.Org Video Driver: 23.0
[    22.917]    X.Org XInput driver : 24.1
[    22.917]    X.Org Server Extension : 10.0
[    22.919] (++) using VT number 1

[    22.928] (II) systemd-logind: took control of session /org/freedesktop/login1/session/c2
[    22.932] (II) no primary bus or device found
[    22.932] (WW) "dri" will not be loaded unless you've specified it to be loaded elsewhere.
[    22.932] (II) "glx" will be loaded by default.
[    22.932] (II) LoadModule: "extmod"
[    22.933] (II) Module "extmod" already built-in
[    22.933] (II) LoadModule: "glx"
[    22.933] (II) Loading /usr/lib/xorg/modules/extensions/libglx.so
[    22.936] (II) Module glx: vendor="X.Org Foundation"
[    22.936]    compiled for 1.19.6, module version = 1.0.0
[    22.936]    ABI class: X.Org Server Extension, version 10.0
[    22.936] (II) LoadModule: "nvidia"
[    22.937] (II) Loading /usr/lib/xorg/modules/drivers/nvidia_drv.so
[    22.938] (II) Module nvidia: vendor="NVIDIA Corporation"
[    22.938]    compiled for 4.0.2, module version = 1.0.0
[    22.938]    Module class: X.Org Video Driver
[    22.938] (II) NVIDIA dlloader X Driver  32.4.4  Release Build  (integ_stage_rel)  (buildbrain@mobile-u64-4294)  Tue Oct 27 21:02:20 PDT 2020
[    22.938] (II) NVIDIA Unified Driver for all Supported NVIDIA GPUs
[    22.939] (WW) Falling back to old probe method for NVIDIA
[    22.939] (II) Loading sub module "fb"
[    22.939] (II) LoadModule: "fb"
[    22.939] (II) Loading /usr/lib/xorg/modules/libfb.so
[    22.940] (II) Module fb: vendor="X.Org Foundation"
[    22.940]    compiled for 1.19.6, module version = 1.0.0
[    22.940]    ABI class: X.Org ANSI C Emulation, version 0.4
[    22.940] (II) Loading sub module "wfb"
[    22.940] (II) LoadModule: "wfb"
[    22.940] (II) Loading /usr/lib/xorg/modules/libwfb.so
[    22.941] (II) Module wfb: vendor="X.Org Foundation"
[    22.941]    compiled for 1.19.6, module version = 1.0.0
[    22.941]    ABI class: X.Org ANSI C Emulation, version 0.4
[    22.941] (II) Loading sub module "ramdac"
[    22.941] (II) LoadModule: "ramdac"
[    22.941] (II) Module "ramdac" already built-in
[    22.943] (WW) VGA arbiter: cannot open kernel arbiter, no multi-card support
[    22.943] (==) NVIDIA(0): Depth 24, (==) framebuffer bpp 32
[    22.943] (==) NVIDIA(0): RGB weight 888
[    22.943] (==) NVIDIA(0): Default visual is TrueColor
[    22.943] (==) NVIDIA(0): Using gamma correction (1.0, 1.0, 1.0)
[    22.943] (DB) xf86MergeOutputClassOptions unsupported bus type 0
[    22.943] (**) NVIDIA(0): Option "AllowEmptyInitialConfiguration" "true"
[    22.944] (**) NVIDIA(0): Enabling 2D acceleration
[    22.944] (II) Loading sub module "glxserver_nvidia"
[    22.944] (II) LoadModule: "glxserver_nvidia"
[    22.944] (II) Loading /usr/lib/xorg/modules/extensions/libglxserver_nvidia.so
[    22.953] (II) Module glxserver_nvidia: vendor="NVIDIA Corporation"
[    22.953]    compiled for 4.0.2, module version = 1.0.0
[    22.954]    Module class: X.Org Server Extension
[    22.954] (II) NVIDIA GLX Module  32.4.4  Release Build  (integ_stage_rel)  (buildbrain@mobile-u64-4294)  Tue Oct 27 21:00:13 PDT 2020
[    22.958] (--) NVIDIA(0): Valid display device(s) on GPU-0 at SoC
[    22.958] (--) NVIDIA(0):     DFP-0
[    22.958] (--) NVIDIA(0):     DFP-1
[    22.958] (--) NVIDIA(0):     DFP-2
[    22.958] (II) NVIDIA(0): NVIDIA GPU Xavier (nvgpu) (GV11B) at SoC (GPU-0)
[    22.958] (--) NVIDIA(0): Memory: 32692996 kBytes
[    22.958] (--) NVIDIA(0): VideoBIOS:
[    22.959] (==) NVIDIA(0):
[    22.959] (==) NVIDIA(0): No modes were requested; the default mode "nvidia-auto-select"
[    22.959] (==) NVIDIA(0):     will be used as the requested mode.
[    22.959] (==) NVIDIA(0):
[    22.959] (--) NVIDIA(0): No enabled display devices found; starting anyway because
[    22.959] (--) NVIDIA(0):     AllowEmptyInitialConfiguration is enabled
[    22.959] (II) NVIDIA(0): Validated MetaModes:
[    22.959] (II) NVIDIA(0):     "NULL"
[    22.959] (**) NVIDIA(0): Virtual screen size configured to be 1280 x 720
[    22.959] (WW) NVIDIA(0): Unable to get display device for DPI computation.
[    22.959] (==) NVIDIA(0): DPI set to (75, 75); computed from built-in default
[    22.959] (--) Depth 24 pixmap format is 32 bpp
[    22.960] (II) NVIDIA: Reserving 24576.00 MB of virtual memory for indirect memory
[    22.960] (II) NVIDIA:     access.
[    22.966] (EE) NVIDIA(0): Failed to allocate NVIDIA Error Handler
[    22.967] (II) NVIDIA(0): ACPI: failed to connect to the ACPI event daemon; the daemon
[    22.967] (II) NVIDIA(0):     may not be running or the "AcpidSocketPath" X
[    22.967] (II) NVIDIA(0):     configuration option may not be set correctly.  When the
[    22.967] (II) NVIDIA(0):     ACPI event daemon is available, the NVIDIA X driver will
[    22.967] (II) NVIDIA(0):     try to use it to receive ACPI event notifications.  For
[    22.967] (II) NVIDIA(0):     details, please see the "ConnectToAcpid" and
[    22.967] (II) NVIDIA(0):     "AcpidSocketPath" X configuration options in Appendix B: X
[    22.967] (II) NVIDIA(0):     Config Options in the README.

systemctl status gdm3.service:

Feb 08 14:43:38 00-Raiser systemd[1]: Starting GNOME Display Manager...
Feb 08 14:43:38 00-Raiser systemd[1]: Started GNOME Display Manager.
Feb 08 14:43:39 00-Raiser gdm-autologin][7401]: AccountsService: ActUserManager: user (null) has no username (object path: /org/freedesktop/Accounts/User1000, uid:
Feb 08 14:43:39 00-Raiser gdm-autologin][7401]: gkr-pam: no password is available for user
Feb 08 14:43:39 00-Raiser gdm-autologin][7401]: pam_unix(gdm-autologin:session): session opened for user archer by (uid=0)
Feb 08 14:43:44 00-Raiser gdm-launch-environment][8154]: pam_unix(gdm-launch-environment:session): session opened for user gdm by (uid=0)
Feb 08 14:43:48 00-Raiser gdm3[7184]: Child process -8430 was already dead.
Feb 08 14:43:48 00-Raiser gdm3[7184]: Child process 8154 was already dead.
Feb 08 14:43:48 00-Raiser gdm3[7184]: Unable to kill session worker process
Feb 08 14:43:48 00-Raiser gdm-launch-environment][8489]: pam_unix(gdm-launch-environment:session): session opened for user gdm by (uid=0)

The Xorg log seems to be cut short: it ends at “[ 22.967] (II) NVIDIA(0): Config Options in the README.”
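For anyone scanning a long Xorg log like the one above, it can help to pull out only the warning and error entries. This is a minimal sketch, not a full diagnostic: `xorg_problems` is a made-up helper name, and the default path is simply the one reported in the log above.

```shell
#!/bin/sh
# Print only timestamped (EE) error and (WW) warning lines from an Xorg log.
# xorg_problems is a hypothetical helper name used for illustration.
# The timestamp anchor skips the "Markers:" legend near the top of the log.
xorg_problems() {
    grep -E '^\[ *[0-9]+\.[0-9]+\] \((EE|WW)\)' "${1:-/var/log/Xorg.0.log}"
}

# Example: xorg_problems /var/log/Xorg.0.log
```

On the log posted here, this would surface the font path warnings and the single (EE) line about the NVIDIA Error Handler.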

I am a little lost from here.

Any help and comments would be appreciated!

Archer

Hi @archer.tavish.c ,

Could you describe your environment?

  1. Is it a Jetson devkit or a custom board?

  2. What was the release version before running apt-get upgrade?

  3. If you reboot the device, does the Xorg log get refreshed?

Hi, WayneWWW,

Thank you for your response.

  1. My board is a Xavier AGX devkit.

  2. The release before the upgrade was # R32 (release), REVISION: 4.3, GCID: 21589087, BOARD: t186ref, EABI: aarch64

  3. The xorg log is refreshed every time but ends at the same line.
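The release string in item 2 has the format of the first line of /etc/nv_tegra_release on L4T. As a rough sketch, it can be turned into a dotted version for comparison with package versions; `l4t_version` is a hypothetical helper name, and the pattern assumes the line looks like the one quoted above.

```shell
#!/bin/sh
# Turn "# R32 (release), REVISION: 4.3, ..." into "32.4.3".
# l4t_version is a hypothetical helper; the default path is the usual
# location of the release file on L4T.
l4t_version() {
    head -n 1 "${1:-/etc/nv_tegra_release}" |
        sed -n 's/^# R\([0-9][0-9]*\) (release), REVISION: \([0-9.][0-9.]*\),.*/\1.\2/p'
}

# Example: l4t_version
# For comparison, the installed core package can be listed with:
#   dpkg-query --show nvidia-l4t-core
```

If the line does not match the assumed format, the function prints nothing rather than guessing.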

I looked at the kernel log. Could nvgpu be the cause?

Feb  9 16:54:50 00-Raiser kernel: [   20.671938] nvgpu: 17000000.gv11b        gk20a_gr_handle_fecs_error:5281 [ERR]  fecs watchdog triggered for channel 511, cannot ctxsw anymore !!
Feb  9 16:54:50 00-Raiser kernel: [   20.672199] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:129  [ERR]  gr_fecs_os_r : 0
Feb  9 16:54:50 00-Raiser kernel: [   20.672366] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:131  [ERR]  gr_fecs_cpuctl_r : 0x40
Feb  9 16:54:50 00-Raiser kernel: [   20.672525] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:133  [ERR]  gr_fecs_idlestate_r : 0x1
Feb  9 16:54:50 00-Raiser kernel: [   20.672687] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:135  [ERR]  gr_fecs_mailbox0_r : 0x0
Feb  9 16:54:50 00-Raiser kernel: [   20.672847] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:137  [ERR]  gr_fecs_mailbox1_r : 0x0
Feb  9 16:54:50 00-Raiser kernel: [   20.673008] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:139  [ERR]  gr_fecs_irqstat_r : 0x0
Feb  9 16:54:50 00-Raiser kernel: [   20.673236] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:141  [ERR]  gr_fecs_irqmode_r : 0x4
Feb  9 16:54:50 00-Raiser kernel: [   20.673974] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:143  [ERR]  gr_fecs_irqmask_r : 0x8705
Feb  9 16:54:50 00-Raiser kernel: [   20.674698] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:145  [ERR]  gr_fecs_irqdest_r : 0x0
Feb  9 16:54:50 00-Raiser kernel: [   20.678544] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:147  [ERR]  gr_fecs_debug1_r : 0x40
Feb  9 16:54:50 00-Raiser kernel: [   20.687704] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:149  [ERR]  gr_fecs_debuginfo_r : 0x0
Feb  9 16:54:50 00-Raiser kernel: [   20.697353] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:151  [ERR]  gr_fecs_ctxsw_status_1_r : 0x980
Feb  9 16:54:50 00-Raiser kernel: [   20.707481] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:155  [ERR]  gr_fecs_ctxsw_mailbox_r(0) : 0x1
Feb  9 16:54:50 00-Raiser kernel: [   20.717709] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:155  [ERR]  gr_fecs_ctxsw_mailbox_r(1) : 0x0
Feb  9 16:54:50 00-Raiser kernel: [   20.727707] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:155  [ERR]  gr_fecs_ctxsw_mailbox_r(2) : 0x90009
Feb  9 16:54:50 00-Raiser kernel: [   20.738497] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:155  [ERR]  gr_fecs_ctxsw_mailbox_r(3) : 0x0
Feb  9 16:54:50 00-Raiser kernel: [   20.748968] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:155  [ERR]  gr_fecs_ctxsw_mailbox_r(4) : 0x1ffda0
Feb  9 16:54:50 00-Raiser kernel: [   20.759710] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:155  [ERR]  gr_fecs_ctxsw_mailbox_r(5) : 0x0
Feb  9 16:54:50 00-Raiser kernel: [   20.769576] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:155  [ERR]  gr_fecs_ctxsw_mailbox_r(6) : 0x15
Feb  9 16:54:50 00-Raiser kernel: [   20.780108] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:155  [ERR]  gr_fecs_ctxsw_mailbox_r(7) : 0x0
Feb  9 16:54:50 00-Raiser kernel: [   20.790598] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:155  [ERR]  gr_fecs_ctxsw_mailbox_r(8) : 0x0
Feb  9 16:54:50 00-Raiser kernel: [   20.800611] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:155  [ERR]  gr_fecs_ctxsw_mailbox_r(9) : 0x0
Feb  9 16:54:50 00-Raiser kernel: [   20.810898] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:155  [ERR]  gr_fecs_ctxsw_mailbox_r(10) : 0x0
Feb  9 16:54:50 00-Raiser kernel: [   20.821401] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:155  [ERR]  gr_fecs_ctxsw_mailbox_r(11) : 0x0
Feb  9 16:54:50 00-Raiser kernel: [   20.831543] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:155  [ERR]  gr_fecs_ctxsw_mailbox_r(12) : 0x0
Feb  9 16:54:50 00-Raiser kernel: [   20.842356] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:155  [ERR]  gr_fecs_ctxsw_mailbox_r(13) : 0x3fffffff
Feb  9 16:54:50 00-Raiser kernel: [   20.852928] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:155  [ERR]  gr_fecs_ctxsw_mailbox_r(14) : 0x0
Feb  9 16:54:50 00-Raiser kernel: [   20.863699] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:155  [ERR]  gr_fecs_ctxsw_mailbox_r(15) : 0x0
Feb  9 16:54:50 00-Raiser kernel: [   20.873879] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:159  [ERR]  gr_fecs_engctl_r : 0x0
Feb  9 16:54:50 00-Raiser kernel: [   20.883593] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:161  [ERR]  gr_fecs_curctx_r : 0x0
Feb  9 16:54:50 00-Raiser kernel: [   20.892916] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:163  [ERR]  gr_fecs_nxtctx_r : 0x0
Feb  9 16:54:50 00-Raiser kernel: [   20.902212] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:169  [ERR]  FECS_FALCON_REG_IMB : 0x0
Feb  9 16:54:50 00-Raiser kernel: [   20.911757] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:175  [ERR]  FECS_FALCON_REG_DMB : 0x0
Feb  9 16:54:50 00-Raiser kernel: [   20.921056] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:181  [ERR]  FECS_FALCON_REG_CSW : 0x110800
Feb  9 16:54:50 00-Raiser kernel: [   20.931405] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:187  [ERR]  FECS_FALCON_REG_CTX : 0x0
Feb  9 16:54:50 00-Raiser kernel: [   20.940955] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:193  [ERR]  FECS_FALCON_REG_EXCI : 0x0
Feb  9 16:54:50 00-Raiser kernel: [   20.950299] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:200  [ERR]  FECS_FALCON_REG_PC : 0x62
Feb  9 16:54:50 00-Raiser kernel: [   20.960168] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:206  [ERR]  FECS_FALCON_REG_SP : 0x1f48
Feb  9 16:54:50 00-Raiser kernel: [   20.969781] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:200  [ERR]  FECS_FALCON_REG_PC : 0x51c0
Feb  9 16:54:50 00-Raiser kernel: [   20.979944] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:206  [ERR]  FECS_FALCON_REG_SP : 0x1f44
Feb  9 16:54:50 00-Raiser kernel: [   20.989572] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:200  [ERR]  FECS_FALCON_REG_PC : 0x5f
Feb  9 16:54:50 00-Raiser kernel: [   20.999359] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:206  [ERR]  FECS_FALCON_REG_SP : 0x1f44
Feb  9 16:54:50 00-Raiser kernel: [   21.008994] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:200  [ERR]  FECS_FALCON_REG_PC : 0x62
Feb  9 16:54:50 00-Raiser kernel: [   21.018731] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:206  [ERR]  FECS_FALCON_REG_SP : 0x1f44
Feb  9 16:54:52 00-Raiser kernel: [   23.045235] nvgpu: 17000000.gv11b   nvgpu_set_error_notifier_locked:137  [ERR]  error notifier set to 8 for ch 511
Feb  9 16:54:52 00-Raiser kernel: [   23.045482] nvgpu: 17000000.gv11b   gv11b_fifo_handle_ctxsw_timeout:1611 [ERR]  ctxsw timeout error: active engine id =0, tsg=0, info: awaiting ack ms=3100

Edited:
I read your responses in other posts saying this is a known problem that shouldn’t cause any GPU issues, so I suppose it is not the cause here.
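Most of the kernel log above is a register dump. A small sketch for reducing it to the headline nvgpu errors, assuming the log lines look like the ones posted; `nvgpu_headlines` is a made-up name and it reads from standard input.

```shell
#!/bin/sh
# Keep nvgpu [ERR] lines but drop the FECS falcon register dump, leaving
# the lines that describe what actually went wrong.
# nvgpu_headlines is a hypothetical helper name used for illustration.
nvgpu_headlines() {
    grep -E 'nvgpu: .*\[ERR\]' | grep -v 'gk20a_fecs_dump_falcon_stats'
}

# Example (dmesg may need sudo): dmesg | nvgpu_headlines
# Example: nvgpu_headlines < /var/log/kern.log
```

Applied to the excerpt above, this would leave the FECS watchdog line, the error notifier line, and the ctxsw timeout line.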

Hi,

No, this is not a known issue. What you saw is probably for jetson nano.

Please understand that if the SoC is different, then the issue is different. Your nvgpu error is not a known issue.

Is reflashing the board an option for you?

Also, which BSP release were you targeting with apt-get upgrade: rel-32.5 or rel-32.6? Or did you not know that the BSP would be upgraded?
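One way to see which release apt will pull from is to read the release tag in the NVIDIA apt source entries. This is only a sketch: the file path below is an assumption about a stock L4T image, `l4t_apt_releases` is a made-up name, and entries with bracketed options after `deb` are not handled.

```shell
#!/bin/sh
# Print the distinct release tags (third field) of plain
# "deb <url> <release> <component>" lines in an apt source file.
# l4t_apt_releases is a hypothetical helper; the default path is an
# assumption about where a stock L4T image keeps its NVIDIA apt source.
l4t_apt_releases() {
    awk '$1 == "deb" { print $3 }' \
        "${1:-/etc/apt/sources.list.d/nvidia-l4t-apt-source.list}" | sort -u
}

# Example: l4t_apt_releases
```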

Hi WayneWWW

I have some good updates. First, to answer your question:

After the update, nvidia-l4t-core shows version 32.4.4-20201016123640.

I have successfully brought back the GUI and can see the login screen now. I reinstalled most of the UI-related packages and restarted gdm3. However, I am now stuck in a login loop and still can’t see the desktop. Working on it right now.

Archer

I am actually able to log in, but only with the Unity or GNOME Classic option. The login loop persists when I pick either of the other two options (GNOME on Xorg and Ubuntu) (Picture attached below). I don’t think I need to re-flash since the Unity session works as before, but I have a few follow-up questions:

  1. Is it possible to bypass this screen? My remote access can only establish a connection when I log in with the Unity option.

  2. I found some posts saying GNOME is tied to the Ubuntu desktop. Is it safe to remove GNOME (i.e. remove the other 3 login options) if I just want to use the old Unity desktop?

To anyone interested in this problem:

I logged into the Unity desktop, disabled automatic login, and then re-enabled it. I can now skip the login screen. I may leave the GNOME components in place for now to avoid causing additional issues.
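With gdm3 on Ubuntu, automatic login is normally recorded in /etc/gdm3/custom.conf under the AutomaticLoginEnable and AutomaticLogin keys. Whether the settings toggle described above writes exactly these keys is an assumption; the sketch below only shows how to check what is currently active, and `gdm_autologin_settings` is a made-up helper name.

```shell
#!/bin/sh
# Show the active (uncommented) automatic login lines in a GDM config file.
# gdm_autologin_settings is a hypothetical helper used for illustration.
gdm_autologin_settings() {
    grep -E '^[[:space:]]*AutomaticLogin(Enable)?[[:space:]]*=' \
        "${1:-/etc/gdm3/custom.conf}"
}

# Example: gdm_autologin_settings
```

Lines that start with `#` are ignored, so only settings that GDM would actually read are printed.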
