TX2 boot is stuck

Hello,
We recently flashed 50 TX2 modules.
One of them has a problem.

It works fine right after a re-flash, but the same problem soon comes back.
I tried several times and the result was the same every time.

All messages before the ones below are the same as on a normal module.
So I think these messages point to the issue.

<hit enter to activate fiq debugger>
[    1.340749] tegra-asoc: sound: Can't retrieve clk ahub
[    3.612615] cgroup: cgroup2: unknown option "nsdelegate"
[    4.701386] using random self ethernet address
[    4.734333] using random host ethernet address
[    5.309927] using random self ethernet address
[    5.318221] using random host ethernet address
[    5.668009] tegra-i2c 3180000.i2c: no acknowledge from address 0x50
[    5.674443] ov5693 2-0036: Error -121 reading eeprom
[    5.683856] ov5693 2-0036: board setup failed
[    5.690651] ov5693: probe of 2-0036 failed with error -121
[    7.733271] vdd-1v8: voltage operation not allowed
[    7.733290] sdhci-tegra 3440000.sdhci: could not set regulator OCR (-1)
[    7.753138] vdd-1v8: voltage operation not allowed
[    7.753162] sdhci-tegra 3440000.sdhci: could not set regulator OCR (-1)
[    7.774016] vdd-1v8: voltage operation not allowed
[    7.774037] sdhci-tegra 3440000.sdhci: could not set regulator OCR (-1)
[    7.776556] vdd-1v8: voltage operation not allowed
[    7.776571] sdhci-tegra 3440000.sdhci: could not set regulator OCR (-1)
[    7.888667] vdd-1v8: voltage operation not allowed
[    7.888688] sdhci-tegra 3440000.sdhci: could not set regulator OCR (-1)
[    7.889710] vdd-1v8: voltage operation not allowed
[    7.889729] sdhci-tegra 3440000.sdhci: could not set regulator OCR (-1)
[    7.891381] vdd-1v8: voltage operation not allowed
[    7.891398] sdhci-tegra 3440000.sdhci: could not set regulator OCR (-1)

I also tried some i2c commands.
These commands work fine:

i2cdetect -l
i2cdump -f -y 7 0x50

But these commands get ‘no acknowledge’ errors, as follows:

i2cdetect -r -y  1
=> [ 2594.430618] tegra-i2c c240000.i2c: no acknowledge from address 0x5
i2cdetect -r -y  7     
=> [  997.856591] tegra-i2c c250000.i2c: no acknowledge from address 0x4

Please take a look.
boot log => error_module.txt (23.7 KB)

Is “tegra-i2c 3180000.i2c: no acknowledge from address 0x50” the only error you want to ask about here?

I mean, do you have any other functional problems?

Is it an NV developer kit?

Added question to the list: Have you tried that module on a different carrier board, and have you tried flashing a module which you know works on the carrier board of the one currently failing? I’m wondering if the error tracks the module versus tracking the carrier board.

Dear @linuxdev
It has the same problem on both the custom board and the dev kit board.
I’ve only checked the boot log on the dev kit board.
So I’m not sure whether the boot logs are the same on the custom board and the dev kit board.
But the symptoms are the same.

Anyway, we have lots of custom boards and modules.
We cross-checked them, and only this module has the problem.

Dear @WayneWWW

I can log in over the debug UART, but the monitor is stuck at the boot step.
The display blinks while these lines are printed:

vdd-1v8: voltage operation not allowed
sdhci-tegra 3440000.sdhci: could not set regulator OCR (-1)
vdd-1v8: voltage operation not allowed
sdhci-tegra 3440000.sdhci: could not set regulator OCR (-1)
vdd-1v8: voltage operation not allowed
sdhci-tegra 3440000.sdhci: could not set regulator OCR (-1)
vdd-1v8: voltage operation not allowed
sdhci-tegra 3440000.sdhci: could not set regulator OCR (-1)
vdd-1v8: voltage operation not allowed
sdhci-tegra 3440000.sdhci: could not set regulator OCR (-1)
vdd-1v8: voltage operation not allowed
sdhci-tegra 3440000.sdhci: could not set regulator OCR (-1)
vdd-1v8: voltage operation not allowed
sdhci-tegra 3440000.sdhci: could not set regulator OCR (-1)

Can you share the full log you see on the uart console?

Dear @WayneWWW

You can download it at the end of the first post.

Please remove “quiet” from your /boot/extlinux/extlinux.conf and dump the uart log again.
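
A minimal sketch of that edit, assuming the kernel arguments sit on `APPEND` lines in extlinux.conf and that `quiet` appears there as a separate word. The function name `strip_quiet` and the sample text are illustrative only, not taken from the actual file on the module. Runs of spaces inside an `APPEND` line are collapsed to single spaces.

```python
def strip_quiet(conf_text: str) -> str:
    """Return conf_text with the 'quiet' token removed from every APPEND line."""
    out = []
    for line in conf_text.splitlines():
        stripped = line.lstrip()
        parts = stripped.split()
        # Only touch lines whose first word is APPEND (case-insensitive).
        if parts and parts[0].upper() == "APPEND":
            indent = line[: len(line) - len(stripped)]
            line = indent + " ".join(p for p in parts if p != "quiet")
        out.append(line)
    # Preserve the trailing newline if the input had one.
    return "\n".join(out) + ("\n" if conf_text.endswith("\n") else "")


# Hypothetical sample; the real file contents may differ.
sample = "LABEL primary\n      APPEND ${cbootargs} quiet root=/dev/mmcblk0p1\n"
print(strip_quiet(sample))
```

To apply it for real, read /boot/extlinux/extlinux.conf, keep a backup copy, and write the returned text back with root privileges.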

Dear @WayneWWW
This is the full log without ‘quiet’:
error_module_without_quite.txt (81.1 KB)

Do you keep seeing this log line pop up while the display blinks?

sdhci-tegra 3440000.sdhci: could not set regulator OCR (-1)

This log only appears at timestamp 12. Does it keep printing after that timestamp?

Also, could you hotplug your HDMI cable and share the dmesg with me again?
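
One way to answer the timestamp question from a saved log is to list every timestamp at which the message occurs. This is a rough sketch, assuming the usual `[ seconds.micros] text` kernel log prefix. The helper name `message_times` is made up for illustration, and the sample lines are copied from the first post.

```python
import re

# Matches the kernel log prefix, e.g. "[    7.733290] text".
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")


def message_times(log_text, needle):
    """Return the timestamps (in seconds) of every log line containing needle."""
    times = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if m and needle in m.group(2):
            times.append(float(m.group(1)))
    return times


sample = (
    "[    7.733271] vdd-1v8: voltage operation not allowed\n"
    "[    7.733290] sdhci-tegra 3440000.sdhci: could not set regulator OCR (-1)\n"
    "[    7.753138] vdd-1v8: voltage operation not allowed\n"
    "[    7.753162] sdhci-tegra 3440000.sdhci: could not set regulator OCR (-1)\n"
)
times = message_times(sample, "could not set regulator OCR")
print(len(times), "occurrences, last at", max(times))
```

If the last timestamp is far below the uptime at which the log was captured, the message has stopped printing.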

Dear @WayneWWW

This line is printed only 7~8 times.
After that it is not printed any more; the same image just keeps blinking.

sdhci-tegra 3440000.sdhci: could not set regulator OCR (-1)

Without ‘quiet’, this image now blinks continuously.
With ‘quiet’, the red box area blinks.

I also unplugged and re-plugged the HDMI cable twice after boot-up.
This is the dmesg => dmesg.txt (122.0 KB)

Could you share /var/log/Xorg.0.log here?

Dear @WayneWWW

Here it is.

[    78.764] (--) Log file renamed from "/var/log/Xorg.pid-6934.log" to "/var/log/Xorg.0.log"
[    78.764]
X.Org X Server 1.19.6
Release Date: 2017-12-20
[    78.764] X Protocol Version 11, Revision 0
[    78.764] Build Operating System: Linux 4.4.0-148-generic aarch64 Ubuntu
[    78.764] Current Operating System: Linux superbin-desktop 4.9.140-tegra #1 SMP PREEMPT Mon Dec 9 22:52:02 PST 2019 aarch64
[    78.764] Kernel command line: console=ttyS0,115200 androidboot.presilicon=true firmware_class.path=/etc/firmware root=/dev/mmcblk0p1 rw rootwait rootfstype=ext4 console=ttyS0,115200n8 console=tty0 fbcon=map:0 net.ifnames=0 isolcpus=1-2  video=tegrafb no_console_suspend=1 earlycon=uart8250,mmio32,0x3100000 nvdumper_reserved=0x2772e0000 gpt rootfs.slot_suffix= tegra_fbmem2=0x140000@0x96081000 lut_mem2=0x2008@0x9607e000 usbcore.old_scheme_first=1 tegraid=18.1.2.0.0 maxcpus=6 boot.slot_suffix= boot.ratchetvalues=0.2031647.1 vpr_resize bl_prof_dataptr=0x10000@0x275840000 sdhci_tegra.en_boot_part_access=1
[    78.765] Build Date: 03 June 2019  08:11:53AM
[    78.765] xorg-server 2:1.19.6-1ubuntu4.3 (For technical support please see http://www.ubuntu.com/support)
[    78.765] Current version of pixman: 0.34.0
[    78.765]    Before reporting problems, check http://wiki.x.org
        to make sure that you have the latest version.

Are you sure you attached the full log?

FYI, that log is so short the X server never even tried to load any of the GUI drivers. The server did not fail so much as it stopped before it even tried to do anything interesting. That would be very surprising unless the content has been vastly customized.

Dear @WayneWWW , @linuxdev

Sorry, that was my mistake.
This should be the full log.


superbin@superbin-desktop:~$ sudo cat /var/log/Xorg.0.log
[sudo] password for superbin:
Sorry, try again.
[sudo] password for superbin:
[   119.843] (--) Log file renamed from "/var/log/Xorg.pid-7609.log" to "/var/log/Xorg.0.log"
[   119.844]
X.Org X Server 1.19.6
Release Date: 2017-12-20
[   119.844] X Protocol Version 11, Revision 0
[   119.844] Build Operating System: Linux 4.4.0-148-generic aarch64 Ubuntu
[   119.844] Current Operating System: Linux superbin-desktop 4.9.140-tegra #1 SMP PREEMPT Mon Dec 9 22:52:02 PST 2019 aarch64
[   119.844] Kernel command line: console=ttyS0,115200 androidboot.presilicon=true firmware_class.path=/etc/firmware root=/dev/mmcblk0p1 rw rootwait rootfstype=ext4 console=ttyS0,115200n8 console=tty0 fbcon=map:0 net.ifnames=0 isolcpus=1-2  video=tegrafb no_console_suspend=1 earlycon=uart8250,mmio32,0x3100000 nvdumper_reserved=0x2772e0000 gpt rootfs.slot_suffix= tegra_fbmem2=0x140000@0x96081000 lut_mem2=0x2008@0x9607e000 usbcore.old_scheme_first=1 tegraid=18.1.2.0.0 maxcpus=6 boot.slot_suffix= boot.ratchetvalues=0.2031647.1 vpr_resize bl_prof_dataptr=0x10000@0x275840000 sdhci_tegra.en_boot_part_access=1
[   119.844] Build Date: 03 June 2019  08:11:53AM
[   119.844] xorg-server 2:1.19.6-1ubuntu4.3 (For technical support please see http://www.ubuntu.com/support)
[   119.844] Current version of pixman: 0.34.0
[   119.844]    Before reporting problems, check http://wiki.x.org
        to make sure that you have the latest version.
[   119.844] Markers: (--) probed, (**) from config file, (==) default setting,
        (++) from command line, (!!) notice, (II) informational,
        (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
[   119.845] (==) Log file: "/var/log/Xorg.0.log", Time: Mon Oct 18 13:07:33 2021
[   119.845] (==) Using config file: "/etc/X11/xorg.conf"
[   119.845] (==) Using system config directory "/usr/share/X11/xorg.conf.d"
[   119.845] (==) No Layout section.  Using the first Screen section.
[   119.845] (==) No screen section available. Using defaults.
[   119.845] (**) |-->Screen "Default Screen Section" (0)
[   119.845] (**) |   |-->Monitor "<default monitor>"
[   119.846] (==) No device specified for screen "Default Screen Section".
        Using the first device section listed.
[   119.846] (**) |   |-->Device "Tegra0"
[   119.846] (==) No monitor specified for screen "Default Screen Section".
        Using a default monitor configuration.
[   119.846] (==) Automatically adding devices
[   119.846] (==) Automatically enabling devices
[   119.846] (==) Automatically adding GPU devices
[   119.846] (==) Automatically binding GPU devices
[   119.846] (==) Max clients allowed: 256, resource mask: 0x1fffff
[   119.846] (WW) The directory "/usr/share/fonts/X11/cyrillic" does not exist.
[   119.846]    Entry deleted from font path.
[   119.846] (WW) The directory "/usr/share/fonts/X11/100dpi/" does not exist.
[   119.846]    Entry deleted from font path.
[   119.846] (WW) The directory "/usr/share/fonts/X11/75dpi/" does not exist.
[   119.846]    Entry deleted from font path.
[   119.846] (WW) The directory "/usr/share/fonts/X11/100dpi" does not exist.
[   119.846]    Entry deleted from font path.
[   119.846] (WW) The directory "/usr/share/fonts/X11/75dpi" does not exist.
[   119.846]    Entry deleted from font path.
[   119.846] (==) FontPath set to:
        /usr/share/fonts/X11/misc,
        /usr/share/fonts/X11/Type1,
        built-ins
[   119.846] (==) ModulePath set to "/usr/lib/xorg/modules"
[   119.846] (II) The server relies on udev to provide the list of input devices.
        If no devices become available, reconfigure udev or disable AutoAddDevices.
[   119.846] (II) Loader magic: 0x557dcb2010
[   119.846] (II) Module ABI versions:
[   119.846]    X.Org ANSI C Emulation: 0.4
[   119.846]    X.Org Video Driver: 23.0
[   119.846]    X.Org XInput driver : 24.1
[   119.846]    X.Org Server Extension : 10.0
[   119.847] (++) using VT number 1

[   119.852] (II) systemd-logind: took control of session /org/freedesktop/login1/session/c29
[   119.853] (II) no primary bus or device found
[   119.853] (WW) "dri" will not be loaded unless you've specified it to be loaded elsewhere.
[   119.853] (II) "glx" will be loaded by default.
[   119.853] (II) LoadModule: "extmod"
[   119.853] (II) Module "extmod" already built-in
[   119.853] (II) LoadModule: "glx"
[   119.853] (II) Loading /usr/lib/xorg/modules/extensions/libglx.so
[   119.856] (II) Module glx: vendor="X.Org Foundation"
[   119.856]    compiled for 1.19.6, module version = 1.0.0
[   119.856]    ABI class: X.Org Server Extension, version 10.0
[   119.856] (II) LoadModule: "nvidia"
[   119.856] (II) Loading /usr/lib/xorg/modules/drivers/nvidia_drv.so
[   119.857] (II) Module nvidia: vendor="NVIDIA Corporation"
[   119.857]    compiled for 4.0.2, module version = 1.0.0
[   119.857]    Module class: X.Org Video Driver
[   119.857] (II) NVIDIA dlloader X Driver  32.3.1  Release Build  (integ_stage_rel)  (buildbrain@mobile-u64-1935)  Mon Dec  9 22:52:33 PST 2019
[   119.857] (II) NVIDIA Unified Driver for all Supported NVIDIA GPUs
[   119.857] (WW) Falling back to old probe method for NVIDIA
[   119.857] (II) Loading sub module "fb"
[   119.857] (II) LoadModule: "fb"
[   119.858] (II) Loading /usr/lib/xorg/modules/libfb.so
[   119.858] (II) Module fb: vendor="X.Org Foundation"
[   119.858]    compiled for 1.19.6, module version = 1.0.0
[   119.858]    ABI class: X.Org ANSI C Emulation, version 0.4
[   119.858] (II) Loading sub module "wfb"
[   119.858] (II) LoadModule: "wfb"
[   119.858] (II) Loading /usr/lib/xorg/modules/libwfb.so
[   119.858] (II) Module wfb: vendor="X.Org Foundation"
[   119.858]    compiled for 1.19.6, module version = 1.0.0
[   119.858]    ABI class: X.Org ANSI C Emulation, version 0.4
[   119.858] (II) Loading sub module "ramdac"
[   119.859] (II) LoadModule: "ramdac"
[   119.859] (II) Module "ramdac" already built-in
[   119.859] (WW) VGA arbiter: cannot open kernel arbiter, no multi-card support
[   119.859] (II) NVIDIA(0): Creating default Display subsection in Screen section
        "Default Screen Section" for depth/fbbpp 24/32
[   119.859] (==) NVIDIA(0): Depth 24, (==) framebuffer bpp 32
[   119.859] (==) NVIDIA(0): RGB weight 888
[   119.859] (==) NVIDIA(0): Default visual is TrueColor
[   119.860] (==) NVIDIA(0): Using gamma correction (1.0, 1.0, 1.0)
[   119.860] (DB) xf86MergeOutputClassOptions unsupported bus type 0
[   119.860] (**) NVIDIA(0): Option "AllowEmptyInitialConfiguration" "true"
[   119.860] (**) NVIDIA(0): Enabling 2D acceleration
[   119.860] (II) Loading sub module "glxserver_nvidia"
[   119.860] (II) LoadModule: "glxserver_nvidia"
[   119.860] (II) Loading /usr/lib/xorg/modules/extensions/libglxserver_nvidia.so
[   119.869] (II) Module glxserver_nvidia: vendor="NVIDIA Corporation"
[   119.869]    compiled for 4.0.2, module version = 1.0.0
[   119.869]    Module class: X.Org Server Extension
[   119.869] (II) NVIDIA GLX Module  32.3.1  Release Build  (integ_stage_rel)  (buildbrain@mobile-u64-1935)  Mon Dec  9 22:49:43 PST 2019
[   119.872] (--) NVIDIA(0): Valid display device(s) on GPU-0 at SoC
[   119.872] (--) NVIDIA(0):     DFP-0
[   119.872] (II) NVIDIA(0): NVIDIA GPU NVIDIA Tegra X2 (nvgpu) (GP10B) at SoC (GPU-0)
[   119.872] (--) NVIDIA(0): Memory: 8045912 kBytes
[   119.873] (--) NVIDIA(0): VideoBIOS:
[   119.873] (--) NVIDIA(GPU-0): Samsung S24C230 (DFP-0): connected
[   119.873] (--) NVIDIA(GPU-0): Samsung S24C230 (DFP-0): External TMDS
[   119.873] (==) NVIDIA(0):
[   119.873] (==) NVIDIA(0): No modes were requested; the default mode "nvidia-auto-select"
[   119.873] (==) NVIDIA(0):     will be used as the requested mode.
[   119.873] (==) NVIDIA(0):
[   119.873] (II) NVIDIA(0): Validated MetaModes:
[   119.873] (II) NVIDIA(0):     "DFP-0:nvidia-auto-select"
[   119.873] (II) NVIDIA(0): Virtual screen size determined to be 1920 x 1080
[   119.875] (--) NVIDIA(0): DPI set to (93, 94); computed from "UseEdidDpi" X config
[   119.875] (--) NVIDIA(0):     option
[   119.875] (--) Depth 24 pixmap format is 32 bpp
[   119.875] (II) NVIDIA: Reserving 24576.00 MB of virtual memory for indirect memory
[   119.875] (II) NVIDIA:     access.
[   119.878] (EE) NVIDIA(0): Failed to allocate NVIDIA Error Handler
[   119.878] (II) NVIDIA(0): ACPI: failed to connect to the ACPI event daemon; the daemon
[   119.878] (II) NVIDIA(0):     may not be running or the "AcpidSocketPath" X
[   119.878] (II) NVIDIA(0):     configuration option may not be set correctly.  When the
[   119.878] (II) NVIDIA(0):     ACPI event daemon is available, the NVIDIA X driver will
[   119.878] (II) NVIDIA(0):     try to use it to receive ACPI event notifications.  For
[   119.878] (II) NVIDIA(0):     details, please see the "ConnectToAcpid" and
[   119.878] (II) NVIDIA(0):     "AcpidSocketPath" X configuration options in Appendix B: X
[   119.878] (II) NVIDIA(0):     Config Options in the README.
[   119.912] (II) NVIDIA(0): Setting mode "DFP-0:nvidia-auto-select"
[   120.050] (==) NVIDIA(0): Disabling shared memory pixmaps
[   120.050] (==) NVIDIA(0): Backing store enabled
[   120.050] (==) NVIDIA(0): Silken mouse enabled
[   120.050] (==) NVIDIA(0): DPMS enabled
[   120.051] (II) Loading sub module "dri2"
[   120.051] (II) LoadModule: "dri2"
[   120.051] (II) Module "dri2" already built-in
[   120.051] (II) NVIDIA(0): [DRI2] Setup complete
[   120.051] (II) NVIDIA(0): [DRI2]   VDPAU driver: nvidia
[   120.054] (--) RandR disabled
[   120.059] (II) SELinux: Disabled on system
[   120.060] (II) Initializing extension GLX
[   120.060] (II) Indirect GLX disabled.
[   120.104] (EE) Error compiling keymap (server-0) executing '"/usr/bin/xkbcomp" -w 1 "-R/usr/share/X11/xkb" -xkm "-" -em1 "The XKEYBOARD keymap compiler (xkbcomp) reports:" -emp "> " -eml "Errors from xkbcomp are not fatal to the X server" "/tmp/server-0.xkm"'
[   120.104] (EE) XKB: Couldn't compile keymap
[   120.104] (EE) XKB: Failed to load keymap. Loading default keymap instead.
[   120.134] (EE) Error compiling keymap (server-0) executing '"/usr/bin/xkbcomp" -w 1 "-R/usr/share/X11/xkb" -xkm "-" -em1 "The XKEYBOARD keymap compiler (xkbcomp) reports:" -emp "> " -eml "Errors from xkbcomp are not fatal to the X server" "/tmp/server-0.xkm"'
[   120.134] (EE) XKB: Couldn't compile keymap
[   120.134] XKB: Failed to compile keymap
[   120.134] Keyboard initialization failed. This could be a missing or incorrect setup of xkeyboard-config.
[   120.134] (EE)
Fatal server error:
[   120.134] (EE) Failed to activate virtual core keyboard: 2(EE)
[   120.134] (EE)
Please consult the The X.Org Foundation support
         at http://wiki.x.org
 for help.
[   120.134] (EE) Please also check the log file at "/var/log/Xorg.0.log" for additional information.
[   120.134] (EE)
superbin@superbin-desktop:~$

Do you have other TX2 modules whose Xorg log you can compare against? You said you have 50 TX2 modules on your side, right?

If so, could you check the Xorg log on the normal TX2 modules and see whether they show the error below too?

[   120.104] (EE) Error compiling keymap (server-0) executing '"/usr/bin/xkbcomp" -w 1 "-R/usr/share/X11/xkb" -xkm "-" -em1 "The XKEYBOARD keymap compiler (xkbcomp) reports:" -emp "> " -eml "Errors from xkbcomp are not fatal to the X server" "/tmp/server-0.xkm"'
[   120.104] (EE) XKB: Couldn't compile keymap
[   120.104] (EE) XKB: Failed to load keymap. Loading default keymap instead.
[   120.134] (EE) Error compiling keymap (server-0) executing '"/usr/bin/xkbcomp" -w 1 "-R/usr/share/X11/xkb" -xkm "-" -em1 "The XKEYBOARD keymap compiler (xkbcomp) reports:" -emp "> " -eml "Errors from xkbcomp are not fatal to the X server" "/tmp/server-0.xkm"'
[   120.134] (EE) XKB: Couldn't compile keymap
[   120.134] XKB: Failed to compile keymap
[   120.134] Keyboard initialization failed. This could be a missing or incorrect setup of xkeyboard-config.
[   120.134] (EE)

Or just attach the xorg log from those normal TX2 so that we can do the comparison.
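
For the comparison itself, something like the sketch below could list the (EE) lines that show up only on the failing module. It assumes both logs use the standard `[ time] (EE) ...` prefix. The helper names are illustrative, and the "good" log line in the example is invented, since no log from a normal module has been posted.

```python
import re

# Strips the leading "[   120.104] " timestamp so lines from different boots compare equal.
TS_RE = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def error_lines(xorg_log):
    """Return the set of non-empty (EE) lines, with timestamps stripped."""
    errors = set()
    for line in xorg_log.splitlines():
        body = TS_RE.sub("", line).strip()
        if body.startswith("(EE)") and body != "(EE)":
            errors.add(body)
    return errors


def only_in_failing(failing_log, good_log):
    """Return the (EE) lines present in failing_log but absent from good_log."""
    return sorted(error_lines(failing_log) - error_lines(good_log))


failing = (
    "[   119.878] (EE) NVIDIA(0): Failed to allocate NVIDIA Error Handler\n"
    "[   120.104] (EE) XKB: Couldn't compile keymap\n"
    "[   120.134] (EE)\n"
)
# Hypothetical log from a working module, for illustration only.
good = "[    30.100] (EE) NVIDIA(0): Failed to allocate NVIDIA Error Handler\n"
for line in only_in_failing(failing, good):
    print(line)
```

Errors that appear in both logs are less likely to explain the failure; the ones unique to the failing module are the place to start.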

I am also curious if any of the Xorg server or related drivers might be known to be modified in any way? I ask because there is an ABI which the different modular components must be compiled against before that module can load. For example, typically the video display will continuously loop and fail if the GPU binary driver is not an exact match to the expected ABI of the Xorg module loader. The input module for the keyboard is similar, and if for some reason the wrong binary for the keyboard is used, then the keyboard won’t be able to load.

Dear @WayneWWW , @linuxdev

I’ve found the cause.
The storage was full, as shown below.

Filesystem     1K-blocks     Used Available Use% Mounted on
/dev/mmcblk0p1  28768292 27284456         0 100% /
none             3639120        0   3639120   0% /dev
tmpfs            4022956        4   4022952   1% /dev/shm
tmpfs            4022956    27608   3995348   1% /run
tmpfs               5120        4      5116   1% /run/lock
tmpfs            4022956        0   4022956   0% /sys/fs/cgroup
tmpfs             804588        0    804588   0% /run/user/1000
tmpfs             804588        4    804584   1% /run/user/120
superbin@superbin-desktop:~$

Thanks for all the answers.


That would do it! If you are interested, and have time, you could clone the system, edit the clone, and then flash again using the clone instead of generating a new filesystem. Or, since you seem to still have login ability, just remove some content.
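
To catch a full root filesystem like this one earlier, a small check over saved `df` output might help. This is a sketch, assuming the six-column layout shown in the post above. The function name `full_filesystems` and the 95% threshold are arbitrary choices for illustration.

```python
def full_filesystems(df_output, threshold=95):
    """Return (mount_point, use_percent) for each df row at or above threshold."""
    rows = []
    for line in df_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        # Expect: Filesystem, 1K-blocks, Used, Available, Use%, Mounted on
        if len(fields) < 6 or not fields[4].endswith("%"):
            continue
        use = int(fields[4].rstrip("%"))
        if use >= threshold:
            rows.append((fields[5], use))
    return rows


df_text = (
    "Filesystem     1K-blocks     Used Available Use% Mounted on\n"
    "/dev/mmcblk0p1  28768292 27284456         0 100% /\n"
    "tmpfs            4022956    27608   3995348   1% /run\n"
)
print(full_filesystems(df_text))
```

On the device itself, `shutil.disk_usage("/")` from the Python standard library returns the total, used, and free bytes directly, without parsing `df`.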
