Jetpack 4.6: Xserver does not start

I flashed JetPack 4.6 from the SDK Manager onto my Jetson. After flashing, the device boots and shows the configuration wizard on the screen. After I go through the wizard, the Jetson reboots, but now the X server does not start anymore. I end up with a black screen, which disappears after a while and drops me to the terminal.

The Xorg log file says:

Xorg.0.log (7.3 KB)

I did not do any custom modifications anywhere. Any idea how to fix this?

Are you able to dump the dmesg and lsmod through the console?

Here they are. lsmod shows an empty list!?

dmesg.log (176.0 KB)

lsmod.log (38 Bytes)

What do you see from the command “uname -r”? Do you see files at:
ls /lib/modules/$(uname -r)/kernel/*

The running kernel is 4.9.201-tegra, but there are no corresponding modules installed (/lib/modules/4.9.201-tegra does not exist). The modules of 4.9.253-tegra are installed instead. So it is somehow booting the wrong kernel?
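
A quick way to repeat this check is sketched below. The function name check_kernel_modules and its two optional arguments are made up for this illustration; it only tests whether a module directory exists for a given kernel release.

```shell
#!/bin/bash
# Hypothetical helper: report whether a modules directory exists
# for a kernel release (defaults: running kernel, /lib/modules).
check_kernel_modules() {
    local release="${1:-$(uname -r)}"
    local base="${2:-/lib/modules}"
    if [ -d "$base/$release" ]; then
        echo "ok: $base/$release exists"
    else
        echo "missing: no $base/$release for running kernel $release"
        return 1
    fi
}
```

Calling check_kernel_modules with no arguments on the affected system should report the missing directory for 4.9.201-tegra, which would also explain the empty lsmod list.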

Same here… the exact problem you are having. I flashed from a Linux host, and after I went through the wizard it dropped back to 4.9.201 from JetPack 4.5.1.

I have an NVMe drive in my AGX; it has the kernel and previous modules. I may try to put them in place for now and then try doing an update off of the repo if it starts correctly… I’m a bit befuddled too and curious what’s going on.

Do you have a SSD on your device?

Someone has tried to modify the kernel incorrectly. This is why it fails. Only the kernel exists, but much of the surrounding support is missing. The fact is that since “uname -r” is “4.9.201-tegra”, then this is the kernel running. Someone has attempted to install 4.9.253-tegra, but only the modules were installed…the actual kernel is not running.

In some cases a kernel is picked from the initrd. Every initrd is basically an adapter for a minimal kernel putting together all that is needed to switch to the real filesystem. Each is a case by case custom consideration.

Note that if you use an NVMe, then you must customize since a default install does not know anything about the NVMe. Upgrade with an NVMe is pretty much guaranteed to break the NVMe part of it and revert to the eMMC side unless you’ve manually accounted for this.

The situation for me before flashing jetpack 4.6 is exactly as richg described: I had jetpack 4.5.1 installed and used rootonnvme to use a nvme ssd. I did not revert it before flashing jetpack 4.6 as I assumed that flashing with the sdk manager would just use the eMMC and ignore the nvme ssd. linuxdev seems to confirm this.
Note that I did not do anything manually related to the SSD after flashing jetpack 4.6. I just flashed it and after the reboot at the end of the wizard it boots 4.9.201. I cannot check which kernel is booted while the wizard is up at the moment as I do not have physical access to the machine at the weekend.

and just to make it 100% clear: I did not use OTA update to upgrade to jetpack 4.6, I flashed it from scratch using SDK manager.


Did you at least retrieve your compatible 4.9.201 kernel modules from your SSD and put them on the eMMC under /lib/modules? Then your system will at least boot properly…

sudo mount /dev/nvme0n1p1  /mnt
cp -r /mnt/lib/modules/4.9.201-tegra/ /lib/modules

The error message on boot is related to the prior issue; just clear the /var/crash folder (sudo rm /var/crash/*) and it’s gone…

The kernel size should be 34484240 for 4.9.253… The deb file for the kernel should be reinstallable with

sudo apt install --reinstall nvidia-l4t-kernel

Here is the deb package https://repo.download.nvidia.com/jetson/t194/pool/main/n/nvidia-l4t-kernel/nvidia-l4t-kernel_4.9.253-tegra-32.6.1-20210726122859_arm64.deb

According to the boot flow for the AGX in the current docs… generic-no-api_r2 we are using cboot on the agx… which supposedly uses /boot/extlinux/extlinux.conf to select which kernel to load.

Finally, here is rootOnNVMe: GitHub - jetsonhacks/rootOnNVMe: Switch the rootfs to a NVMe SSD on the Jetson Xavier NX and Jetson AGX Xa. It’s a chroot-based trick… boot starts on the eMMC, then chroots to the NVMe… but it was all undone before the flash, which I did in recovery mode too!!

(Disclaimer: None of the above FIXED the issue… just sharing what should have worked but didn’t… solution in my next post.)

Solved…
CBoot functionality includes a default booting scan sequence. It scans bootable devices in the following order:
1. External SD card
2. External NVMe device
3. USB device
4. Internal eMMC
5. NFS device
CBoot looks for an extlinux.conf configuration file in the following directory on each bootable device (except an NFS device):
/boot/extlinux

I assume this new boot sequence took effect in the JetPack 4.6 SDK when the Xavier was reflashed?? Getting rid of or renaming /boot in the root of the NVMe drive should get you on track again; it will revert to the eMMC, and then you can clone the eMMC to the NVMe…
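
A minimal sketch of the rename, assuming the NVMe partition is already mounted (for example with sudo mount /dev/nvme0n1p1 /mnt). The function name disable_boot_dir and the ".disabled" suffix are arbitrary choices for this illustration, not part of any NVIDIA tool.

```shell
#!/bin/bash
# Hypothetical helper: rename <mountpoint>/boot so the bootloader no
# longer finds <mountpoint>/boot/extlinux/extlinux.conf on that device.
disable_boot_dir() {
    local mnt="$1"
    if [ ! -d "$mnt/boot" ]; then
        echo "no boot directory under $mnt"
        return 0
    fi
    # Do not overwrite a directory left by an earlier rename.
    if [ -e "$mnt/boot.disabled" ]; then
        echo "refusing: $mnt/boot.disabled already exists" >&2
        return 1
    fi
    mv "$mnt/boot" "$mnt/boot.disabled"
    echo "renamed $mnt/boot to $mnt/boot.disabled"
}
```

Run it from a root shell as disable_boot_dir /mnt. Renaming rather than deleting keeps the old kernel and extlinux.conf around in case they are needed later.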

What @richg says is right. In JetPack 4.6, the NVMe comes before the eMMC in the boot order, so you get the empty lsmod case if the kernel versions do not match.

Checking the boot log from uart may clarify that. Or you can just remove the SSD from your device and see if it will boot normally.
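
Short of reading the UART log, a rough check is to see which mounted filesystems carry an extlinux.conf at all. The sketch below assumes the candidate partitions are already mounted; find_extlinux is a made-up name, and it only tests for the file at boot/extlinux/extlinux.conf under each mount point it is given.

```shell
#!/bin/bash
# Hypothetical helper: print each given mount point that contains
# boot/extlinux/extlinux.conf.
find_extlinux() {
    local mnt
    for mnt in "$@"; do
        # ${mnt%/} strips one trailing slash so "/" also works.
        if [ -f "${mnt%/}/boot/extlinux/extlinux.conf" ]; then
            echo "$mnt"
        fi
    done
}
```

For example, find_extlinux / /mnt with the NVMe mounted on /mnt. If both are printed, the NVMe copy is presumably the one picked up first, given the scan order quoted above.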

Renaming the /boot directory on the NVMe indeed worked, and now kernel 4.9.253 is running. Thanks a lot, richg, for figuring it out!
What is your favorite way to run the system from the nvme ssd? Same as before?

It would be nice to keep the emmc setup as a base image for recovery and clone it to the nvme as before. I think it will make more sense to modify the fstab on the nvme to mount itself as / and then keep the emmc for recovery purposes back to a known state. That is what is in my head on this, but I haven’t gotten that far.

(These notes only apply to using the rootOnNVMe method I mentioned above.)

For now, the rootOnNVMe method works so long as there is no boot folder on the NVMe device… rootOnNVMe presumes the boot process begins on the emmc for it to work properly. I’ll make a pull request to the github project.

Long story short: with JetPack 4.5.1, the /boot folder on the NVMe isn’t used, since the boot flow has progressed past any need for it. With JetPack 4.6, a /boot folder on the NVMe breaks the rootOnNVMe method, even if the eMMC is copied to the NVMe device.

Solution: Exclude boot folder from rsync copy of emmc → nvme found in the copy-rootfs-ssd.sh script.

#!/bin/bash
# Mount the SSD as /mnt
sudo mount /dev/nvme0n1p1 /mnt
# Copy over the rootfs from the eMMC to the SSD, leaving out /boot
sudo rsync -axHAWX --numeric-ids --info=progress2 --exclude={"/dev/","/proc/","/sys/","/tmp/","/run/","/mnt/","/media/*","/lost+found","/boot"} / /mnt
# We want to keep the SSD mounted for further operations
# So we do not unmount the SSD

I mounted the nvme ssd now as / and it seems to work fine at first glance. I don’t use the setssdroot.service any longer.

For now I kept the setssdroot.service but dropped the /boot folder from the NVMe as above. At one point my system wasn’t loading firmware (for example, tegra19x_xusb_firmware was not loading), so my attached keyboard and mouse would not work. I knew that leaving boot out of the mix would at least make it work as before. I didn’t attempt flashing the root= in the kernel params under /dev/proc or whatever is necessary to change that. Did you just modify fstab? When I did that it still showed the eMMC mounted as root.

What I did is the following:

  1. I removed all files on the ssd partition
  2. cloned the emmc with the script from rootonnvme
  3. Installed the chroot service
  4. Tested that this system was working properly
  5. I removed the setssdroot service from the ssd
  6. I changed the fstab on the ssd to

/dev/nvme0n1p1 / ext4 defaults,errors=remount-ro,discard 0 1

  7. rebooted
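
To double-check step 6 before rebooting, something like the following could print which device an fstab names for /. This is only a sketch: fstab_root_device is a hypothetical helper, and it reads whichever fstab file it is pointed at (for example /mnt/etc/fstab with the SSD mounted on /mnt).

```shell
#!/bin/bash
# Hypothetical helper: print the device field of the "/" entry in an
# fstab-style file, skipping comment lines. Defaults to /etc/fstab.
fstab_root_device() {
    awk '$1 !~ /^#/ && $2 == "/" { print $1; exit }' "${1:-/etc/fstab}"
}
```

After the reboot, comparing this with the output of findmnt -n -o SOURCE / shows whether the device named in fstab is really the one mounted as root.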

Reading your post richg, I have to say that I cannot exclude atm that the emmc was mounted first and then the setssdroot service from the emmc chroots…

Curious,
I wonder if the setssdroot.conf* exists in the emmc under /etc folder.
(*0 byte file used as flag in setssdroot.service in line ConditionPathExists=/etc/setssdroot.conf)

If you remove that file and reboot are you still seeing the SSD mounted on / or is it the emmc?
if you end up back on emmc just sudo touch /etc/setssdroot.conf

It indeed exists and when I remove it mount reports:
/dev/mmcblk0p1 on / type ext4 (rw,relatime,data=ordered)
This confirms that it still chroots from the emmc. I guess the root= boot parameter would need to be changed?
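
To see what the kernel was actually told, the root= value can be pulled out of the kernel command line. A minimal sketch, assuming the parameter appears as a plain root=... word; cmdline_root is a made-up name, and it prints the last occurrence on the assumption that a later parameter overrides an earlier one.

```shell
#!/bin/bash
# Hypothetical helper: print the value of root= from a kernel command
# line. Reads /proc/cmdline when no argument is given.
cmdline_root() {
    local cmdline="${1:-$(cat /proc/cmdline)}"
    # One word per line, keep only root=..., print the last match.
    printf '%s\n' "$cmdline" | tr ' ' '\n' | sed -n 's/^root=//p' | tail -n 1
}
```

If cmdline_root prints /dev/mmcblk0p1 while fstab names /dev/nvme0n1p1, that would be consistent with the eMMC still being mounted first.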