Failed to install custom kernels on Xavier for JetPack 4.6

I’m trying to install my custom kernel on Xavier with JetPack 4.6. I cross-compiled the kernel with LOCALVERSION set to -rtg-tegra. I copied the generated images Image-rtg and Image-rtg.sig to /boot and all the kernel modules to /lib/modules/4.9.253-rtg-tegra. I replaced /boot/dtb with the files generated with the kernel. Then I modified extlinux.conf to:

TIMEOUT 30
DEFAULT primary

MENU TITLE L4T boot options

LABEL primary
      MENU LABEL primary kernel
      LINUX /boot/Image-rtg
      INITRD /boot/initrd
      APPEND ${cbootargs} quiet root=/dev/mmcblk0p1 rw rootwait rootfstype=ext4 console=ttyTCU0,115200n8 console=tty0 fbcon=map:0 net.ifnames=0 rootfstype=ext4 

# When testing a custom kernel, it is recommended that you create a backup of
# the original kernel and add a new entry to this file so that the device can
# fallback to the original kernel. To do this:
#
# 1, Make a backup of the original kernel
#      sudo cp /boot/Image /boot/Image.backup
#
# 2, Copy your custom kernel into /boot/Image
#
# 3, Uncomment below menu setting lines for the original kernel
#
# 4, Reboot

LABEL backup
      MENU LABEL backup kernel
      LINUX /boot/Image
      INITRD /boot/initrd
      APPEND ${cbootargs}

But after reboot, uname -a still shows the original kernel. Did I do anything wrong? Thank you for the help.
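One way to narrow this down is to compare what the config file asks for with what is actually running. The following is a minimal sketch, not an NVIDIA tool: `default_kernel_path` is a hypothetical helper that reads an extlinux.conf-style file and prints the LINUX path of the entry named by DEFAULT. It only tells you what this particular file requests, not which copy of extlinux.conf the bootloader actually read.

```shell
#!/bin/sh
# Hypothetical helper: print the LINUX path of the DEFAULT entry
# in an extlinux.conf-style file.
# Usage: default_kernel_path [path-to-extlinux.conf]
default_kernel_path() {
    conf="${1:-/boot/extlinux/extlinux.conf}"
    awk '
        $1 == "DEFAULT" { want = $2 }              # remember the default label
        $1 == "LABEL"   { current = $2 }           # track the entry being read
        $1 == "LINUX"   { kernel[current] = $2 }   # kernel path for that entry
        END { if (want in kernel) print kernel[want] }
    ' "$conf"
}

# On the Jetson, compare the two:
#   default_kernel_path    # kernel image requested by the default entry
#   uname -r               # release string of the kernel that is running
```

If the file requests Image-rtg but `uname -r` does not end in -rtg-tegra, the bootloader is probably taking its kernel from somewhere other than this file.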

hello chenjiyang1991,

I’m not sure what your actual use case is for modifying the naming.
However, nvbuild.sh defines a default LOCALVERSION itself instead of parsing the environment variable settings.
For example,

function build_arm64_kernel_sources {
...
        "${MAKE_BIN}" -C "${source_dir}" ARCH=arm64 \
                LOCALVERSION="-tegra" \

Hi JerryChang,

Thank you for the reply. I think that’s the best way to distinguish and switch between the custom kernel and the original kernel? My understanding is that if I don’t change the LOCALVERSION, the custom kernel will also use modules in /lib/modules/4.9.253-tegra, which I need to overwrite every time I have a new custom kernel that uses the same LOCALVERSION.
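The kernel loads its modules from /lib/modules/ followed by its own release string, so a kernel built with a different LOCALVERSION needs its own directory there. A minimal sketch for checking that the directory exists is shown here; `modules_dir_status` is a hypothetical helper name, and the optional second argument is a rootfs prefix so the same check can be run on the host against Linux_for_Tegra/rootfs.

```shell
#!/bin/sh
# Hypothetical helper: report whether a module directory exists for a
# given kernel release.
# Usage: modules_dir_status [release] [rootfs-prefix]
modules_dir_status() {
    release="${1:-$(uname -r)}"   # default: the running kernel
    root="${2:-}"                 # default: the live root filesystem
    dir="$root/lib/modules/$release"
    if [ -d "$dir" ]; then
        echo "found $dir"
    else
        echo "missing $dir"
        return 1
    fi
}

# Examples:
#   modules_dir_status                       # check the running kernel
#   modules_dir_status 4.9.253-rtg-tegra     # check the custom kernel
```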

hello chenjiyang1991,

How about revising the build script to compile and create your kernel image?

I tried. I compiled the stock source code without changing anything using the following commands:

export TOP=/home/chen/nvidia/nvidia_sdk/JetPack_4.6_Linux_JETSON_AGX_XAVIER_TARGETS/Linux_for_Tegra
export TEGRA_KERNEL_OUT=$TOP/sources/kernel/kernel-4.9-out
export CROSS_COMPILE=$HOME/l4t-gcc/gcc-linaro-7.3.1-2018.05-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-
export LOCALVERSION=-tegra
export INSTALL_MOD_PATH=$TOP/rootfs

echo -e "\nBuilding Kernel"
make ARCH=arm64 O=$TEGRA_KERNEL_OUT tegra_defconfig
make ARCH=arm64 O=$TEGRA_KERNEL_OUT -j6

echo -e "\nReplacing Kernel Image and DTB"
cp -r $TEGRA_KERNEL_OUT/arch/arm64/boot/Image* $TOP/kernel
cp -r $TEGRA_KERNEL_OUT/arch/arm64/boot/dts/* $TOP/kernel/dtb/ 

echo -e "\nInstalling Kernel Modules"
sudo make ARCH=arm64 O=$TEGRA_KERNEL_OUT modules_install INSTALL_MOD_PATH=$INSTALL_MOD_PATH

Then I copied the generated Image, dtb, and 4.9.253-tegra to the board. I didn’t change the name of the primary kernel in /boot/extlinux/extlinux.conf.

sudo cp -r 4.9.253-tegra/ /lib/modules/
sudo cp Image /boot/
sudo cp -r dtb /boot/

But after reboot, the running kernel still reports a build time of Jul 26 12:19:28, not today.
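Since LOCALVERSION is unchanged here, the build timestamp is the only thing that separates the new kernel from the stock one. A small sketch for pulling it out of `uname -v` follows; `kernel_build_date` is a hypothetical helper, and the sample string in the comment is only illustrative of the usual format (the time zone and year are assumed, not taken from the logs in this thread).

```shell
#!/bin/sh
# Hypothetical helper: strip the leading "#N SMP ..." flags from a
# `uname -v` string, leaving only the build date.
# Usage: kernel_build_date [version-string]
kernel_build_date() {
    version="${1:-$(uname -v)}"
    # Drop everything up to the weekday name that starts the date.
    echo "$version" | sed -E 's/^.*(Mon|Tue|Wed|Thu|Fri|Sat|Sun) /\1 /'
}

# Example with an illustrative string:
#   kernel_build_date '#1 SMP PREEMPT Mon Jul 26 12:19:28 PDT 2021'
```

Running it with no argument on the board right after a reboot shows whether the freshly built Image was the one that actually got loaded.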

I think this should have worked, unless the kernel is being pulled from a partition (and that probably would not happen). If you interrupt boot with serial console and manually pick the entry with the “Image-rtg”, does anything change? What does the boot log from serial console show while this kernel is being picked manually, and slightly beyond the actual kernel load?

Just want to confirm… Should connecting the Xavier to a host with any microUSB cable be enough to see the ttyUSB devices? It doesn’t require any power, right? I tried all 4 microUSB cables I have, but it doesn’t seem to work. With a USB to TTL serial cable and a Jetson Nano, though, I can see the devices.

I was eventually able to use the microUSB cable to get the boot log. It turns out that, unlike the Jetson Nano (or TX2), the Xavier doesn’t show any info on the host unless it’s powered. Here is the boot log of the stock JetPack 4.6 rev3: stock_boot.log (92.3 KB). It seems like the board is looking for extlinux.conf on the NVMe. However, the NVMe is just plugged in and hasn’t been configured since I fully flashed the device. Does the board choose the NVMe for boot by default when it sees it? Thank you.

Yes, though some cables are of low quality and fail during flash, for serial console this usually works even with a cheap cable. No external power is required for the UART since it is in the Jetson itself (it’s rather convenient), which also means the AGX itself does need to be powered on.

Yes, alternate boot devices complicate things. I don’t recall what the order is, but the bootloader will search for a usable image (and extlinux.conf) in some order, and pick the first one from that order. If your NVMe has a “/boot”, then it is reasonable to expect boot would try to read this. Does it pick the right kernel if no NVMe is connected? In such a case it would revert to the eMMC content. But you could copy these to the NVMe and perhaps it would work:

  • /boot/Image-rtg
  • /boot/initrd
  • /boot/extlinux/extlinux.conf
  • Possibly required “/lib/modules/*”.
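The copy suggested in the list above could be sketched roughly as follows. `copy_boot_files` is a hypothetical helper, and /mnt/nvme in the usage comment is an assumed mount point for the NVMe partition, not something taken from this thread.

```shell
#!/bin/sh
# Hypothetical helper: copy the boot files listed above from one root
# to another.
# Usage: copy_boot_files <src-root> <dst-root> [kernel-name]
copy_boot_files() {
    src="$1"; dst="$2"; kernel="${3:-Image-rtg}"
    mkdir -p "$dst/boot/extlinux" || return 1
    cp "$src/boot/$kernel" "$dst/boot/" || return 1
    cp "$src/boot/initrd" "$dst/boot/" || return 1
    cp "$src/boot/extlinux/extlinux.conf" "$dst/boot/extlinux/" || return 1
    # Modules are only "possibly required", so copy them when present.
    if [ -d "$src/lib/modules" ]; then
        mkdir -p "$dst/lib" && cp -a "$src/lib/modules" "$dst/lib/" || return 1
    fi
}

# Example on the Jetson, run as root, with the NVMe partition mounted at
# the assumed path /mnt/nvme (an empty src-root means the live filesystem):
#   copy_boot_files "" /mnt/nvme Image-rtg
```

The function stops at the first failed copy, so a missing initrd or kernel image is reported by cp rather than silently skipped.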

hello chenjiyang1991,

Please also check the Cboot documentation for the boot scan sequence, for example, [Kernel Boot Sequence Using extlinux.conf].

Hi linuxdev, JerryChang,

Thank you for all the suggestions. I tried removing the NVMe, and the board boots into my custom kernel normally without any problem. However, this is not a permanent solution since I need the NVMe for extra storage. According to the Cboot doc, it searches the external NVMe before the internal eMMC. But in my case, it stops once it can’t find the appropriate file on the NVMe and never tries to search the eMMC. Is this the expected behavior? The NVMe is already formatted and has some files.

I also tried copying the entire /boot/ directory to the NVMe. However, it always reports some authentication/validation issues: nvme_7.log (94.1 KB). I just want to use the NVMe for extra storage and still boot the kernel from the internal eMMC. Is there anything I can do to make sure Cboot continues to check the internal eMMC after failing on the NVMe?

I’m using a Samsung 980 Pro (PCIe 4.0). I’m adding this because JetsonHacks suggests there might be issues with Gen4 drives.

I believe it searches for “/boot” on the NVMe, and if you were to rename this (for temporary disable) to something like “/backup_boot”, perhaps it would boot even with the NVMe. If it does, then you know for sure you need to edit the eMMC “/boot” for any changes.

if you were to rename this (for temporary disable) to something like “/backup_boot”, perhaps it would boot even with the NVMe.

Unfortunately, it doesn’t work. When I started working on the kernel, the NVMe was plugged in and there wasn’t a folder named /boot on the NVMe. I’m wondering if it is a bug that Cboot doesn’t look further and just stops at the NVMe?

It sounds like the AGX was flashed with parameters telling it to use the external boot device. The default flash command would be something like this (which is what JetPack/SDKM would do):
sudo ./flash.sh jetson-xavier mmcblk0p1

In the above the name “jetson-xavier” specifies part of what gets copied into “Linux_for_Tegra/rootfs/” prior to creating a boot image. The boot target (the “mmcblk0p1”) gives similar information which goes into the boot content directly inside the AGX’s eMMC. A “pointer” to boot content is how I would describe the part which goes directly into the eMMC for picking boot media.

Boot will start by looking for the content at that “pointer”, but can be modified by a search order which looks for other media. I don’t know if this is actually the flashed “pointer” causing looking first at the NVMe, or if it is the environment’s search order. However, have you flashed this AGX without the NVMe, and without telling it to look for external boot media? If you flashed without any special option to search something other than mmcblk0p1 content, then it is the environment search order; otherwise it is the content which was actually flashed causing the issue.
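To see which root device a given boot actually ended up with, the kernel command line can be inspected after boot. This is a minimal sketch; `root_from_cmdline` is a hypothetical helper. It prints the last root= value because the command line may contain more than one (for example one from ${cbootargs} and one from APPEND), and I am assuming the kernel honors the last occurrence.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the last root= argument on a
# kernel command line.
# Usage: root_from_cmdline [cmdline-string]
root_from_cmdline() {
    cmdline="${1:-$(cat /proc/cmdline)}"
    # Unquoted on purpose so the string is split into arguments.
    for arg in $cmdline; do
        case "$arg" in
            root=*) echo "${arg#root=}" ;;
        esac
    done | tail -n 1
}

# On the Jetson:
#   root_from_cmdline      # e.g. /dev/mmcblk0p1 for an eMMC rootfs
```

Note that this shows only the root filesystem choice. The kernel image itself may still have been loaded from a different device or partition, which is what the serial console log reveals.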

Thank you for the response.

I’ve flashed the Xavier several times using the SDKM in the past week to reset to stock state. I’ve also downloaded the JetPack 4.6 several times to make sure it’s stock and not modified by me.

However, all the flashing was done when the NVMe was plugged in. I’ll try to flash again without NVMe installed.

Update: I tried flashing the board without the NVMe. I verified I can switch to the custom kernel after flashing. Then I installed the NVMe, and now it’s using the stock kernel again. It’s loading from the partition directly after failing on the NVMe. Here’s the boot log: nvme_af.log (93.8 KB).

Please check the developer guide section on Flashing to an NVMe Drive.

In addition to the document mentioned by @JerryChang, consider that you can add another boot entry to the eMMC version of “/boot/extlinux/extlinux.conf”, and that entry can point to the NVMe. Make sure to have an alternate/extra entry instead of replacing the default entry so you have a fallback while testing. Then use serial console to pick the entry. If it works, then you could set that entry as the first/default entry.
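Adding an extra entry without touching the default one could look something like this sketch. `append_boot_entry` is a hypothetical helper; the label name and the root device /dev/nvme0n1p1 in the usage comment are assumptions for illustration, so check the actual device name on your board first.

```shell
#!/bin/sh
# Hypothetical helper: append an extra boot entry to an extlinux.conf-style
# file, leaving the existing entries (and DEFAULT) untouched.
# Usage: append_boot_entry <conf> <label> <kernel> <root-device>
append_boot_entry() {
    conf="$1"; label="$2"; kernel="$3"; rootdev="$4"
    # Refuse to add a label that already exists.
    if grep -Eq "^[[:space:]]*LABEL[[:space:]]+$label[[:space:]]*\$" "$conf"; then
        echo "label '$label' already present in $conf" >&2
        return 1
    fi
    # \${cbootargs} is escaped so it is written literally into the file.
    cat >> "$conf" <<EOF

LABEL $label
      MENU LABEL $label kernel
      LINUX $kernel
      INITRD /boot/initrd
      APPEND \${cbootargs} root=$rootdev rw rootwait rootfstype=ext4
EOF
}

# Example (back up the file first; device name is an assumption):
#   sudo cp /boot/extlinux/extlinux.conf /boot/extlinux/extlinux.conf.bak
#   append_boot_entry /boot/extlinux/extlinux.conf nvme /boot/Image-rtg /dev/nvme0n1p1
```

Because DEFAULT is left alone, the board keeps booting the original entry until the new one is picked manually from serial console.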

You won’t be able to have it automatically detect and boot directly from the NVMe while also switching back to eMMC when the NVMe is not detected. There is no true “BIOS” on embedded systems, and thus no ability to completely avoid naming an initial “pointer” to a boot device. You could flash to point first to the NVMe, but then the NVMe would always need to be present, and removal of the NVMe would cause boot failure. It is best to keep an initial flash pointing to the eMMC “/boot”, but have an entry within this point to content in the NVMe if you want both to work.

I’ve been testing both the Samsung 970 Evo and the 980 Pro. I haven’t tried flashing to NVMe directly. If the NVMe is brand new and doesn’t have any partition yet, the boot process is not affected. When there is a partition, I tried copying the entire /boot folder to the NVMe. With the 970 Evo, it works without problems for now and I can boot into my custom kernel. With the 980 Pro, this didn’t work at first, and Cboot didn’t seem to pick up the correct kernel name from extlinux.conf, as shown in log_1 (30.5 KB). After copying extlinux.conf separately, it used the correct kernel name and booted into the custom kernel: log_2 (31.4 KB). This is probably why I thought copying the entire /boot didn’t work at the beginning. I’ll investigate this further. But finally I can boot into my custom kernel with the NVMe installed.

