Unable to update kernel image with nv_update_engine

lukaspohl86 · December 15, 2020, 3:05pm

Hi All,
I am trying to modify a procedure that worked on L4T R32.2.1 in order to run an RT-patched kernel. Originally the procedure was run inside a Docker container (Dockerfile taken from the Medium article "Real-time kernel for NVIDIA AGX Xavier" by Roman), but I am using a host PC running Ubuntu instead.

Original procedure:

FROM ubuntu:bionic

RUN apt update \
    && apt install -y \
        aria2 \
        bc \
        build-essential \
        git-core \
        kmod \
        lbzip2 \
        python \
        sudo \
        xxd \
    && rm -rf /var/lib/apt/lists/*

# Download toolchain for cross-compilation.
RUN aria2c -x 10 https://developer.nvidia.com/embedded/dlc/l4t-gcc-7-3-1-toolchain-64-bit \
    && tar xf gcc-linaro-7.3.1-2018.05-x86_64_aarch64-linux-gnu.tar.xz \
    && rm -f gcc-linaro-7.3.1-2018.05-x86_64_aarch64-linux-gnu.tar.xz

# Download Jetson drivers and root filesystem.
RUN aria2c -x 10 \
        https://developer.nvidia.com/embedded/dlc/Jetson_Linux_R32.2.1 \
    && aria2c -x 10 \
        https://developer.nvidia.com/embedded/dlc/r32-2-1_Release_v1.0/TX2-AGX/Tegra_Linux_Sample-Root-Filesystem_R32.2.1_aarch64.tbz2 \
    && tar xpf Jetson_Linux_R32.2.1_aarch64.tbz2 \
    && cd Linux_for_Tegra/rootfs \
    && tar xpf ../../Tegra_Linux_Sample-Root-Filesystem_R32.2.1_aarch64.tbz2 \
    && rm -f Tegra_Linux_Sample-Root-Filesystem_R32.2.1_aarch64.tbz2 Jetson_Linux_R32.2.1_aarch64.tbz2

WORKDIR /Linux_for_Tegra

# Download kernel sources.
RUN ./source_sync.sh -k tegra-l4t-r32.2.1

ENV CROSS_COMPILE="/gcc-linaro-7.3.1-2018.05-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-"
ENV KERNEL_SRC_DIR="/Linux_for_Tegra/sources/kernel/kernel-4.9"
ENV KERNEL_OUT_DIR="/Linux_for_Tegra/kernel"
ENV ROOTFS_DIR="/Linux_for_Tegra/rootfs"

# Apply PREEMPT_RT patches
RUN cd ${KERNEL_SRC_DIR}/scripts && ./rt-patch.sh apply-patches && cd -

# Compile kernel and install modules.
RUN cd ${KERNEL_SRC_DIR} \
    && mkdir ${KERNEL_SRC_DIR}/build \
    && make ARCH=arm64 O=${KERNEL_SRC_DIR}/build tegra_defconfig \
    && make ARCH=arm64 O=${KERNEL_SRC_DIR}/build CROSS_COMPILE=${CROSS_COMPILE} -j4
RUN cd ${KERNEL_SRC_DIR} \
    && make ARCH=arm64 O=${KERNEL_SRC_DIR}/build INSTALL_MOD_STRIP=1 CROSS_COMPILE=${CROSS_COMPILE} INSTALL_MOD_PATH=${ROOTFS_DIR} modules_install

# Generate image for default kernel (w/o realtime patches) first
RUN ./apply_binaries.sh
RUN FAB=400 BOARDID=2888 FUSELEVEL=fuselevel_production BOARDSKU=0001 BOARDREV=H.0 ./build_l4t_bup.sh jetson-xavier mmcblk0p1
RUN cp bootloader/payloads_t19x/bl_update_payload bootloader/payloads_t19x/bl_update_payload_default

# Copy kernel and modules to final destination.
RUN tar -C ${ROOTFS_DIR} -cjf ${KERNEL_OUT_DIR}/kernel_supplements.tbz2 lib/modules \
    && cp ${KERNEL_SRC_DIR}/build/arch/arm64/boot/Image ${KERNEL_OUT_DIR}

# Prepare rootfs.
RUN ./apply_binaries.sh

# Generate OTA update image.
RUN FAB=400 BOARDID=2888 FUSELEVEL=fuselevel_production BOARDSKU=0001 BOARDREV=H.0 ./build_l4t_bup.sh jetson-xavier mmcblk0p1

This worked fine with R32.2.1. When I follow all the steps from the Dockerfile with updated sources (Tegra186_Linux_R32.4.4_aarch64.tbz2) and filesystem (Tegra_Linux_Sample-Root-Filesystem_R32.4.4_aarch64), I get no build errors. When I run nv_update_engine --install with the bl_update_payload in /opt/ota_package, it all goes fine, but I still get the old kernel after the reboot (nv_update_engine -v confirms that the updated slot booted successfully). It seems to me that the bl_update_payload does not contain the updated kernel. Does the kernel image need to be in some other directory instead of ${KERNEL_OUT_DIR} to be picked up by the ./build_l4t_bup.sh script? Is the bl_update_payload still used to update the kernel image?

Just to be sure here are the steps I am taking:

tar xpf Tegra186_Linux_R32.4.4_aarch64.tbz2
cd Linux_for_Tegra/rootfs and tar xpf ../../Tegra_Linux_Sample-Root-Filesystem_R32.4.4_aarch64.tbz2
./source_sync.sh -k tegra-l4t-r32.4.4
./rt-patch.sh apply-patches in kernel source dir
mkdir ${KERNEL_SRC_DIR}/build
make ARCH=arm64 O=${KERNEL_SRC_DIR}/build LOCALVERSION=-tegraRT tegra_defconfig
make ARCH=arm64 O=${KERNEL_SRC_DIR}/build LOCALVERSION=-tegraRT CROSS_COMPILE=${CROSS_COMPILE} -j12
make ARCH=arm64 O=${KERNEL_SRC_DIR}/build LOCALVERSION=-tegraRT INSTALL_MOD_STRIP=1 CROSS_COMPILE=${CROSS_COMPILE} INSTALL_MOD_PATH=${ROOTFS_DIR} modules_install
copy Image and modules:
tar -C ${ROOTFS_DIR} -cjf ${KERNEL_OUT_DIR}/kernel_supplements.tbz2 lib/modules
cp ${KERNEL_SRC_DIR}/build/arch/arm64/boot/Image ${KERNEL_OUT_DIR}
run ./apply_binaries.sh in Linux_for_Tegra dir
run FAB=400 BOARDID=2888 FUSELEVEL=fuselevel_production BOARDSKU=0004 BOARDREV=K.0 ./build_l4t_bup.sh jetson-xavier mmcblk0p1

Then I scp the payload from Linux_for_Tegra/bootloader/payloads_t19x/bl_update_payload and the modules from Linux_for_Tegra/kernel/kernel_supplements.tbz2 to the target board, untar the supplements, run the update, reboot, and get the month-old kernel again. Any ideas what needs to be checked to make sure I boot the new RT kernel?

Thanks,
Lukas

ShaneCCC · December 16, 2020, 5:59am

Why not just copy the Image to /boot/Image on the Xavier to apply the kernel Image?

lukaspohl86 · December 16, 2020, 12:00pm

Thanks for the reply. Directly replacing /boot/Image works, but I am still unsure what went wrong with nv_update_engine (wrong payload file "bl_update_payload"? missing argument for the Image update? misplaced Image inside the Linux_for_Tegra directory?). I have had a similar experience with the flash command, and I would like to make it work, since the goal is to isolate some of the cores and the recommended approach on AGX Xavier is to add bootargs to the flash command like this:
sudo ./flash.sh -k kernel -k kernel-dtb -C "isolcpus=2,3" jetson-xavier mmcblk0p1
I've tried the flash command several times, but when I check "/proc/cmdline" the "isolcpus" argument is missing and lscpu says that all CPUs are being used. The flash command on the host gives no errors, but it also doesn't seem to update the kernel Image at all.
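The check described above can be scripted on the board. This is only a minimal sketch: the helper name has_bootarg is made up for illustration, and it simply looks for a whole token in /proc/cmdline (or in a file passed as the second argument).

```sh
# has_bootarg NAME [FILE]
# Succeeds if FILE (default: /proc/cmdline) contains the boot argument NAME,
# either as a bare flag ("quiet") or with a value ("isolcpus=2,3").
# has_bootarg is a hypothetical helper name, not part of L4T.
has_bootarg() {
    name=$1
    file=${2:-/proc/cmdline}
    # Put every argument on its own line, then match the whole token name.
    tr ' ' '\n' < "$file" | grep -qE "^${name}(=|\$)"
}

# Example use on the board:
#   has_bootarg isolcpus && echo "isolcpus is set" || echo "isolcpus is missing"
#   uname -r    # shows which kernel release actually booted
```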

ShaneCCC · December 16, 2020, 12:21pm

Check the Image in the Linux_for_Tegra directory; that should reveal the root cause.
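One way to run that check on the host, as a rough sketch: the arm64 Image is an uncompressed binary, so the kernel banner string can be read straight out of it and the file can be compared against the freshly built one. The helper names image_version and same_image are made up for illustration; the paths in the usage comments are the variables from the first post.

```sh
# image_version FILE
# Print the first "Linux version <release>" string found in FILE.
# Hypothetical helper; relies on the banner being stored as plain text.
image_version() {
    LC_ALL=C grep -a -o 'Linux version [^ ]*' "$1" | head -n 1
}

# same_image A B
# Succeed only if the two files are byte-for-byte identical.
same_image() {
    cmp -s "$1" "$2"
}

# Example use on the host:
#   image_version "${KERNEL_OUT_DIR}/Image"
#   same_image "${KERNEL_SRC_DIR}/build/arch/arm64/boot/Image" "${KERNEL_OUT_DIR}/Image" \
#       || echo "kernel/Image is not the freshly built one"
```

If the release string printed here differs from what uname -r reports on the board after the update, the board is booting an Image from somewhere else.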

lukaspohl86 · December 17, 2020, 9:21am

I have following update on the problem:
For some reason, flashing specific partitions with "-k" ("-k kernel -k kernel-dtb") doesn't work for me. I checked the Partition Configuration documentation for AGX, and the table "Jetson AGX Xavier Series (P2888 and P2888-nnnn), sdmmc_user Partition" does contain the "kernel" and "kernel-dtb" partitions. Interestingly, the -C parameter also gets ignored when flashing this way. What worked for me was not using "-k" at all. I also passed the capital "-K" parameter with the kernel Image location (might not be needed) and the -C "isolcpus=2" parameter. This way the flashing process took a bit longer (the whole rootfs was flashed) but resulted in the correct kernel being loaded. The kernel bootarg is now present in "/proc/cmdline". I also checked the /sys/devices/system/cpu/isolated file and it did contain the correct isolated core numbers. So in the end I didn't find the reason the updater isn't working, and I don't know why the "-k" parameter doesn't work, but the workaround of flashing the whole rootfs worked for me.
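For reference, the full-flash invocation described above would look roughly like the output of the sketch below. The helper full_flash_cmd is made up for illustration and only prints the command so it can be inspected before running it by hand from the Linux_for_Tegra directory. The Image path kernel/Image is an assumption based on KERNEL_OUT_DIR from the first post.

```sh
# full_flash_cmd IMAGE BOOTARGS
# Print (do not run) a whole-system flash command with a custom kernel
# Image (-K) and extra kernel command line arguments (-C), without -k.
# Hypothetical helper; board and rootfs device are the ones used in this thread.
full_flash_cmd() {
    printf 'sudo ./flash.sh -K %s -C "%s" jetson-xavier mmcblk0p1\n' "$1" "$2"
}

# Example:
#   full_flash_cmd kernel/Image "isolcpus=2"
# Afterwards, on the board:
#   cat /proc/cmdline
#   cat /sys/devices/system/cpu/isolated
```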

ShaneCCC · December 17, 2020, 11:04am

If you flash with -k, you need to edit /boot/extlinux/extlinux.conf and remove the line below so that the Image from the partition is used. Otherwise you need to flash the whole system without -k.

LINUX /boot/Image
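As a sketch of that edit (the helper name drop_linux_line is made up for illustration), the line can be removed with sed while keeping a backup copy next to the original file:

```sh
# drop_linux_line CONF
# Delete every "LINUX /boot/Image" line from CONF and keep the original
# as CONF.bak. Hypothetical helper, not an L4T tool.
drop_linux_line() {
    sed -i.bak -E '/^[[:space:]]*LINUX[[:space:]]+\/boot\/Image[[:space:]]*$/d' "$1"
}

# Example use on the board (as root):
#   drop_linux_line /boot/extlinux/extlinux.conf
```

Restoring /boot/extlinux/extlinux.conf.bak brings back the default behaviour of booting /boot/Image from the rootfs.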

lukaspohl86 · December 18, 2020, 3:42pm

Thank you ShaneCCC for the help. I was able to confirm your advice. Unless I flash the whole system without the -k parameter, I need to modify the /boot/extlinux/extlinux.conf file (which I thought was unused by cboot). Now both approaches work: I can either use the update engine with the bl_update_payload file (from ./build_l4t_bup.sh) or flash just the kernel partition from the host machine. And since /boot/Image is evidently used by default, I can also just replace the Image and modify the boot parameters in extlinux.conf, per your first suggestion.
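For the last option, adding a boot parameter in extlinux.conf could be sketched as below. The helper add_bootarg is made up for illustration, and it assumes the file has an APPEND line (for example "APPEND ${cbootargs} quiet") that carries the kernel arguments.

```sh
# add_bootarg CONF ARG
# Append ARG to every APPEND line in CONF unless CONF already contains ARG.
# Keeps the original as CONF.bak. Hypothetical helper; ARG must not contain
# "|", "&" or backslashes because it is pasted into a sed expression.
add_bootarg() {
    conf=$1
    arg=$2
    if grep -qF -- "$arg" "$conf"; then
        return 0    # already present, nothing to do
    fi
    sed -i.bak -E "s|^([[:space:]]*APPEND[[:space:]].*)\$|\\1 ${arg}|" "$conf"
}

# Example use on the board (as root), followed by a reboot:
#   add_bootarg /boot/extlinux/extlinux.conf "isolcpus=2,3"
#   cat /sys/devices/system/cpu/isolated    # after reboot
```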