Flash Jetson Nano after kernel rebuild

rogerpease · November 13, 2020, 3:00pm

I have a 945-13450-0000-100 Jetson Nano Developer kit. It has two CSI-2 slots… The module is a P3448 1B0-13448-DAAA-B01. The Carrier board is a P3450. top reports 4 GB of memory. From what I’ve read I gather this is a B01 module.

If I flash it with SDKManager 1.3.1.7110 (JetPack 4.4.1) it works fine (selecting Jetson Nano Devkit, not the Jetson Nano or Jetson Nano 2GB). I put it into Force Recovery mode by shorting pins 9&10 on the Module. I am also using a bench power supply (5V, 10A capable).

I then downloaded the kernel source and built it on my Ubuntu machine:

JETSON_NANO_KERNEL_SOURCE=$(pwd)
TOOLCHAIN_PREFIX=$HOME/l4t-gcc/gcc-linaro-7.3.1-2018.05-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-
TEGRA_KERNEL_OUT=$JETSON_NANO_KERNEL_SOURCE/build 
KERNEL_MODULES_OUT=$JETSON_NANO_KERNEL_SOURCE/modules 
make -C kernel/kernel-4.9/ ARCH=arm64 O=$TEGRA_KERNEL_OUT LOCALVERSION=-tegra CROSS_COMPILE=${TOOLCHAIN_PREFIX} tegra_defconfig
make -C kernel/kernel-4.9/ ARCH=arm64 O=$TEGRA_KERNEL_OUT LOCALVERSION=-tegra CROSS_COMPILE=${TOOLCHAIN_PREFIX} -j8 --output-sync=target zImage
make -C kernel/kernel-4.9/ ARCH=arm64 O=$TEGRA_KERNEL_OUT LOCALVERSION=-tegra CROSS_COMPILE=${TOOLCHAIN_PREFIX} -j8 --output-sync=target modules 
make -C kernel/kernel-4.9/ ARCH=arm64 O=$TEGRA_KERNEL_OUT LOCALVERSION=-tegra CROSS_COMPILE=${TOOLCHAIN_PREFIX} -j8 --output-sync=target dtbs
make -C kernel/kernel-4.9/ ARCH=arm64 O=$TEGRA_KERNEL_OUT LOCALVERSION=-tegra INSTALL_MOD_PATH=$KERNEL_MODULES_OUT modules_install

Then, following some directions from RidgeRun, I attempted to flash the device:

JETSON_NANO_KERNEL_SOURCE=$PWD
export NVIDIA_SDK_MANAGER=$HOME/nvidia/nvidia_sdk/
export DTB=tegra210-p3448-0000-p3449-0000-b00.dtb

JETPACK_4_4_1=${NVIDIA_SDK_MANAGER}/JetPack_4.4.1_Linux_JETSON_NANO_DEVKIT/Linux_for_Tegra/

cd ${JETPACK_4_4_1}
cp $JETSON_NANO_KERNEL_SOURCE/build/arch/arm64/boot/Image kernel/
cp $JETSON_NANO_KERNEL_SOURCE/build/arch/arm64/boot/dts/${DTB} kernel/dtb/

sudo ./flash.sh jetson-nano-qspi-sd mmcblk0p1

What happens: After flashing, it reboots and goes to the Ubuntu setup (user name, timezone, etc.). All of that seems to go fine, but then it tries to bring up the normal user interface and that never starts (it says something about the window manager not being available and being unable to start the nvpmodel service).

I am guessing one of three things is wrong here:

1. I have the wrong DTB file selected.
2. I have the wrong flash programmer selected.
3. I have a mismatch between the kernel I built and the root filesystem that JetPack 4.4.1 is using. Both are very recent.

rpease@ubuntu:~/JetsonNano$ sha256sum public_sources.tbz2
201f8f55b6e02421d463b5dd17861e02027fe8a3939ee2a02fb2720f8e4446fd public_sources.tbz2

Any ideas?

Thanks,

Roger

linuxdev · November 13, 2020, 6:10pm

Did you set CONFIG_LOCALVERSION? Basically the command “uname -r” produces something like “4.9.140-tegra”, and the “4.9.140” is the software release of the kernel, while the “-tegra” is from the CONFIG_LOCALVERSION="-tegra" at the time of compile. The resulting uname -r is used by the kernel to find modules.

If your modules are in “/lib/modules/4.9.140-tegra/”, and you built a new kernel without CONFIG_LOCALVERSION, then it will try to find those modules in “/lib/modules/4.9.140/”. Unless modules are also installed at that new location, every module will be missing. If one of those modules is the video driver, and the system then enters a graphical mode that requires it, graphical mode will fail.
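
A quick way to look for this mismatch is to compare the kernel release string against the directories under /lib/modules. The following is only a sketch: the function name check_modules_dir and its optional root argument are made up for this example, and the root argument is there so the same check can be pointed at a mounted SD card.

```shell
# Hypothetical helper: report whether a module directory exists for a kernel release.
# $1 = kernel release (for example the output of `uname -r`)
# $2 = root of the filesystem to inspect (defaults to /, may be a mounted SD card)
check_modules_dir() {
    release="$1"
    root="${2:-/}"
    # Strip a trailing slash so "/" and "/mnt/sd/" both produce clean paths.
    dir="${root%/}/lib/modules/${release}"
    if [ -d "$dir" ]; then
        echo "found: $dir"
    else
        echo "missing: $dir"
        return 1
    fi
}

# Example on the Jetson itself:
#   check_modules_dir "$(uname -r)"
```

If this reports "missing", the running kernel has no module directory matching its own release string.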

CONFIG_LOCALVERSION is simple to set up since it has no dependencies. You could use a menu based config editor, or after you performed the make tegra_defconfig, you could edit the resulting .config directly and make sure you have this:
CONFIG_LOCALVERSION="-tegra"
(in your case if your carrier board uses a different suffix, then you’d adjust for that…the “-tegra” is the default for the NVIDIA-supplied use case)
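
As a rough illustration of the direct .config edit, something like the following could be used after make tegra_defconfig. The helper name set_localversion is hypothetical, and the sketch assumes GNU sed (as shipped with Ubuntu) for the in-place edit.

```shell
# Hypothetical helper: force CONFIG_LOCALVERSION in a kernel .config file.
# $1 = path to .config, $2 = suffix (for example -tegra)
# Assumes GNU sed for the -i option.
set_localversion() {
    config="$1"
    suffix="$2"
    if grep -q '^CONFIG_LOCALVERSION=' "$config"; then
        # Replace the existing assignment; CONFIG_LOCALVERSION_AUTO is not matched.
        sed -i "s|^CONFIG_LOCALVERSION=.*|CONFIG_LOCALVERSION=\"${suffix}\"|" "$config"
    else
        # No assignment present yet, so append one.
        printf 'CONFIG_LOCALVERSION="%s"\n' "$suffix" >> "$config"
    fi
    # Print the resulting line so the change can be confirmed by eye.
    grep '^CONFIG_LOCALVERSION=' "$config"
}

# Example, using the build directory from the first post:
#   set_localversion "$TEGRA_KERNEL_OUT/.config" -tegra
```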

rogerpease · November 15, 2020, 12:04am

Thanks for the reply. I hadn’t set CONFIG_LOCALVERSION="-tegra".

However, I added a step where I edit build/.config and set CONFIG_LOCALVERSION="-tegra" and that didn’t fix the issue.

I mounted the SD card and /lib/modules had a 4.9.140-tegra directory. So I symbolically linked 4.9.140 to it, and that didn’t resolve the issue. It still goes through the initial boot with many services starting, but reports a failure on “nvpmodel”.

Any other thoughts? Bad DTB? I’ve cleaned and reinstalled everything twice so it’s not a corrupted directory.

Thanks,

linuxdev · November 15, 2020, 6:53pm

Modules are compiled to know what their version is, and unless you’ve done something special to allow incorrect versioning to work, symbolic links won’t work. There are conditions under which they will work, but it sounds like this is not one of them. I suggest building the kernel again with its existing configuration, but first edit CONFIG_LOCALVERSION. Then replace that kernel.
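
The version a module was compiled for can be read with modinfo -F vermagic. The sketch below compares only the release field of that string against a kernel release; the helper name vermagic_matches is invented for illustration, and the kernel itself compares more of the string than this does.

```shell
# Hypothetical helper: does a module's vermagic string begin with the given kernel release?
# $1 = vermagic string, for example from `modinfo -F vermagic /path/to/some_module.ko`
# $2 = kernel release, for example from `uname -r`
vermagic_matches() {
    # The first space-separated field of vermagic is the kernel release.
    module_release="${1%% *}"
    [ "$module_release" = "$2" ]
}

# Example on the target (the module path is a placeholder):
#   vermagic_matches "$(modinfo -F vermagic /path/to/some_module.ko)" "$(uname -r)" \
#       && echo "release matches" || echo "release mismatch"
```

A module built for "4.9.140-tegra" will not pass this check against a kernel reporting "4.9.140", which is why a symbolic link between the two directories does not help.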

You can skip the “make tegra_defconfig” and other parts if you are running that kernel and copy the “/proc/config.gz” to your build area. You’d “gunzip config.gz”, then change its name to “.config”, and finally use an editor to alter the file:
CONFIG_LOCALVERSION="-tegra"
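
The config.gz step could be sketched roughly as follows. The helper name import_kernel_config is hypothetical, and the build directory is whatever was passed as O= during the build.

```shell
# Hypothetical helper: unpack a gzip-compressed kernel config into a build directory as .config.
# $1 = compressed config (for example a copy of /proc/config.gz from the running Jetson)
# $2 = build directory that should receive .config
import_kernel_config() {
    mkdir -p "$2"
    # -d decompresses, -c writes to stdout so the original file is left untouched.
    gzip -dc "$1" > "$2/.config"
}

# Example, after copying config.gz from the Jetson to the build host:
#   import_kernel_config ./config.gz "$TEGRA_KERNEL_OUT"
#   grep '^CONFIG_LOCALVERSION=' "$TEGRA_KERNEL_OUT/.config"
```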

Until you know your modules are correct you cannot really continue. The device tree (DTB file) is merely an abstract mechanism to pass arguments to drivers without compiling those hardware-dependent arguments directly into the source code. If the driver does not load, then the DTB can have no influence since there are no arguments to a missing driver. Bad driver version matching can cause “no driver”.

AnishAney · November 16, 2020, 1:17pm

I concur with @linuxdev here. You do need the modules (and maybe an initrd) in order for the device to boot.

BTW, why are you trying to re-flash the system with the custom (or rebuilt) kernel?

1. Once you compile the kernel source on your development box as you have done now, scp over $KERNEL_MODULES_OUT/lib/modules/ to /lib/modules/ on the target system.
2. Build initrd.img using update-initramfs -c -k <new-kernel-version> on the target system.
3. scp over $JETSON_NANO_KERNEL_SOURCE/build/arch/arm64/boot/Image to the target system as /boot/custom-Image.
4. Edit the /boot/extlinux/extlinux.conf file to add your new entry. NVIDIA has provided a sample template for it in the same file.

You may not need DTB entry (FDT) as you are using the same board and peripherals. Replace LINUX entry with the path of /boot/custom-Image and INITRD entry with the path of the generated initrd.img-<new-kernel-version>. Reboot the system, and select the new entry when prompted with the options.
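
As an illustration of the extlinux.conf edit, the sketch below appends a second entry and reuses the APPEND line of the first existing entry so the kernel arguments stay the same. The function name add_custom_entry, the label "custom", and the menu text are all made up here; the template NVIDIA ships in the file is the better reference for the exact layout.

```shell
# Hypothetical helper: append a boot entry for a custom kernel to an extlinux.conf.
# $1 = path to extlinux.conf, $2 = kernel image path, $3 = initrd path
add_custom_entry() {
    conf="$1"
    image="$2"
    initrd="$3"
    # Copy the APPEND line of the first existing entry, minus its indentation.
    append_line=$(grep -m1 '^[[:space:]]*APPEND' "$conf" | sed 's/^[[:space:]]*//')
    [ -n "$append_line" ] || { echo "no APPEND line found in $conf" >&2; return 1; }
    # The label and menu text below are illustrative choices, not NVIDIA's.
    cat >> "$conf" <<EOF

LABEL custom
      MENU LABEL custom kernel
      LINUX ${image}
      INITRD ${initrd}
      ${append_line}
EOF
}

# Example on the target, as root and after backing up the file:
#   add_custom_entry /boot/extlinux/extlinux.conf /boot/custom-Image /boot/initrd.img-<new-kernel-version>
```

The original entry is left in place, so it remains selectable at boot if the new kernel fails.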

This way, you won’t have to worry about a bad kernel image being built, as you can always revert to the original kernel by selecting it on the next boot.

For the current situation, I would suggest booting back into the original kernel image i.e. re-flash the system with the SDKManager after removing the directory so that you have a working system again. Another solution could be to copy all the modules from $KERNEL_MODULES_OUT/lib/modules/ to ${JETPACK_4_4_1}/rootfs/lib/modules/, and then re-flash the system as you have been doing.

sudo ./flash.sh jetson-nano-qspi-sd mmcblk0p1
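
The module copy that would precede that flash command might look something like this. The helper name stage_modules is hypothetical; the rootfs directory is normally owned by root, so on a real Linux_for_Tegra tree the copy would need root privileges.

```shell
# Hypothetical helper: copy built modules into the rootfs that flash.sh will write.
# $1 = directory used as INSTALL_MOD_PATH for modules_install
# $2 = rootfs directory (for example ${JETPACK_4_4_1}/rootfs)
stage_modules() {
    src="$1/lib/modules"
    dst="$2/lib/modules"
    [ -d "$src" ] || { echo "no modules under $src" >&2; return 1; }
    mkdir -p "$dst"
    # Copy every release directory, preserving permissions and symlinks.
    cp -a "$src"/. "$dst"/
}

# Example, using the variables from the first post:
#   stage_modules "$KERNEL_MODULES_OUT" "${JETPACK_4_4_1}/rootfs"
```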

rogerpease · November 16, 2020, 11:40pm

Thanks for the notes. I am reflashing because I need to add an I2C device which isn’t discoverable. As a baseline I decided to recompile the raw kernel (good thing I did) to make sure I didn’t have any issues.

I went as far as reinstalling my PC with Ubuntu to avoid running any virtual environments, and I’m still having issues. Will dig into it as I get time.