Jetson kernel flags for cilium

I recently tried to install cilium on a k8s cluster with an Orin NX. It ultimately failed, due to the lack of a handful of flags (requirements). Would there be any interest in ensuring these are set upstream, or am I on my own to rebuild the kernel?

You probably should post this in the Orin NX forum:
https://forums.developer.nvidia.com/c/agx-autonomous-machines/jetson-embedded-systems/jetson-orin-nx/487

Normally one would build kernel modules if available, and if not, then the kernel and modules would need to be built. I don’t think Cilium is common enough that every Jetson should have this. Adding a driver is one of the more basic tasks for Linux development.

Building and installing is usually simpler than what the NVIDIA docs show (at least the installing part). This is because the docs tend to update the flash software, and then flash occurs; if directly adding a module it is a simple file copy. If adding a new base kernel, then it gets more complicated, but it does not normally require flashing. If you are interested in build or install information for this configuration you can ask and help is available.

You might be interested in saving a copy of your existing “/proc/config.gz”. This is not a real file, it is the kernel itself making available its configuration and pretending to be a file. If your kernel is stock, then this is the same as the kernel build target “tegra_defconfig” if you first gunzip a copy of that file and rename it “.config”.
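A minimal sketch of saving that configuration, assuming the running kernel exposes “/proc/config.gz”. The function name save_kernel_config and the destination path are hypothetical, chosen only for illustration:

```shell
# save_kernel_config [SRC] [DEST]
# Decompress a gzip'd kernel config (normally /proc/config.gz) into DEST.
# Runs in a subshell so its variables do not leak into the caller.
save_kernel_config() (
    src=${1:-/proc/config.gz}
    dest=${2:-$HOME/config-original}
    if [ ! -r "$src" ]; then
        echo "cannot read $src" >&2
        exit 1
    fi
    gzip -dc "$src" > "$dest"
)

# Example (paths are illustrative):
#   save_kernel_config /proc/config.gz ~/config-original
```

The saved copy is plain text, so it can be compared against a later build's “.config” with an ordinary diff.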

You would also need to write down the existing “uname -r” output. The prefix is the kernel source version, and the suffix is part of the kernel config at the time of compile; this is the CONFIG_LOCALVERSION, and this is not in the config.gz, and is not set via the tegra_defconfig build target. If you are building only modules, then you would match the existing CONFIG_LOCALVERSION (and the default is “-tegra”, so if your existing “uname -r” is “5.15.0-tegra”, then you’d use “-tegra” for CONFIG_LOCALVERSION). This in part determines where the kernel looks for its modules, and can affect whether the modules are accepted in the existing kernel.
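To make the prefix/suffix split concrete, here is a small sketch. It assumes the Jetson-style case where the source version itself contains no “-” (for example “5.10.120-tegra”); release-candidate or distro kernels with extra dashes would not split this cleanly. The function name split_release is hypothetical:

```shell
# split_release RELEASE
# Print the source-version prefix and the CONFIG_LOCALVERSION-style suffix
# of a "uname -r" string, one per line.
split_release() (
    release=$1
    version=${release%%-*}                    # text before the first "-"
    case $release in
        *-*) localversion=-${release#*-} ;;   # first "-" and everything after it
        *)   localversion= ;;                 # no suffix at all
    esac
    printf 'version=%s\nlocalversion=%s\n' "$version" "$localversion"
)

# Show the split for the running kernel:
split_release "$(uname -r)"
```

For “5.10.120-tegra” this prints “version=5.10.120” and “localversion=-tegra”.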

With the existing configuration matched, you would then make any edits via the make target of menuconfig (or I use nconfig which is very much the same, but it adds symbol search…all of those config.gz settings are symbols, as are the listed requirements).

Thanks for the details on upgrading! I’ve done some module additions here and there, but never rebuilt the kernel via menuconfig (nconfig).

The kernel flags that are missing to support cilium are…

CONFIG_NET_CLS_BPF=y
CONFIG_NET_SCH_INGRESS=y
CONFIG_CRYPTO_USER_API_HASH=y
CONFIG_CGROUP_BPF=y

CONFIG_NETFILTER_XT_TARGET_CT=m

With cilium becoming such a common CNI, I wonder if it’d be worth enabling? I’ll definitely try it myself, but just throwing it out there.
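For anyone wanting to check their own kernel against the five symbols listed above, a rough sketch follows. It only looks at the symbols named in this thread (not the full Cilium requirements), accepts either “=y” or “=m”, and expects a plain-text config such as a decompressed copy of “/proc/config.gz”. The function name check_cilium_flags is hypothetical:

```shell
# check_cilium_flags CONFIG_FILE
# Report which of the symbols from this thread are enabled in CONFIG_FILE.
# Exit status is 0 when all are present, 1 when any are missing.
check_cilium_flags() (
    config=$1
    missing=0
    for sym in CONFIG_NET_CLS_BPF CONFIG_NET_SCH_INGRESS \
               CONFIG_CRYPTO_USER_API_HASH CONFIG_CGROUP_BPF \
               CONFIG_NETFILTER_XT_TARGET_CT; do
        if grep -Eq "^${sym}=(y|m)$" "$config"; then
            echo "ok       $sym"
        else
            echo "missing  $sym"
            missing=1
        fi
    done
    exit "$missing"
)

# Example (paths are illustrative):
#   gzip -dc /proc/config.gz > /tmp/running.config
#   check_cilium_flags /tmp/running.config
```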

It is useful to know that the config being built will be in the file named “.config”. If you compile directly in your kernel source, the directory containing all of the kernel source can be referred to as the “top”, and has a Makefile there. The .config would also go there, but only if you are building inside of the source. When you use the command line to build with the “O=/some/where”, then it names another location for all of the temporary output and configuration. This latter is highly recommended.

The build target tegra_defconfig simply creates the default .config used for a shipping Jetson (but it does not set CONFIG_LOCALVERSION, which is normally set to “-tegra”; depending on circumstances you would either match this, or modify this).

Cross compile adds a few things to it for naming the architecture and cross tools, but otherwise is not much different. Official docs for cross compile are fairly easy to follow if you have the correct host PC version of Ubuntu (for cross compile or flash of an Orin you would be best using Ubuntu 20).

If we were to build natively on the Jetson, and not use any cross tools, then it would go something like this (assuming you have sufficient disk space…a lot is required) to build both the kernel Image (has all of the “=y” integrated components) and the kernel modules (has individual files for each “=m” module):

# Create a temporary empty output location, and set an environment variable
# to make it easier to name:
mkdir ~/kernel
cd ~/kernel
export TEGRA_KERNEL_OUT=`pwd`
# Verify:
echo $TEGRA_KERNEL_OUT

# We would also create a temporary empty location for installing modules to:
mkdir ~/modules
cd ~/modules
export TEGRA_MODULES_OUT=`pwd`
# Verify:
echo $TEGRA_MODULES_OUT

# Now go to the top of the kernel source, wherever that might be (it might
# differ from this, it is just an example):
cd /some/where/kernel/kernel-5.10

# For convenience:
export TOP=`pwd`
echo $TOP

# This assumes the directory of output is empty of previous configuration:
make O=$TEGRA_KERNEL_OUT tegra_defconfig

# Now is the part where we edit the configuration with a configuration editor,
# which understands dependencies:
make O=$TEGRA_KERNEL_OUT nconfig

# Note: I use nconfig, many people use menuconfig; they are the same, except
# nconfig has a symbol search. CONFIG_NET_CLS_BPF is an example of a symbol.
# The search does not require the leading "CONFIG_", so you could search for
# either "CONFIG_NET_CLS_BPF" or "NET_CLS_BPF".
# The search is also case insensitive, so lower case works too.

# For edit, you might need the package "libncurses5-dev" first; in that case:
sudo apt-get install libncurses5-dev

# If you use the "m" key to set a symbol as enabled, then it builds as a module.
# If you use the "y" key to set a symbol as enabled, then it alters the kernel
# Image file itself. You probably should stick to "=m" if it is available. If the feature
# cannot be a module, then the config editor will do nothing when using the
# "m" key; similar, if an integrated feature is not available, then the "y" key will
# do nothing. Try "m" first. If all features can be "m", it will simplify life. If not,
# then you might just use "y" on your additional features since you have to build
# modules and Image for that case.

# This is only needed to propagate configuration if we are not building Image,
# the main kernel, but other than taking time it also does not hurt even if we
# build Image; I only do this if not building Image:
make O=$TEGRA_KERNEL_OUT modules_prepare

# Now to build the main kernel Image of all of the integrated (non-module) features.
# Note that I am assuming your build platform has 12 CPU cores. Adjust this for the
# number of cores for a faster build (but using more RAM) on whatever platform
# you choose. This is the "-j 12" in what follows. For 6 cores, you would use "-j 6".
# You would skip building Image if you have only modules added. Performed from
# $TOP.
make O=$TEGRA_KERNEL_OUT -j 12 Image

# Now to build modules:
make O=$TEGRA_KERNEL_OUT -j 12 modules

# Now to place modules in an empty directory which mirrors how install will
# place them:
make O=$TEGRA_KERNEL_OUT INSTALL_MOD_PATH=$TEGRA_MODULES_OUT modules_install

# Image is already in the $TEGRA_KERNEL_OUT. Example:
cd $TEGRA_KERNEL_OUT
find . -name Image
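Regarding the “-j 12” in the script above: instead of hard-coding the core count, you could let the system report it. This is a sketch only; the JOBS variable name is my own, and on a Jetson with limited RAM you might still prefer a smaller number than what nproc reports:

```shell
# Number of CPU cores available to this shell; fall back to 1 if nproc is missing.
JOBS=$(nproc 2>/dev/null || echo 1)

# Only print the command here; paste it in at the Image/modules build steps.
echo "would run: make O=\$TEGRA_KERNEL_OUT -j $JOBS Image"
```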

There are a lot of variations on the above. For example, some options in kernel build are understood simply by setting the right environment variable, or some options can be added in the command line in different ways. The above did not assume cross compile.

The official docs on kernel build are for cross compile, and are fairly good if you remember the tegra_defconfig target as a default, and that you need to set CONFIG_LOCALVERSION. If you are not using a change in “=y” symbols, then you don’t need to build or install the kernel Image, and life is simplified. In that case you use the original CONFIG_LOCALVERSION (which implies the kernel Image will search for modules in the same place, and find both the old modules plus any new ones you add). Original is:
CONFIG_LOCALVERSION="-tegra"
(you can leave out the quote; in the .config file the editor adds quotes)

If you change the Image by setting or removing anything “=y”, then you should build everything, and change CONFIG_LOCALVERSION. Example:
CONFIG_LOCALVERSION=-cilium

Various official docs tell you how to replace the original kernel. Sometimes it is better to add a renamed alternate kernel, and a second boot entry to test. Or you could place a new Image as the default name, Image, but instead of overwriting the original, move it to a new name. Two examples:

cd /boot
mv Image Image-original
cp /some/where/you/have/it/Image .
# Alternate, keep the original kernel, and then later add a new boot entry for the alternate:
cd /boot
cp /some/where/you/have/it/Image Image-cilium

Orin changes some of the boot setup, so the instructions differ to add alternate boot entries. However, if you have your original Image available for selection, then it is much safer than assuming the new kernel will work and just overwriting it. It is good to always leave the original Image in some form or another (whether renamed or not).

Similar for the modules: Don’t remove the old module directory until you know you don’t need to boot the original Image. If all you’ve done is to create a new module with “=m”, then there is no need to even touch the original Image, and it is a simple file copy with very little risk or complication.

Once you have the build down you will want to start a new thread in the Orin NX forum:
https://forums.developer.nvidia.com/c/agx-autonomous-machines/jetson-embedded-systems/jetson-orin-nx/487

Be sure to mention that:

  • Whether you changed the Image or not; the equivalent is to state if you’ve changed any “=y” features (meaning you changed the integrated Image), or if you’ve only changed an “=m” feature (meaning you added or removed modules).
  • Mention which L4T release your Orin NX uses, e.g., the output of “head -n 1 /etc/nv_tegra_release”.
  • Mention (if you changed the Image) that you want to keep the original kernel as a backup, and want to create an alternate boot entry for testing the new kernel.
  • Mention that you don’t want to flash the entire system to add your new kernel or modules.
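For the L4T release item in the list above, a small sketch that turns the first line of “/etc/nv_tegra_release” into a short “major.revision” string. I am assuming the line looks like the usual “R35 (release), REVISION: 4.1, ...” form, with or without a leading “# ”; the function name l4t_release is hypothetical:

```shell
# l4t_release LINE
# Turn a line such as
#   "# R35 (release), REVISION: 4.1, GCID: ..."
# into "35.4.1". Prints nothing if the line does not match.
l4t_release() {
    printf '%s\n' "$1" |
        sed -n 's/^#\{0,1\} *R\([0-9][0-9]*\) (release), REVISION: \([0-9.][0-9.]*\),.*$/\1.\2/p'
}

# Example on a Jetson:
#   l4t_release "$(head -n 1 /etc/nv_tegra_release)"
```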

For cross compile, here is a brief addition to the original script above, but adding the architecture and cross tools (note that cross tools will start their name with “aarch64-linux-gnu-”, and most of the time they are in “/usr/bin”; this is my assumption):

# This is new for cross compile; don't use it in native compile:
export ARCH=arm64

# Assuming cross tools are as in my earlier note:
export CROSS_COMPILE=/usr/bin/aarch64-linux-gnu-

# Create a temporary empty output location, and set an environment variable
# to make it easier to name (use a new and empty directory):
mkdir ~/kernel
cd ~/kernel
export TEGRA_KERNEL_OUT=`pwd`
# Verify:
echo $TEGRA_KERNEL_OUT

# We would also create a temporary empty location for installing modules to:
mkdir ~/modules
cd ~/modules
export TEGRA_MODULES_OUT=`pwd`
# Verify:
echo $TEGRA_MODULES_OUT

# Now go to the top of the kernel source, wherever that might be (it might
# differ from this, it is just an example):
cd /some/where/kernel/kernel-5.10

# For convenience:
export TOP=`pwd`
echo $TOP

# Below this is where we now use $CROSS_COMPILE and $ARCH...but
# remember how I said that there are different ways sometimes to add
# options to the compile command? CROSS_COMPILE and ARCH, when
# exported as an environment variable, are examined by the config code,
# so those will be used in what follows based on those earlier settings
# of ARCH and CROSS_COMPILE.

# This assumes the directory of output is empty of previous configuration:
make O=$TEGRA_KERNEL_OUT tegra_defconfig

# Now is the part where we edit the configuration with a configuration editor,
# which understands dependencies:
make O=$TEGRA_KERNEL_OUT nconfig

# Note: I use nconfig, many people use menuconfig; they are the same, except
# nconfig has a symbol search. CONFIG_NET_CLS_BPF is an example of a symbol.
# The search does not require the leading "CONFIG_", so you could search for
# either "CONFIG_NET_CLS_BPF" or "NET_CLS_BPF".
# The search is also case insensitive, so lower case works too.

# For edit, you might need the package "libncurses5-dev" first; in that case:
sudo apt-get install libncurses5-dev

# If you use the "m" key to set a symbol as enabled, then it builds as a module.
# If you use the "y" key to set a symbol as enabled, then it alters the kernel
# Image file itself. You probably should stick to "=m" if it is available. If the feature
# cannot be a module, then the config editor will do nothing when using the
# "m" key; similar, if an integrated feature is not available, then the "y" key will
# do nothing. Try "m" first. If all features can be "m", it will simplify life. If not,
# then you might just use "y" on your additional features since you have to build
# modules and Image for that case.

# This is only needed to propagate configuration if we are not building Image,
# the main kernel, but other than taking time it also does not hurt even if we
# build Image; I only do this if not building Image:
make O=$TEGRA_KERNEL_OUT modules_prepare

# Now to build the main kernel Image of all of the integrated (non-module) features.
# Note that I am assuming your build platform has 12 CPU cores. Adjust this for the
# number of cores for a faster build (but using more RAM) on whatever platform
# you choose. This is the "-j 12" in what follows. For 6 cores, you would use "-j 6".
# You would skip building Image if you have only modules added. Performed from
# $TOP.
make O=$TEGRA_KERNEL_OUT -j 12 Image

# Now to build modules:
make O=$TEGRA_KERNEL_OUT -j 12 modules

# Now to place modules in an empty directory which mirrors how install will
# place them:
make O=$TEGRA_KERNEL_OUT INSTALL_MOD_PATH=$TEGRA_MODULES_OUT modules_install

# Image is already in the $TEGRA_KERNEL_OUT. Example:
cd $TEGRA_KERNEL_OUT
find . -name Image

Just to reiterate, official docs tell you how to replace the kernel and modules in the flash software, and how to flash to get the new content. There are simple ways to add content without flashing. The cross compile official docs are good and easy to follow, but you probably want to ask under the Orin NX forum about install steps after mentioning what you changed in the kernel.

All of this seems like a lot of detail, but the gist is that if you’ve only added modules, then you copy the modules to the correct place and you’re done (you’d take steps like reboot or use of depmod/modprobe to manually load modules). If you’ve done this once, and have your kernel source in place, it becomes trivial to add a driver.
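As a rough sketch of that “copy the modules to the correct place” step: this assumes the staging layout produced by modules_install with INSTALL_MOD_PATH (so “lib/modules/<release>” under the staging directory). The function name install_modules and its optional destination-root argument are hypothetical, added so it can be tried against a scratch directory first:

```shell
# install_modules STAGING_ROOT RELEASE [DEST_ROOT]
# Copy STAGING_ROOT/lib/modules/RELEASE into DEST_ROOT/lib/modules/,
# merging with any module directory of the same release already there.
install_modules() (
    staging=$1
    release=$2
    dest_root=${3:-/}
    src="$staging/lib/modules/$release"
    [ -d "$src" ] || { echo "no modules at $src" >&2; exit 1; }
    mkdir -p "$dest_root/lib/modules"
    cp -a "$src" "$dest_root/lib/modules/"
)

# Example (dry run into a scratch directory):
#   install_modules "$TEGRA_MODULES_OUT" 5.10.120-tegra /tmp/scratch-root
```

On the real system the copy into “/lib/modules” needs root, and you would follow it with depmod (or a reboot) before trying modprobe.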


Amazing writeup! Thank you again, @linuxdev =)

Finally getting around to trying this. Got hit with some errors.

dudo@jetson1:/usr/src/linux-headers-5.10.120-tegra-ubuntu20.04_aarch64/kernel-5.10$ make O=$TEGRA_KERNEL_OUT tegra_defconfig
make[1]: Entering directory '/home/dudo/kernel'
***
*** The source tree is not clean, please run 'make mrproper'
*** in /usr/src/linux-headers-5.10.120-tegra-ubuntu20.04_aarch64/kernel-5.10
***
make[1]: *** [/usr/src/linux-headers-5.10.120-tegra-ubuntu20.04_aarch64/kernel-5.10/Makefile:577: outputmakefile] Error 1
make[1]: Leaving directory '/home/dudo/kernel'
make: *** [Makefile:213: __sub-make] Error 2

So I did what it asked…

dudo@jetson1:/usr/src/linux-headers-5.10.120-tegra-ubuntu20.04_aarch64/kernel-5.10$ sudo make mrproper
  CLEAN   arch/arm64/kernel/vdso
  CLEAN   scripts/basic
  CLEAN   scripts/dtc
  CLEAN   scripts/genksyms
  CLEAN   scripts/kconfig
  CLEAN   scripts/mod
  CLEAN   scripts/selinux/genheaders
  CLEAN   scripts/selinux/mdp
  CLEAN   scripts
  CLEAN   include/config include/generated arch/arm64/include/generated .config .config.old Module.symvers

Which… seems to have deleted my .config file? Feels like that was a mistake…

Upon running make again, I seem to be missing some drivers.

dudo@jetson1:/usr/src/linux-headers-5.10.120-tegra-ubuntu20.04_aarch64/kernel-5.10$ make O=$TEGRA_KERNEL_OUT tegra_defconfig
make[1]: Entering directory '/home/dudo/kernel'
  GEN     Makefile
  HOSTCC  scripts/basic/fixdep
  HOSTCC  scripts/kconfig/conf.o
  HOSTCC  scripts/kconfig/confdata.o
  HOSTCC  scripts/kconfig/expr.o
  LEX     scripts/kconfig/lexer.lex.c
  YACC    scripts/kconfig/parser.tab.[ch]
  HOSTCC  scripts/kconfig/lexer.lex.o
  HOSTCC  scripts/kconfig/parser.tab.o
  HOSTCC  scripts/kconfig/preprocess.o
  HOSTCC  scripts/kconfig/symbol.o
  HOSTCC  scripts/kconfig/util.o
  HOSTLD  scripts/kconfig/conf
drivers/video/Kconfig:27: can't open file "drivers/video/tegra/Kconfig"
make[2]: *** [/usr/src/linux-headers-5.10.120-tegra-ubuntu20.04_aarch64/kernel-5.10/scripts/kconfig/Makefile:89: tegra_defconfig] Error 1
make[1]: *** [/usr/src/linux-headers-5.10.120-tegra-ubuntu20.04_aarch64/kernel-5.10/Makefile:633: tegra_defconfig] Error 2
make[1]: Leaving directory '/home/dudo/kernel'
make: *** [Makefile:213: __sub-make] Error 2

I found some references to source_sync.sh on the forums, which I downloaded, but I’m not sure where to run it, or if I need to pass a specific tag. Am I on the right track, or should I just cherry pick drivers from Driver Package (BSP) Sources? I just dug through those files, and kernel_src/kernel/kernel-5.10/drivers/video doesn’t have a tegra dir? There’s a Makefile with obj-y += tegra/, but this is the first time I’ve seen a fragment like this, and I’m not sure how to use it.

Any help is appreciated. Thank you!

Do you have the full source at that location, and not just headers? If so, then you can run “sudo make mrproper” as a starting point, and then, when used, add the compile using “O=$TEGRA_KERNEL_OUT” (which points somewhere not owned by root). The goal is to keep the source tree itself pristine in most cases. There are some exceptions though when you are using this from outside source code (outside source code for modules needs to build against a kernel which is configured to match the running system; when treated as just headers things can be different than when treated as full source).
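One rough way to tell full source from headers is to look for core C files. This is only a heuristic and an assumption on my part: a full kernel tree carries files such as “init/main.c” and “kernel/fork.c”, while a headers-only package normally ships headers, Makefiles, Kconfig files, and scripts. The function name looks_like_full_source is hypothetical:

```shell
# looks_like_full_source DIR
# Heuristic check: succeed only if DIR contains core kernel C files
# that a headers-only tree would normally lack.
looks_like_full_source() {
    [ -f "$1/init/main.c" ] && [ -f "$1/kernel/fork.c" ]
}

# Example (the path is illustrative):
#   if looks_like_full_source /some/where/kernel/kernel-5.10; then
#       echo "full source"
#   else
#       echo "probably headers only"
#   fi
```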

I’m not sure how to know if it’s full source vs headers. I originally flashed the Jetson via these commands from a ubuntu 20.04 box.

cd ~  
tar xpf Downloads/Jetson_Linux_R35.4.1_aarch64.tbz2
sudo tar xpf Downloads/Tegra_Linux_Sample-Root-Filesystem_R35.4.1_aarch64.tbz2 -C Linux_for_Tegra/rootfs/
sed -i 's/cvb_eeprom_read_size = <0x100>/cvb_eeprom_read_size = <0x0>/g' Linux_for_Tegra/bootloader/t186ref/BCT/tegra234-mb2-bct-misc-p3767-0000.dts

cd Linux_for_Tegra/  
sudo ./apply_binaries.sh  
sudo ./tools/l4t_flash_prerequisites.sh
sudo ./tools/l4t_create_default_user.sh -u dudo -p topsecretpasswd -a -n jetson1
sudo ./tools/kernel_flash/l4t_initrd_flash.sh --external-device nvme0n1p1 -c tools/kernel_flash/flash_l4t_external.xml -p "-c bootloader/t186ref/cfg/flash_t234_qspi.xml" --showlogs --network usb0 jetson-orin-nano-devkit internal

Well, technically, I installed 35.3.1, and updated to 35.4.1 in place via this.

Anyway, I ran what you said beforehand, from $TOP

sudo make mrproper
sudo make O=$TEGRA_KERNEL_OUT tegra_defconfig

Which errors out with

drivers/video/Kconfig:27: can't open file "drivers/video/tegra/Kconfig"
make[2]: *** [/usr/src/linux-headers-5.10.120-tegra-ubuntu20.04_aarch64/kernel-5.10/scripts/kconfig/Makefile:89: tegra_defconfig] Error 1
make[1]: *** [/usr/src/linux-headers-5.10.120-tegra-ubuntu20.04_aarch64/kernel-5.10/Makefile:633: tegra_defconfig] Error 2
make[1]: Leaving directory '/home/dudo/kernel'
make: *** [Makefile:213: __sub-make] Error 2

The reason I mentioned full headers is because this is normally (but not always) the case when I see:

The source tree is not clean, please run 'make mrproper'

Headers can also be configured, but most of the time configuration only occurs in full source. The mrproper comment is still valid.

The manual unpacking and install should work. This is provided that (A) the original update from R35.3.1 to R35.4.1 worked correctly, and (B) any source code for the kernel and installer commands are also from R35.4.1 (and I see the installer is itself from R35.4.1). The only part I don’t know yet, but which might have an effect, is where you downloaded the kernel source from? It should be R35.4.1 kernel source downloaded from NVIDIA.

This latter point is mentioned because I don’t see where you downloaded or unpacked the source, but even so, I do see a configuration issue. Plus, the error you found here:

drivers/video/Kconfig:27: can't open file "drivers/video/tegra/Kconfig"

…basically suggests that the source is not from NVIDIA. Where did you get the content in “$TOP”? Is this cross compile or native compile (you are installing from a desktop host PC, but I don’t know where the compile is from; perhaps it is from the same host PC)?

Note: Here is the location to find source against your R35.4.1:
https://developer.nvidia.com/linux-tegra

Source is from the drivers section of Jetson Linux 35.3.1 | NVIDIA Developer

As for compilation, It was done on a 20.04 ubuntu box (not VM) with the Jetson connected via USB.

I don’t know what difference there is between the 35.3.1 and 35.4.1 kernel source, but there are typically a lot of kernel fixes between releases for Orin in recent times, so I suggest using R35.4.1 source.

Is the cilium from that source? It shouldn’t be using “/usr/src/linux-headers-5.10.120-tegra-ubuntu20.04_aarch64” if it is directly from the kernel source in a full source install. I could see a mistake occurring in cross-compile if cross tools were not used. Are you exporting “ARCH”? If so, is it “ARCH=arm64”? What is your “CROSS_COMPILE” value? What do you see from:
${CROSS_COMPILE}gcc --version

ARCH and CROSS_COMPILE are unset. There is no cilium yet - not installing it until the flags are enabled.

I was running these commands on the Orin itself, so not following the cross-compile steps you gave. I’m kind of confused there. I suppose I installed it as cross-compile originally, since it was an x86_64 ubuntu box loading code onto an arm64 ubuntu Orin box? Does that matter for how I should run things directly on the Orin now?

~$ ${CROSS_COMPILE}gcc --version
gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0

If you do not cross compile, then you are correct to not set those. Sometimes people use ARCH when natively compiling, and this would be incorrect and cause a failure. However, if you have the full source installed natively on the Jetson, then it should never try to use the separate header location (during install it might write a symbolic link pointing there, but it would not read, nor modify, “/usr/src”).

If you copied source from a host PC to the Jetson, and there was some configuration left over, then this would cause such an issue. Can you try again after:

  • Unpacking R35.4.1 source to the Jetson (you can unpack the outer package containing the source archive on a host PC, but leave the actual kernel source as a single archive before copying it to the Jetson; or just do all of that on the Jetson, but it takes a lot of disk space).
  • During unpack do so as root (use sudo). As root (sudo), at the “$TOP” of the kernel source, run “sudo make mrproper”.
  • Do not set ARCH, and do not set CROSS_COMPILE.
  • Try the build as a non-root (regular) user with “O=/some/where” in all steps (and that location is writable by your regular user; this is “$TEGRA_KERNEL_OUT”, which would be named in build lines as “O=$TEGRA_KERNEL_OUT”).
  • The first actual build command would be (if $TEGRA_KERNEL_OUT is set; no sudo):
    make O=$TEGRA_KERNEL_OUT tegra_defconfig

If the above fails, then create a log:
make O=$TEGRA_KERNEL_OUT tegra_defconfig 2>&1 | tee log_defconfig.txt

Post that log if it fails.
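Before a native build it can help to confirm that neither variable from the list above is lingering in the shell. A minimal sketch, with a hypothetical function name check_native_env; it treats “set but empty” as set, which is stricter than strictly necessary:

```shell
# check_native_env
# Succeed only when neither ARCH nor CROSS_COMPILE is set in this shell.
check_native_env() {
    status=0
    if [ -n "${ARCH+set}" ]; then
        echo "ARCH is set to '$ARCH'; unset it for a native build" >&2
        status=1
    fi
    if [ -n "${CROSS_COMPILE+set}" ]; then
        echo "CROSS_COMPILE is set to '$CROSS_COMPILE'; unset it for a native build" >&2
        status=1
    fi
    return "$status"
}

# Example:
#   check_native_env && make O=$TEGRA_KERNEL_OUT tegra_defconfig
```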

It worked!! I struggled a bit on how to get the source code, then a bit more on where to put it. I tried replacing files ad hoc for a few attempts until I made the connection that I could (and should) copy in the entire source. mrproper (I can’t help but think of a cheap rip off of drpepper every time I say that, lol) stopped working once I updated the source. Something about phoenix source or something? My terminal history doesn’t go back that far after making image/modules.

This is what ended up working for me.

mkdir ~/kernel
cd ~/kernel
export TEGRA_KERNEL_OUT=`pwd`

mkdir ~/modules
cd ~/modules
export TEGRA_MODULES_OUT=`pwd`

export TOP=/usr/src/linux-headers-5.10.120-tegra-ubuntu20.04_aarch64/kernel-5.10

cd ~
vim source_sync.sh
chmod +x source_sync.sh
./source_sync.sh -h
./source_sync.sh -k -t jetson_35.4.1
sudo cp -r ~/sources/kernel/* /usr/src/linux-headers-5.10.120-tegra-ubuntu20.04_aarch64

cd $TOP
make O=$TEGRA_KERNEL_OUT tegra_defconfig
make O=$TEGRA_KERNEL_OUT nconfig
make O=$TEGRA_KERNEL_OUT -j 8 Image
make O=$TEGRA_KERNEL_OUT -j 8 modules
make O=$TEGRA_KERNEL_OUT INSTALL_MOD_PATH=$TEGRA_MODULES_OUT modules_install

nconfig is not for the faint of heart… I see what you mean, though, the symbol search is necessary. Took me an embarrassing amount of time to figure out space bar cycled the values, instead of pressing Y or M.

I posted over in the other forum, per your suggestion, and was told to rtfm. Please send me your tip jar info, @linuxdev. You’re actually giving support, and I appreciate you.

I did some googling, and this looks compelling. This seems to be less Jetson specific, and more just u-boot, eh?

Could you help me get over the finish line, here?

source_sync.sh (11.0 KB)

The nice thing is that once you’ve done this the first time, you have the source, and the next time everything is less confusing. I would not mind seeing nconfig get an improved symbol search that can navigate to that symbol.

I’m not sure which part this is for:

Do you mean with kernel install? If so, then this changes depending on a lot of things. Is this an Orin NX dev kit? Or third party carrier board from RidgeRun? If from NVIDIA, having an eMMC model changes things compared to an SD card model; if the carrier board differs, then some other details might differ. Incidentally, there are also differences between L4T R32.x and L4T R35.x+ (see “head -n 1 /etc/nv_tegra_release” to find the L4T release) due to migrating to UEFI boot (which might change how “/boot/extlinux/extlinux.conf” needs to be edited).

Some details which might or might not matter: eMMC models have content equivalent to the BIOS, plus the boot content, in partitions. SD card models have that same content in QSPI memory on the module itself since there is no eMMC (there might be some QSPI on some eMMC models as well, and I’m not clear on what that is used for when present on eMMC models). When you flash as a whole, you are also flashing bootloader and BIOS (the equivalent). Changing something related to boot requires knowing details about the model and the L4T release.

The gist is that you can usually leave the original kernel Image in place, and add a second boot entry if you are adding a new Image. That makes testing safer. On the other hand, if you are only adding a module, then it is trivial since it is just a mostly risk-free file copy. If you only modified by adding a module, then you are almost done. If you modified the kernel Image (which is what loads modules), then you’ll need to install that and all new modules. For the former case of no Image change, you want your CONFIG_LOCALVERSION to remain unchanged, but in the latter case of a change to the Image, you want a new CONFIG_LOCALVERSION if you are going to add an alternate boot entry.

CONFIG_LOCALVERSION is appended to the source code version when you run the command “uname -r”. “uname -r” is part of the search path the Image uses to find modules. Modules are in a subdirectory of:
/lib/modules/$(uname -r)/kernel

The default CONFIG_LOCALVERSION is “-tegra”. If your build is just to add a module, then this is what you use:
CONFIG_LOCALVERSION="-tegra"

I will point out that in the “.config” file this is one of the few parameters which have no dependency. You could symbol search and set “-tegra” with nconfig, or you could directly edit the .config. I will warn you though that if your .config is not exactly what you want, and you edit this, that you should completely delete your temporary output location and start from scratch.
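If you go the direct-edit route, something like the following sketch would do it. It assumes a plain-text “.config” in your output directory and rewrites only the CONFIG_LOCALVERSION line (CONFIG_LOCALVERSION_AUTO is left alone). The function name set_localversion is hypothetical:

```shell
# set_localversion CONFIG_FILE SUFFIX
# Replace (or add) the CONFIG_LOCALVERSION line in CONFIG_FILE.
set_localversion() (
    config=$1
    suffix=$2
    tmp=$(mktemp) || exit 1
    # Drop any existing assignment, then append the new one.
    sed '/^CONFIG_LOCALVERSION=/d' "$config" > "$tmp"
    printf 'CONFIG_LOCALVERSION="%s"\n' "$suffix" >> "$tmp"
    cat "$tmp" > "$config"      # write in place, keeping the file's owner and mode
    rm -f "$tmp"
)

# Example (module-only build, matching the stock suffix):
#   set_localversion "$TEGRA_KERNEL_OUT/.config" -tegra
```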

If you are going to modify the Image itself (change of a symbol with the “=y” instead of “=m”), things get more complicated. You’d want to build with a CONFIG_LOCALVERSION using something custom. An example might be any of:

CONFIG_LOCALVERSION=-cilium
CONFIG_LOCALVERSION=-tegra-modified
CONFIG_LOCALVERSION=-custom

(anything different)

Then you might rename the kernel Image as something related, e.g., one of these examples:

Image-cilium
Image-tegra-modified
Image-custom

That way you keep the original entry. But I don’t know:

  • What your hardware is (SD card dev kit versus eMMC module).
  • I don’t know the L4T release (“head -n 1 /etc/nv_tegra_release”).
  • I don’t know if this is a third party carrier board.
  • And I especially don’t know if you modified the Image or just added a module.

Incidentally, on eMMC models there are multiple possible sources for the kernel Image. One is the file pointed at in extlinux.conf, and another is in a kernel partition. One is added via flash, the other is just a file copy (I shouldn’t say “just” because it is a lot of file copies for Image). The partition version is signed during a flash, and this works even when security fuses have been burned. A separate Image file takes precedence unless security fuses are burned (at which point much of the content only works from signed partitions).

Incidentally, U-Boot has not been used for some time. Its last use was in the earlier releases of L4T R32.x, but this then was removed, and CBoot had U-Boot features merged directly into CBoot (or at least a subset of features). In R34.x+ all boot content was migrated to a UEFI boot (which is a big help for the future I think).

Let me know some details of what you need to accomplish and I’ll try to help.

Sadly, I have no tip jar. I do have an emoji jar! 😁

DM me your venmo or something. I owe you a drink! 🍻

I think I only have the kernel install left to do. Replacing /boot/Image seems too easy, though… I have a $TEGRA_KERNEL_OUT directory with a new Image ready to go.

To answer your questions:

  • This is an Orin NX 16gb with everything installed on an NVMe SSD (via this guide)
  • R35 (release), REVISION: 4.1, GCID: 33958178, BOARD: t186ref, EABI: aarch64, DATE: Tue Aug 1 19:57:35 UTC 2023
  • Third party Turing Pi 2 carrier board
  • I changed 4 features to Y, and 1 to M, so the Image was/is modified

This would be an eMMC model, but a custom device tree. Either way, I must warn you that I have not tested an external drive install, so I am guessing about parts of this. There are a lot of unknowns I’m guessing on, so I could end up confusing you more than helping. I can describe some concepts, though I fear I’ll just confuse you.

I’ll simplify and state part of this ahead of time: You’ve changed the “=y” features, so you need to install both kernel Image and all modules. You’ll want a new CONFIG_LOCALVERSION, resulting in a new “uname -r” and module search path (“/lib/modules/$(uname -r)/kernel”). The difficult choice will be between overwriting the existing kernel (modules won’t overwrite because the search path will have changed) or adding a new boot entry (the latter is preferred).
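To spell out how the new CONFIG_LOCALVERSION feeds into names and paths, here is a small sketch. The “Image<suffix>” file name is only the naming convention suggested in this thread, not something the kernel or bootloader requires, and the function name kernel_paths is hypothetical:

```shell
# kernel_paths VERSION LOCALVERSION
# Print the expected "uname -r", a suggested renamed Image path, and the
# module directory for a given source version and CONFIG_LOCALVERSION.
kernel_paths() (
    release="$1$2"
    printf 'uname_r=%s\n' "$release"
    printf 'image=/boot/Image%s\n' "$2"
    printf 'modules=/lib/modules/%s/kernel\n' "$release"
)

kernel_paths 5.10.120 -cilium
```

For the example call this prints “uname_r=5.10.120-cilium”, “image=/boot/Image-cilium”, and “modules=/lib/modules/5.10.120-cilium/kernel”.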

The third party carrier board makes my help even less relevant since the tools I work with are all the NVIDIA tools, and third parties tend to modify flash tools. I’ll try to describe choices you might be working with, but I don’t think I can do a very good job for this particular hardware. I’m going to guess that you’ll need to start a new thread on the install part, and maybe ask the third party manufacturer (especially with regard to adding new kernel boot entries without removing the original), but hopefully this gives a starting point. Everything which follows is more or less random background, and not direct instructions (which I have no way to provide).

eMMC model Jetsons tend to have a requirement that flash provides some initial content which brings up clocks and power rails (the part which is equivalent to a BIOS). Eventually it points to the “/boot” content of the eMMC, at least most of the time. Then the extlinux.conf of the rootfs partition tends to name its own partition for rootfs in a direct load. Alternatively, that extlinux.conf can name another rootfs. If your kernel Image (preferably renamed after the new CONFIG_LOCALVERSION) is on the eMMC, then not much needs to be modified. You’d also typically install all of the new modules (because of the new CONFIG_LOCALVERSION, which in turn is because you don’t want to reuse modules after a modification of the “=y” options) to “/lib/modules/$(uname -r)/kernel”.

One complication to the above is that if you use an initial ramdisk (initrd), then typically a subset of the modules is also added to the initrd, since the initrd is usually an adapter for loading the next filesystem. If the kernel has all content needed to load that filesystem, then really no modules are needed in the initrd (and in fact the initrd probably isn’t needed). Examples of cases which might need a subset of modules in the initrd: a logical volume manager; encryption; a filesystem type not directly supported in the Image. It is possible that you are using an initrd to load the NVMe, not due to drivers, but purely as a way to abstract the pivot_root (you will tend to see references to “Linux_for_Tegra/tools/kernel_flash/l4t_initrd_flash.sh” when working with external media). When you don’t have an actual BIOS, an initrd can take over part of the optional behavior.

That chain might mean that the Image, using a modified name, is needed in more than one place. (I am assuming you are adding a new boot entry and not replacing the old one. Replacing is less complicated if you don’t care about saving the old Image, but if something goes wrong and you still have the old Image, then it serves as a rescue without reflashing; modules don’t generally have that risk.) Let’s say you’ve used “CONFIG_LOCALVERSION=-cilium”, and now your “uname -r” will be “5.10.120-cilium”; then you might name the kernel “Image-cilium”. Your modules will need to be located at “/lib/modules/5.10.120-cilium/kernel”. But on which device? A checklist follows.

It does not hurt to install to too many devices. This can in fact work out well in the future if some device fails and you end up with the ability to boot to a different device, or just want a reference copy. Here are some places where the Image-cilium might be needed:

  • The eMMC “/boot”.
  • The NVMe “/boot”.
    (note that the Image does not need to be added to an initrd)

Here are places where modules might be needed:

  • The eMMC “/lib/modules/5.10.120-cilium/kernel”.
  • The NVMe “/lib/modules/5.10.120-cilium/kernel”.
  • The initrd “/lib/modules/5.10.120-cilium/kernel”.
    (this last one is new; the subset is determined by what is needed to reach the next filesystem, and in your case, it probably won’t be a requirement)
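The Image and module locations above can be checked with something like the following sketch. The function `check_kernel_install` is a hypothetical helper, and the names "Image-cilium" and "5.10.120-cilium" are just the example values from this post. Any mount point for a second device (for example an eMMC rootfs mounted somewhere under /mnt) would be an assumption about your particular setup.

```shell
#!/bin/sh
# Hypothetical checker: report whether a kernel Image and its module
# directory exist under a given root ("/" or a mounted eMMC/NVMe rootfs).
# Usage: check_kernel_install ROOT IMAGE_NAME RELEASE
check_kernel_install() {
    root=${1%/}          # strip one trailing slash so "/" becomes ""
    image=$2
    release=$3
    status=0

    if [ -f "$root/boot/$image" ]; then
        echo "ok:      $root/boot/$image"
    else
        echo "missing: $root/boot/$image"
        status=1
    fi

    if [ -d "$root/lib/modules/$release/kernel" ]; then
        echo "ok:      $root/lib/modules/$release/kernel"
    else
        echo "missing: $root/lib/modules/$release/kernel"
        status=1
    fi
    return $status
}

# Example: check the currently booted root filesystem.
check_kernel_install / Image-cilium 5.10.120-cilium || true
```

Running it once per mounted device gives a quick view of which copies are in place before rebooting.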

Here are places where extlinux.conf might need to be modified (because I’m assuming adding a boot entry, and not replacing existing content; you could just name this “Image” and install modules at required locations and skip any extlinux.conf edits):

  • The eMMC “/boot/extlinux/extlinux.conf”.
  • The NVMe “/boot/extlinux/extlinux.conf”.
  • The initrd “/boot/extlinux/extlinux.conf”.
    (or, indirectly, the initrd might edit the next extlinux.conf in the boot chain)
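For reference, an added boot entry in extlinux.conf might look roughly like what this sketch prints. The helper `extlinux_entry` is hypothetical, and the APPEND text used in the example call is a placeholder: the real arguments should be copied from the APPEND line of your existing entry, and the INITRD line is only wanted if your existing entry has one. I have not tested this on a third party carrier board.

```shell
#!/bin/sh
# Hypothetical helper: print an additional extlinux.conf boot entry.
# Usage: extlinux_entry LABEL IMAGE_PATH APPEND_ARGS [INITRD_PATH]
extlinux_entry() {
    printf '%s\n' "LABEL $1"
    printf '%s\n' "      MENU LABEL $1 kernel"
    printf '%s\n' "      LINUX $2"
    # Only emit INITRD when a path was given (copy it from the existing entry).
    if [ -n "${4:-}" ]; then
        printf '%s\n' "      INITRD $4"
    fi
    printf '%s\n' "      APPEND $3"
}

# Placeholder APPEND text; replace it with the line from your existing entry.
extlinux_entry cilium /boot/Image-cilium '<copy APPEND arguments from the existing entry>'
```

The output would be appended below the original entry, leaving that entry untouched so it can still be selected if the new kernel fails. The DEFAULT line in extlinux.conf names the LABEL which boots when nothing is picked.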

As to which location actually requires content, the Image (or Image-cilium) must be on the first “/boot” read in the chain. It is advisable for that same Image to be on the final device, but once the Image is loaded, no other location needs the Image. The same is true for device tree, but normally only the first device reads the device tree (it is still advisable to add the tree to the final media).

My advice is to ask the manufacturer how to add a second boot entry (in addition to the original) for the NVMe workflow. Alternatively, you’ll need to ask (for your model) how to add a kernel Image and modules for the NVMe case on their carrier board.

I’ve successfully tested out the new image/modules!!

I just re-read your original answer, and everything is so much more clear now 😅. Thank you so much for the overabundance of information! I’ve learned a ton about linux during this process.

I started from scratch, kinda, after being pointed to Kernel Customization. It seems as though that’s geared towards external flashing, but what stood out to me was nvbuild.sh. I don’t think I need steps 5-7? I modified nvbuild.sh to run nconfig, append a suffix to LOCALVERSION, and to run modules_install.

I put together the steps I followed in this Gist.
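For anyone reading without the Gist, the sequence described above corresponds roughly to the standard kernel make targets below. This is a dry-run sketch, not the contents of the Gist or of nvbuild.sh: the function `print_build_steps` is hypothetical, it only prints the commands, the output directory and module root are example values, and cross-compiler setup (CROSS_COMPILE) is left out.

```shell
#!/bin/sh
# Dry-run sketch: print kernel make invocations instead of running them.
# Usage: print_build_steps OUT_DIR LOCALVERSION_SUFFIX MODULE_ROOT
print_build_steps() {
    out=$1
    suffix=$2
    modroot=$3
    # Configure, adjust options interactively, then build Image and modules.
    for target in tegra_defconfig nconfig Image modules; do
        echo "make ARCH=arm64 O=$out LOCALVERSION=$suffix $target"
    done
    # Install modules under MODULE_ROOT/lib/modules/<release>.
    echo "make ARCH=arm64 O=$out LOCALVERSION=$suffix INSTALL_MOD_PATH=$modroot modules_install"
}

# Example values: the suffix matches the "uname -r" shown in this thread.
print_build_steps "${TEGRA_KERNEL_OUT:-/tmp/kernel_out}" -tegra-cilium /
```

With the suffix "-tegra-cilium", modules_install places the modules under "/lib/modules/5.10.120-tegra-cilium", which is where the new kernel will look for them.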

I discovered kexec when asking for help back on the turing pi discord, and after a bit of struggling (forgot to update /lib/modules), I finally got it to boot!

$ uname -a
Linux jetson1 5.10.120-tegra-cilium #1 SMP PREEMPT Mon Aug 28 01:25:02 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux
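For readers who have not used kexec: it loads a new kernel from the running system and jumps to it without going through the normal boot chain. The exact invocation used here is not shown in this thread, so the following is only a dry-run sketch under assumptions. The function `print_kexec_steps` is hypothetical, it prints the commands rather than running them, and the Image and initrd paths are example values.

```shell
#!/bin/sh
# Dry-run sketch: print kexec commands rather than executing them.
# Usage: print_kexec_steps IMAGE_PATH [INITRD_PATH]
print_kexec_steps() {
    image=$1
    initrd=${2:-}
    # Stage the new kernel, reusing the running kernel's command line.
    if [ -n "$initrd" ]; then
        echo "kexec --load $image --initrd=$initrd --reuse-cmdline"
    else
        echo "kexec --load $image --reuse-cmdline"
    fi
    # Jump to the staged kernel; this skips a clean shutdown, so on a
    # systemd system "systemctl kexec" is the gentler alternative.
    echo "kexec --exec"
}

# Example paths only; use the Image (and initrd, if any) you actually built.
print_kexec_steps /boot/Image-cilium /boot/initrd
```

Since kexec does not change extlinux.conf, a plain reboot returns to the original kernel, which makes it a low-risk way to test a new Image.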

I’m not sure which are the steps 5 through 7? If you mean the dtbs target during kernel build, then that is not needed unless you are altering some non-plug-n-play device.

Modules not being updated, or being in the wrong place, is a frequent cause of kernel install issues (not just on Jetsons). The relationship between “uname -r”, module search path, and CONFIG_LOCALVERSION is such a small thing, but it is a critical part which doesn’t seem important at first glance.

Glad it works for you now though.
