I recently tried to install cilium on a k8s cluster with an Orin NX. It ultimately failed, due to the lack of a handful of flags (requirements). Would there be any interest in ensuring these are set upstream, or am I on my own to rebuild the kernel?
You probably should post this in the Orin NX forum:
https://forums.developer.nvidia.com/c/agx-autonomous-machines/jetson-embedded-systems/jetson-orin-nx/487
Normally one would build kernel modules if available, and if not, then the kernel and modules would need to be built. I don’t think Cilium is common enough that every Jetson should have this. Adding a driver is one of the more basic tasks for Linux development.
Building and installing is usually simpler than what the NVIDIA docs show (at least the installing part). This is because the docs tend to update the flash software, and then flash occurs; if directly adding a module it is a simple file copy. If adding a new base kernel, then it gets more complicated, but it does not normally require flashing. If you are interested in build or install information for this configuration you can ask and help is available.
You might be interested in saving a copy of your existing “/proc/config.gz”. This is not a real file; it is the kernel itself making its configuration available and pretending to be a file. If your kernel is stock, then a gunzipped copy of that file, renamed “.config”, is the same as what the kernel build target “tegra_defconfig” produces.
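A minimal sketch of saving that configuration, assuming the running kernel exposes /proc/config.gz as described above. The function name save_running_config and the default backup directory are hypothetical, chosen only for illustration:

```shell
# Hypothetical helper: write a gunzipped copy of the kernel config to
# <dest_dir>/.config. Both arguments are optional.
save_running_config() {
    src="${1:-/proc/config.gz}"
    dest_dir="${2:-$HOME/kernel-config-backup}"
    if [ ! -r "$src" ]; then
        echo "cannot read $src" >&2
        return 1
    fi
    mkdir -p "$dest_dir" || return 1
    # gunzip -c writes to stdout, so the source is never modified
    gunzip -c "$src" > "$dest_dir/.config"
}

# Typical use on the Jetson (no arguments needed):
#   save_running_config
```

Keeping the copy outside of any build directory means it survives cleaning or deleting the build output.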
You would also need to write down the existing “uname -r” output. The prefix is the kernel source version, and the suffix is part of the kernel config at the time of compile; this is the CONFIG_LOCALVERSION, which is not in the config.gz and is not set via the tegra_defconfig build target. If you are building only modules, then you would match the existing CONFIG_LOCALVERSION (the default is “-tegra”, so if your existing “uname -r” is “5.15.0-tegra”, then you’d use “-tegra” for CONFIG_LOCALVERSION). This in part determines where the kernel looks for its modules, and can affect whether the modules are accepted in the existing kernel.
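As a rough illustration of splitting “uname -r” into those two parts, here is a small sketch. The helper name localversion_from_release is hypothetical, and it assumes the suffix starts at the first “-” in the release string, which holds for something like “5.10.120-tegra” but may not hold for every kernel:

```shell
# Hypothetical helper: print everything after the numeric version prefix of a
# kernel release string. Assumes the suffix begins at the first "-".
localversion_from_release() {
    rel="$1"
    # ${rel%%-*} is the part before the first "-"; strip it from the front
    printf '%s\n' "${rel#"${rel%%-*}"}"
}

# Example with the running kernel:
localversion_from_release "$(uname -r)"
```

If the printed suffix is what you expect (for example “-tegra”), that is the value to match when building only modules.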
With the existing configuration matched, you would then make any edits via the make target menuconfig (or nconfig, which I use; it is very much the same, but it adds symbol search…all of those config.gz settings are symbols, as are the listed requirements).
Thanks for the details on upgrading! I’ve done some module additions here and there, but never rebuilt the kernel via menuconfig (nconfig).
The kernel flags that are missing to support cilium are…
CONFIG_NET_CLS_BPF=y
CONFIG_NET_SCH_INGRESS=y
CONFIG_CRYPTO_USER_API_HASH=y
CONFIG_CGROUP_BPF=y
CONFIG_NETFILTER_XT_TARGET_CT=m
With cilium becoming such a common CNI, I wonder if it’d be worth enabling? I’ll definitely try it myself, but just throwing it out there.
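To see which of these symbols a given kernel lacks, something like the following sketch could be used. The function name check_kernel_flags is hypothetical, and it treats either “=y” or “=m” as present, which is an assumption rather than a statement of what cilium itself requires:

```shell
# Hypothetical helper: report which of the given symbols are neither =y nor =m
# in a plain-text kernel config file. Returns non-zero if any are missing.
check_kernel_flags() {
    cfg="$1"
    shift
    missing=0
    for sym in "$@"; do
        if ! grep -Eq "^${sym}=(y|m)\$" "$cfg"; then
            echo "missing: $sym"
            missing=1
        fi
    done
    return "$missing"
}

# Typical use on the Jetson:
#   gunzip -c /proc/config.gz > /tmp/running.config
#   check_kernel_flags /tmp/running.config CONFIG_NET_CLS_BPF \
#       CONFIG_NET_SCH_INGRESS CONFIG_CRYPTO_USER_API_HASH \
#       CONFIG_CGROUP_BPF CONFIG_NETFILTER_XT_TARGET_CT
```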
It is useful to know that the config being built will be in the file named “.config”. If you compile directly in your kernel source, the directory containing all of the kernel source can be referred to as the “top”, and it has a Makefile there. The .config would also go there, but only if you are building inside of the source. When you build from the command line with “O=/some/where”, that names another location for all of the temporary output and configuration. The latter is highly recommended.
The build target tegra_defconfig simply creates the default .config used for a shipping Jetson (but it does not set CONFIG_LOCALVERSION, which is normally set to “-tegra”; depending on circumstances you would either match this, or modify this).
Cross compile adds a few things to it for naming the architecture and cross tools, but otherwise is not much different. Official docs for cross compile are fairly easy to follow if you have the correct host PC version of Ubuntu (for cross compile or flash of an Orin you would be best using Ubuntu 20).
If we were to build natively on the Jetson, and not use any cross tools, then it would go something like this (assuming you have sufficient disk space…a lot is required) to build both the kernel Image (has all of the “=y” integrated components) and the kernel modules (has individual files for each “=m” module):
# Create a temporary empty output location, and set an environment variable
# to make it easier to name:
mkdir ~/kernel
cd ~/kernel
export TEGRA_KERNEL_OUT=`pwd`
# Verify:
echo $TEGRA_KERNEL_OUT
# We would also create a temporary empty location for installing modules to:
mkdir ~/modules
cd ~/modules
export TEGRA_MODULES_OUT=`pwd`
# Verify:
echo $TEGRA_MODULES_OUT
# Now go to the top of the kernel source, wherever that might be (it might
# differ from this, it is just an example):
cd /some/where/kernel/kernel-5.10
# For convenience:
export TOP=`pwd`
echo $TOP
# This assumes the directory of output is empty of previous configuration:
make O=$TEGRA_KERNEL_OUT tegra_defconfig
# Now is the part where we edit the configuration with a configuration editor,
# which understands dependencies:
make O=$TEGRA_KERNEL_OUT nconfig
# Note: I use nconfig, many people use menuconfig; they are the same, except
# nconfig has a symbol search. CONFIG_NET_CLS_BPF is an example of a symbol.
# Search understands the leading "CONFIG_", so you can include it or leave it out:
# search for either "CONFIG_NET_CLS_BPF" or "NET_CLS_BPF".
# Actually, it is case insensitive on search, so you could also use lower case.
# For edit, you might need the package "libncurses5-dev" first; in that case:
sudo apt-get install libncurses5-dev
# If you use the "m" key to set a symbol as enabled, then it builds as a module.
# If you use the "y" key to set a symbol as enabled, then it alters the kernel
# Image file itself. You probably should stick to "=m" if it is available. If the feature
# cannot be a module, then the config editor will do nothing when using the
# "m" key; similar, if an integrated feature is not available, then the "y" key will
# do nothing. Try "m" first. If all features can be "m", it will simplify life. If not,
# then you might just use "y" on your additional features since you have to build
# modules and Image for that case.
# This is only needed to propagate configuration if we are not building Image,
# the main kernel, but other than taking time it also does not hurt even if we
# build Image; I only do this if not building Image:
make O=$TEGRA_KERNEL_OUT modules_prepare
# Now to build the main kernel Image of all of the integrated (non-module) features.
# Note that I am assuming your build platform has 12 CPU cores. Adjust this for the
# number of cores for a faster build (but using more RAM) on whatever platform
# you choose. This is the "-j 12" in what follows. For 6 cores, you would use "-j 6".
# You would skip building Image if you have only modules added. Performed from
# $TOP.
make O=$TEGRA_KERNEL_OUT -j 12 Image
# Now to build modules:
make O=$TEGRA_KERNEL_OUT -j 12 modules
# Now to place modules in an empty directory which mirrors how install will
# place them:
make O=$TEGRA_KERNEL_OUT INSTALL_MOD_PATH=$TEGRA_MODULES_OUT modules_install
# Image is already in the $TEGRA_KERNEL_OUT. Example:
cd $TEGRA_KERNEL_OUT
find . -name Image
There are a lot of variations on the above. For example, some options in kernel build are understood simply by setting the right environment variable, or some options can be added in the command line in different ways. The above did not assume cross compile.
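One such variation, sketched here under stated assumptions: instead of toggling symbols interactively, the kernel tree's scripts/config helper can edit a .config from the command line (its --enable option sets “=y” and --module sets “=m”). The function config_args below is a hypothetical convenience that only builds the argument list; it does not touch any file by itself:

```shell
# Hypothetical helper: turn "SYMBOL=y" / "SYMBOL=m" pairs into arguments for
# the kernel tree's scripts/config helper.
config_args() {
    args=""
    for pair in "$@"; do
        sym="${pair%%=*}"
        val="${pair#*=}"
        sym="${sym#CONFIG_}"    # drop the prefix; scripts/config does not need it
        case "$val" in
            y) args="$args --enable $sym" ;;
            m) args="$args --module $sym" ;;
            *) echo "unsupported value in: $pair" >&2; return 1 ;;
        esac
    done
    printf '%s\n' "${args# }"
}

# Possible use from $TOP after the tegra_defconfig step (untested sketch):
#   ./scripts/config --file "$TEGRA_KERNEL_OUT/.config" \
#       $(config_args CONFIG_NET_CLS_BPF=y CONFIG_NETFILTER_XT_TARGET_CT=m)
#   make O=$TEGRA_KERNEL_OUT olddefconfig
```

Unlike nconfig, this route does not stop you from requesting a value the dependencies do not allow, so it is worth checking the resulting .config after the olddefconfig step.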
The official docs on kernel build are for cross compile, and are fairly good if you remember the tegra_defconfig target as a default, and that you need to set CONFIG_LOCALVERSION. If you are not changing any “=y” symbols, then you don’t need to build or install the kernel Image, and life is simplified. In that case you use the original CONFIG_LOCALVERSION (which implies the kernel Image will search for modules in the same place, and find both the old modules plus any new ones you add). Original is:
CONFIG_LOCALVERSION="-tegra"
(you can leave out the quotes; in the .config file the editor adds quotes)
If you change the Image by setting or removing anything “=y”, then you should build everything, and change CONFIG_LOCALVERSION. Example:
CONFIG_LOCALVERSION=-cilium
Various official docs tell you how to replace the original kernel. Sometimes it is better to add a renamed alternate kernel, and a second boot entry to test. Or you could install the new Image under the default name, Image, but instead of overwriting the original, first move the original to a new name. Two examples:
cd /boot
mv Image Image-original
cp /some/where/you/have/it/Image .
# Alternate, keep the original kernel, and then later add a new boot entry for the alternate:
cd /boot
cp /some/where/you/have/it/Image Image-cilium
Orin changes some of the boot setup, so the instructions to add alternate boot entries differ. However, if you have your original Image available for selection, then it is much safer than assuming the new kernel will work and just overwriting it. It is good to always leave the original Image in some form or another (whether renamed or not).
Similar for the modules: Don’t remove the old module directory until you know you don’t need to boot the original Image. If all you’ve done is to create a new module with “=m”, then there is no need to even touch the original Image, and it is a simple file copy with very little risk or complication.
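The “keep the original, add an alternate” approach for the Image could be wrapped in a small guard like the sketch below. The function name install_alternate_image is hypothetical; it only copies the file and does not create the boot entry:

```shell
# Hypothetical helper: copy a new kernel Image into a boot directory under an
# alternate name (Image<suffix>), refusing to overwrite anything already there.
install_alternate_image() {
    new_image="$1"          # e.g. the path reported by "find . -name Image"
    suffix="$2"             # e.g. -cilium (matching CONFIG_LOCALVERSION)
    boot_dir="${3:-/boot}"
    target="$boot_dir/Image$suffix"
    if [ ! -f "$new_image" ]; then
        echo "no such Image: $new_image" >&2
        return 1
    fi
    # An empty suffix would target the original Image, so reject it too
    if [ -z "$suffix" ] || [ -e "$target" ]; then
        echo "refusing to overwrite $target" >&2
        return 1
    fi
    cp "$new_image" "$target"
}
```

Writing to /boot normally needs root, so on the Jetson the copy would be run from a root shell or with sudo.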
Once you have the build down you will want to start a new thread in the Orin NX forum:
https://forums.developer.nvidia.com/c/agx-autonomous-machines/jetson-embedded-systems/jetson-orin-nx/487
Be sure to mention:
- Whether you changed the Image or not; the equivalent is to state if you’ve changed any “=y” features (meaning you changed the integrated Image), or if you’ve only changed an “=m” feature (meaning you added or removed modules).
- Which L4T release your Orin NX uses, e.g., the output of “head -n 1 /etc/nv_tegra_release”.
- (If you changed the Image) that you want to keep the original kernel as a backup, and want to create an alternate boot entry for testing the new kernel.
- That you don’t want to flash the entire system to add your new kernel or modules.
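A small sketch for gathering the release details to paste into such a post. The function name describe_system is hypothetical; the release file path is the one named above, passed as a parameter so the function still runs on a machine that is not a Jetson:

```shell
# Hypothetical helper: print the kernel release and the first line of the
# L4T release file, if that file is readable.
describe_system() {
    release_file="${1:-/etc/nv_tegra_release}"
    echo "kernel release: $(uname -r)"
    if [ -r "$release_file" ]; then
        echo "L4T release: $(head -n 1 "$release_file")"
    else
        echo "L4T release: unknown ($release_file not readable)"
    fi
}

describe_system
```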
For cross compile, here is a brief addition to the original script above, adding the architecture and cross tools (note that cross tool names will start with “aarch64-linux-gnu-”, and most of the time they are in “/usr/bin”; this is my assumption):
# This is new for cross compile; don't use it in native compile:
export ARCH=arm64
# Assuming cross tools are as in my earlier note:
export CROSS_COMPILE=/usr/bin/aarch64-linux-gnu-
# Create a temporary empty output location, and set an environment variable
# to make it easier to name (use a new and empty directory):
mkdir ~/kernel
cd ~/kernel
export TEGRA_KERNEL_OUT=`pwd`
# Verify:
echo $TEGRA_KERNEL_OUT
# We would also create a temporary empty location for installing modules to:
mkdir ~/modules
cd ~/modules
export TEGRA_MODULES_OUT=`pwd`
# Verify:
echo $TEGRA_MODULES_OUT
# Now go to the top of the kernel source, wherever that might be (it might
# differ from this, it is just an example):
cd /some/where/kernel/kernel-5.10
# For convenience:
export TOP=`pwd`
echo $TOP
# Below this is where we now use $CROSS_COMPILE and $ARCH...but
# remember how I said that there are different ways sometimes to add
# options to the compile command? CROSS_COMPILE and ARCH, when
# exported as an environment variable, are examined by the config code,
# so those will be used in what follows based on those earlier settings
# of ARCH and CROSS_COMPILE.
# This assumes the directory of output is empty of previous configuration:
make O=$TEGRA_KERNEL_OUT tegra_defconfig
# Now is the part where we edit the configuration with a configuration editor,
# which understands dependencies:
make O=$TEGRA_KERNEL_OUT nconfig
# Note: I use nconfig, many people use menuconfig; they are the same, except
# nconfig has a symbol search. CONFIG_NET_CLS_BPF is an example of a symbol.
# Search understands the leading "CONFIG_", so you can include it or leave it out:
# search for either "CONFIG_NET_CLS_BPF" or "NET_CLS_BPF".
# Actually, it is case insensitive on search, so you could also use lower case.
# For edit, you might need the package "libncurses5-dev" first; in that case:
sudo apt-get install libncurses5-dev
# If you use the "m" key to set a symbol as enabled, then it builds as a module.
# If you use the "y" key to set a symbol as enabled, then it alters the kernel
# Image file itself. You probably should stick to "=m" if it is available. If the feature
# cannot be a module, then the config editor will do nothing when using the
# "m" key; similar, if an integrated feature is not available, then the "y" key will
# do nothing. Try "m" first. If all features can be "m", it will simplify life. If not,
# then you might just use "y" on your additional features since you have to build
# modules and Image for that case.
# This is only needed to propagate configuration if we are not building Image,
# the main kernel, but other than taking time it also does not hurt even if we
# build Image; I only do this if not building Image:
make O=$TEGRA_KERNEL_OUT modules_prepare
# Now to build the main kernel Image of all of the integrated (non-module) features.
# Note that I am assuming your build platform has 12 CPU cores. Adjust this for the
# number of cores for a faster build (but using more RAM) on whatever platform
# you choose. This is the "-j 12" in what follows. For 6 cores, you would use "-j 6".
# You would skip building Image if you have only modules added. Performed from
# $TOP.
make O=$TEGRA_KERNEL_OUT -j 12 Image
# Now to build modules:
make O=$TEGRA_KERNEL_OUT -j 12 modules
# Now to place modules in an empty directory which mirrors how install will
# place them:
make O=$TEGRA_KERNEL_OUT INSTALL_MOD_PATH=$TEGRA_MODULES_OUT modules_install
# Image is already in the $TEGRA_KERNEL_OUT. Example:
cd $TEGRA_KERNEL_OUT
find . -name Image
Just to reiterate, official docs tell you how to replace the kernel and modules in the flash software, and how to flash to get the new content. There are simple ways to add content without flashing. The cross compile official docs are good and easy to follow, but you probably want to ask under the Orin NX forum about install steps after mentioning what you changed in the kernel.
All of this seems like a lot of detail, but the gist is that if you’ve only added modules, then you copy the modules to the correct place and you’re done (you’d take steps like reboot or use of depmod/modprobe to manually load modules). If you’ve done this once, and have your kernel source in place, it becomes trivial to add a driver.
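The module-only case might look like the following sketch. The function name install_module_file is hypothetical, and the cls_bpf.ko path in the comments is only an illustrative guess at where one of the new modules would land:

```shell
# Hypothetical helper: copy one module file from the staging tree created by
# modules_install into the same relative location under a destination tree.
install_module_file() {
    src_root="$1"    # e.g. $TEGRA_MODULES_OUT/lib/modules/$(uname -r)
    rel_path="$2"    # e.g. kernel/net/sched/cls_bpf.ko (illustrative)
    dest_root="$3"   # e.g. /lib/modules/$(uname -r)
    if [ ! -f "$src_root/$rel_path" ]; then
        echo "not found: $src_root/$rel_path" >&2
        return 1
    fi
    mkdir -p "$dest_root/$(dirname "$rel_path")" || return 1
    cp "$src_root/$rel_path" "$dest_root/$rel_path"
}

# After copying (as root), refresh the module dependency data and load it:
#   sudo depmod -a
#   sudo modprobe cls_bpf
```

Mirroring the relative path matters because the kernel looks for modules under the directory named after “uname -r”.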
Amazing writeup! Thank you again, @linuxdev =)
Finally getting around to trying this. Got hit with some errors.
dudo@jetson1:/usr/src/linux-headers-5.10.120-tegra-ubuntu20.04_aarch64/kernel-5.10$ make O=$TEGRA_KERNEL_OUT tegra_defconfig
make[1]: Entering directory '/home/dudo/kernel'
***
*** The source tree is not clean, please run 'make mrproper'
*** in /usr/src/linux-headers-5.10.120-tegra-ubuntu20.04_aarch64/kernel-5.10
***
make[1]: *** [/usr/src/linux-headers-5.10.120-tegra-ubuntu20.04_aarch64/kernel-5.10/Makefile:577: outputmakefile] Error 1
make[1]: Leaving directory '/home/dudo/kernel'
make: *** [Makefile:213: __sub-make] Error 2
So I did what it asked…
dudo@jetson1:/usr/src/linux-headers-5.10.120-tegra-ubuntu20.04_aarch64/kernel-5.10$ sudo make mrproper
CLEAN arch/arm64/kernel/vdso
CLEAN scripts/basic
CLEAN scripts/dtc
CLEAN scripts/genksyms
CLEAN scripts/kconfig
CLEAN scripts/mod
CLEAN scripts/selinux/genheaders
CLEAN scripts/selinux/mdp
CLEAN scripts
CLEAN include/config include/generated arch/arm64/include/generated .config .config.old Module.symvers
Which… seems to have deleted my .config file? Feels like that was a mistake…
Upon running make again, I seem to be missing some drivers.
dudo@jetson1:/usr/src/linux-headers-5.10.120-tegra-ubuntu20.04_aarch64/kernel-5.10$ make O=$TEGRA_KERNEL_OUT tegra_defconfig
make[1]: Entering directory '/home/dudo/kernel'
GEN Makefile
HOSTCC scripts/basic/fixdep
HOSTCC scripts/kconfig/conf.o
HOSTCC scripts/kconfig/confdata.o
HOSTCC scripts/kconfig/expr.o
LEX scripts/kconfig/lexer.lex.c
YACC scripts/kconfig/parser.tab.[ch]
HOSTCC scripts/kconfig/lexer.lex.o
HOSTCC scripts/kconfig/parser.tab.o
HOSTCC scripts/kconfig/preprocess.o
HOSTCC scripts/kconfig/symbol.o
HOSTCC scripts/kconfig/util.o
HOSTLD scripts/kconfig/conf
drivers/video/Kconfig:27: can't open file "drivers/video/tegra/Kconfig"
make[2]: *** [/usr/src/linux-headers-5.10.120-tegra-ubuntu20.04_aarch64/kernel-5.10/scripts/kconfig/Makefile:89: tegra_defconfig] Error 1
make[1]: *** [/usr/src/linux-headers-5.10.120-tegra-ubuntu20.04_aarch64/kernel-5.10/Makefile:633: tegra_defconfig] Error 2
make[1]: Leaving directory '/home/dudo/kernel'
make: *** [Makefile:213: __sub-make] Error 2
I found some references to source_sync.sh on the forums, which I downloaded, but I’m not sure where to run it, or if I need to pass a specific tag. Am I on the right track, or should I just cherry pick drivers from Driver Package (BSP) Sources? I just dug through those files, and kernel_src/kernel/kernel-5.10/drivers/video doesn’t have a tegra dir? There’s a Makefile with obj-y += tegra/, but this is the first time I’ve seen a fragment like this, and I’m not sure how to use it.
Any help is appreciated. Thank you!
Do you have the full source at that location, and not just headers? If so, then you can run “sudo make mrproper” as a starting point, and then, when you build, add “O=$TEGRA_KERNEL_OUT” to the compile command (pointing somewhere not owned by root). The goal is to keep the source tree itself pristine in most cases. There are some exceptions though when you are building source code from outside the kernel tree (outside source code for modules needs to build against a kernel which is configured to match the running system; when treated as just headers things can be different than when treated as full source).
I’m not sure how to know if it’s full source vs headers. I originally flashed the Jetson via these commands from a ubuntu 20.04 box.
cd ~
tar xpf Downloads/Jetson_Linux_R35.4.1_aarch64.tbz2
sudo tar xpf Downloads/Tegra_Linux_Sample-Root-Filesystem_R35.4.1_aarch64.tbz2 -C Linux_for_Tegra/rootfs/
sed -i 's/cvb_eeprom_read_size = <0x100>/cvb_eeprom_read_size = <0x0>/g' Linux_for_Tegra/bootloader/t186ref/BCT/tegra234-mb2-bct-misc-p3767-0000.dts
cd Linux_for_Tegra/
sudo ./apply_binaries.sh
sudo ./tools/l4t_flash_prerequisites.sh
sudo ./tools/l4t_create_default_user.sh -u dudo -p topsecretpasswd -a -n jetson1
sudo ./tools/kernel_flash/l4t_initrd_flash.sh --external-device nvme0n1p1 -c tools/kernel_flash/flash_l4t_external.xml -p "-c bootloader/t186ref/cfg/flash_t234_qspi.xml" --showlogs --network usb0 jetson-orin-nano-devkit internal
Well, technically, I installed 35.3.1, and updated to 35.4.1 in place via this.
Anyway, I ran what you said beforehand, from $TOP
sudo make mrproper
sudo make O=$TEGRA_KERNEL_OUT tegra_defconfig
Which errors out with
drivers/video/Kconfig:27: can't open file "drivers/video/tegra/Kconfig"
make[2]: *** [/usr/src/linux-headers-5.10.120-tegra-ubuntu20.04_aarch64/kernel-5.10/scripts/kconfig/Makefile:89: tegra_defconfig] Error 1
make[1]: *** [/usr/src/linux-headers-5.10.120-tegra-ubuntu20.04_aarch64/kernel-5.10/Makefile:633: tegra_defconfig] Error 2
make[1]: Leaving directory '/home/dudo/kernel'
make: *** [Makefile:213: __sub-make] Error 2
The reason I mentioned full headers is because this is normally (but not always) the case when I see:
The source tree is not clean, please run 'make mrproper'
Headers can also be configured, but most of the time configuration only occurs in full source. The mrproper comment is still valid.
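One rough way to tell a headers-only tree from full source is to look for core C files that a headers package would normally not ship. This is only a heuristic sketch: the function name looks_like_full_source is hypothetical, and the choice of init/main.c and kernel/fork.c as markers is an assumption about how headers packages are laid out:

```shell
# Hypothetical heuristic: a full kernel source tree has a top-level Makefile
# plus core .c files; a headers-only tree usually has the Makefile but not
# the core sources.
looks_like_full_source() {
    tree="$1"
    [ -f "$tree/Makefile" ] && [ -f "$tree/init/main.c" ] && [ -f "$tree/kernel/fork.c" ]
}

# Example (path is whatever you are using as $TOP):
#   if looks_like_full_source "$TOP"; then echo "full source"; else echo "headers only?"; fi
```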
The manual unpacking and install should work. This is provided that (A) the original update from R35.3.1 to R35.4.1 worked correctly, and (B) any source code for the kernel and installer commands are also from R35.4.1 (and I see the installer is itself from R35.4.1). The only part I don’t know yet, but which might have an effect, is where you downloaded the kernel source from? It should be R35.4.1 kernel source downloaded from NVIDIA.
This latter point is mentioned because I don’t see where you downloaded or unpacked the source, but even so, I do see a configuration issue. Plus, the error you found here:
drivers/video/Kconfig:27: can't open file "drivers/video/tegra/Kconfig"
…basically suggests that the source is not from NVIDIA. Where did you get the content in “$TOP”? Is this cross compile or native compile (you are installing from a desktop host PC, but I don’t know where the compile is from; perhaps it is from the same host PC)?
Note: Here is the location to find source against your R35.4.1:
https://developer.nvidia.com/linux-tegra
Source is from the drivers section of Jetson Linux 35.3.1 | NVIDIA Developer
As for compilation, it was done on an Ubuntu 20.04 box (not a VM) with the Jetson connected via USB.
I don’t know what difference there is between the 35.3.1 and 35.4.1 kernel source, but there are typically a lot of kernel fixes between releases for Orin in recent times, so I suggest using R35.4.1 source.
Is the cilium from that source? It shouldn’t be using “/usr/src/linux-headers-5.10.120-tegra-ubuntu20.04_aarch64” if it is directly from the kernel source in a full source install. I could see a mistake occurring in cross-compile if cross tools were not used. Are you exporting “ARCH”? If so, is it “ARCH=arm64”? What is your “CROSS_COMPILE” value? What do you see from:
${CROSS_COMPILE}gcc --version
ARCH and CROSS_COMPILE are unset. There is no cilium yet - not installing it until the flags are enabled.
I was running these commands on the Orin itself, so not following the cross-compile steps you gave. I’m kind of confused there. I suppose I installed it as cross-compile originally, since it was an x86_64 ubuntu box loading code onto an arm64 ubuntu Orin box? Does that matter for how I should run things directly on the Orin now?
~$ ${CROSS_COMPILE}gcc --version
gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
If you do not cross compile, then you are correct to not set those. Sometimes people use ARCH when natively compiling, and this would be incorrect and cause a failure. However, if you have the full source installed natively on the Jetson, then it should never try to use the separate header location (during install it might write a symbolic link pointing there, but it would not read, nor modify, “/usr/src”).
If you copied source from a host PC to the Jetson, and there was some configuration left over, then this would cause such an issue. Can you try again after:
- Unpacking R35.4.1 source to the Jetson (you can unpack the outer package containing the source package on a host PC, but leave the actual source as a single archive before copying it to the Jetson; or just do all of that on the Jetson, but it takes a lot of disk space).
- During unpack do so as root (use sudo). As root (sudo), at the “$TOP” of the kernel source, run “sudo make mrproper”.
- Do not set ARCH, and do not set CROSS_COMPILE.
- Try the build as a non-root (regular) user with “O=/some/where” in all steps (and that location is writable by your regular user; this is “$TEGRA_KERNEL_OUT”, which would be named in build lines as “O=$TEGRA_KERNEL_OUT”).
- The first actual build command would be (if $TEGRA_KERNEL_OUT is set; no sudo):
make O=$TEGRA_KERNEL_OUT tegra_defconfig
If the above fails, then create a log:
make O=$TEGRA_KERNEL_OUT tegra_defconfig 2>&1 | tee log_defconfig.txt
Post that log if it fails.
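The “do not set ARCH or CROSS_COMPILE” condition from the list above can be checked before starting a native build with something like this sketch. The function name native_env_ok is hypothetical:

```shell
# Hypothetical pre-flight check for a native build: fail if ARCH or
# CROSS_COMPILE is set to a non-empty value in the current environment.
native_env_ok() {
    status=0
    for var in ARCH CROSS_COMPILE; do
        # eval-based lookup keeps this usable in plain POSIX sh
        eval "val=\${$var-}"
        if [ -n "$val" ]; then
            echo "$var is set to '$val'; unset it for a native build" >&2
            status=1
        fi
    done
    return "$status"
}

# Example:
#   native_env_ok && make O=$TEGRA_KERNEL_OUT tegra_defconfig
```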
It worked!! I struggled a bit on how to get the source code, then a bit more on where to put it. I tried replacing files ad hoc for a few attempts until I made the connection that I could (and should) copy in the entire source. mrproper (I can’t help but think of a cheap rip off of drpepper every time I say that, lol) stopped working once I updated the source. Something about phoenix source or something? My terminal history doesn’t go back that far after making image/modules.
This is what ended up working for me.
mkdir ~/kernel
cd ~/kernel
export TEGRA_KERNEL_OUT=`pwd`
mkdir ~/modules
cd ~/modules
export TEGRA_MODULES_OUT=`pwd`
export TOP=/usr/src/linux-headers-5.10.120-tegra-ubuntu20.04_aarch64/kernel-5.10
cd ~
vim source_sync.sh
chmod +x source_sync.sh
./source_sync.sh -h
./source_sync.sh -k -t jetson_35.4.1
sudo cp -r ~/sources/kernel/* /usr/src/linux-headers-5.10.120-tegra-ubuntu20.04_aarch64
cd $TOP
make O=$TEGRA_KERNEL_OUT tegra_defconfig
make O=$TEGRA_KERNEL_OUT nconfig
make O=$TEGRA_KERNEL_OUT -j 8 Image
make O=$TEGRA_KERNEL_OUT -j 8 modules
make O=$TEGRA_KERNEL_OUT INSTALL_MOD_PATH=$TEGRA_MODULES_OUT modules_install
nconfig is not for the faint of heart… I see what you mean, though, the symbol search is necessary. It took me an embarrassing amount of time to figure out that the space bar cycles the values, instead of pressing Y or M.
I posted over in the other forum, per your suggestion, and was told to rtfm. Please send me your tip jar info, @linuxdev. You’re actually giving support, and I appreciate you.
I did some googling, and this looks compelling. This seems to be less Jetson specific, and more just u-boot, eh?
Could you help me get over the finish line, here?
source_sync.sh (11.0 KB)
The nice thing is that once you’ve done this the first time, you have the source, and the next time everything is less confusing. I would not mind seeing nconfig get an improved symbol search that can navigate to that symbol.
I’m not sure which part this is for:
Do you mean with kernel install? If so, then this changes depending on a lot of things. Is this an Orin NX dev kit? Or a third party carrier board from RidgeRun? If from NVIDIA, an eMMC model differs from an SD card model; if the carrier board differs, then some other details might differ. Incidentally, there are also differences between L4T R32.x and L4T R35.x+ (see “head -n 1 /etc/nv_tegra_release” to find the L4T release) due to migrating to UEFI boot (which might change how you edit “/boot/extlinux/extlinux.conf”).
Some details which might or might not matter: eMMC models have content equivalent to the BIOS, plus the boot content, in partitions. SD card models have that same content in QSPI memory on the module itself since there is no eMMC (there might be some QSPI on some eMMC models as well, and I’m not clear on what that is used for when present on eMMC models). When you flash as a whole, you are also flashing bootloader and BIOS (the equivalent). Changing something related to boot requires knowing details about the model and the L4T release.
The gist is that you can usually leave the original kernel Image in place, and add a second boot entry if you are adding a new Image. That makes testing safer. On the other hand, if you are only adding a module, then it is trivial since it is just a mostly risk-free file copy. If you only modified by adding a module, then you are almost done. If you modified the kernel Image (which is what loads modules), then you’ll need to install that and all new modules. For the former case of no Image change, you want your CONFIG_LOCALVERSION to remain unchanged, but in the latter case of a change to the Image, you want a new CONFIG_LOCALVERSION if you are going to add an alternate boot entry.
CONFIG_LOCALVERSION is appended to the source code version when you run the command “uname -r”. The “uname -r” output is part of the search path the Image uses to find modules. Modules are in a subdirectory of:
/lib/modules/$(uname -r)/kernel
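To make the relationship concrete, here is a tiny sketch that composes that directory from the two parts. The function name module_dir_for is hypothetical, and the version numbers are just the ones seen in this thread:

```shell
# Hypothetical helper: build the module search directory from the kernel
# source version and the CONFIG_LOCALVERSION suffix.
module_dir_for() {
    source_version="$1"   # e.g. 5.10.120
    localversion="$2"     # e.g. -tegra or -cilium
    printf '/lib/modules/%s%s/kernel\n' "$source_version" "$localversion"
}

module_dir_for 5.10.120 -tegra
module_dir_for 5.10.120 -cilium
```

The two calls print different directories, which is why a changed CONFIG_LOCALVERSION keeps the new modules from colliding with the original ones.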
The default CONFIG_LOCALVERSION is “-tegra”. If your build is just to add a module, then this is what you use:
CONFIG_LOCALVERSION="-tegra"
I will point out that in the “.config” file this is one of the few parameters which have no dependency. You could symbol search and set “-tegra” with nconfig, or you could directly edit the .config. I will warn you though that if your .config is not exactly what you want when you edit this, you should completely delete your temporary output location and start from scratch.
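A direct edit of that one line could be scripted roughly as below. This is a sketch under assumptions: the function name set_localversion is hypothetical, and it rewrites only the CONFIG_LOCALVERSION line, leaving everything else (including CONFIG_LOCALVERSION_AUTO) alone:

```shell
# Hypothetical helper: replace (or add) the CONFIG_LOCALVERSION line in a
# kernel .config file with the given value, quoted as the config editor does.
set_localversion() {
    cfg="$1"
    value="$2"
    if [ ! -f "$cfg" ]; then
        echo "no such file: $cfg" >&2
        return 1
    fi
    tmp="$cfg.tmp.$$"
    # Keep every line except an existing assignment; grep exits non-zero
    # when nothing is left to print, which is not an error here.
    { grep -v -e '^CONFIG_LOCALVERSION=' "$cfg" || true; } > "$tmp"
    printf 'CONFIG_LOCALVERSION="%s"\n' "$value" >> "$tmp"
    mv "$tmp" "$cfg"
}

# Example:
#   set_localversion "$TEGRA_KERNEL_OUT/.config" -tegra
```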
If you are going to modify the Image itself (change of a symbol with the “=y” instead of “=m”), things get more complicated. You’d want to build with a CONFIG_LOCALVERSION using something custom. An example might be any of:
CONFIG_LOCALVERSION=-cilium
CONFIG_LOCALVERSION=-tegra-modified
CONFIG_LOCALVERSION=-custom
(anything different)
Then you might rename the kernel Image as something related, e.g., one of these examples:
Image-cilium
Image-tegra-modified
Image-custom
That way you keep the original entry. But I don’t know:
- What your hardware is (SD card dev kit versus eMMC module).
- The L4T release (“head -n 1 /etc/nv_tegra_release”).
- If this is a third party carrier board.
- And especially if you modified the Image or just added a module.
Incidentally, on eMMC models there are multiple possible sources for the kernel Image. One is the file pointed at in extlinux.conf, and another is in a kernel partition. One is added via flash, the other is just a file copy (I shouldn’t say “just” because it is a lot of file copies for Image). The partition version is signed during a flash, and this works even when security fuses have been burned. A separate Image file takes precedence unless security fuses are burned (at which point much of the content only works from signed partitions).
Incidentally, U-Boot has not been used for some time. Its last use was in the earlier releases of L4T R32.x, but this then was removed, and CBoot had U-Boot features merged directly into CBoot (or at least a subset of features). In R34.x+ all boot content was migrated to a UEFI boot (which is a big help for the future I think).
Let me know some details of what you need to accomplish and I’ll try to help.
Sadly, I have no tip jar. I do have an emoji jar! 😁
DM me your venmo or something. I owe you a drink! 🍻
I think I only have the kernel install left to do. Replacing /boot/Image seems too easy, though… I have a $TEGRA_KERNEL_OUT directory with a new Image ready to go.
To answer your questions:
- This is an Orin NX 16gb with everything installed on an NVMe SSD (via this guide)
R35 (release), REVISION: 4.1, GCID: 33958178, BOARD: t186ref, EABI: aarch64, DATE: Tue Aug 1 19:57:35 UTC 2023
- Third party Turing Pi 2 carrier board
- I changed 4 features to Y, and 1 to M, so the Image was/is modified
This would be an eMMC model, but a custom device tree. Either way, I must warn you that I have not tested an external drive install, so I am guessing about parts of this. There are a lot of unknowns I’m guessing on, so I could end up confusing you more than helping. I can describe some concepts, though I fear I’ll just confuse you.
I’ll simplify and state part of this ahead of time: You’ve changed the “=y” features, so you need to install both kernel Image and all modules. You’ll want a new CONFIG_LOCALVERSION, resulting in a new “uname -r” and module search path (“/lib/modules/$(uname -r)/kernel”). The difficult choice will be between overwriting the existing kernel (modules won’t overwrite because the search path will have changed) or adding a new boot entry (the latter is preferred).
The third party carrier board makes my help even less relevant since the tools I work with are all the NVIDIA tools, and third parties tend to modify flash tools. I’ll try to describe choices you might be working with, but I don’t think I can do a very good job for this particular hardware. I’m going to guess that you’ll need to start a new thread on the install part, and maybe ask the third party manufacturer (especially with regard to adding new kernel boot entries without removing the original), but hopefully this gives a starting point. Everything which follows is more or less random background, and not direct instructions (which I have no way to provide).
eMMC model Jetsons tend to have a requirement that flash provides some initial content which brings up clocks and power rails (the part which is equivalent to a BIOS). Eventually it points to the “/boot” content of the eMMC, at least most of the time. Then the extlinux.conf of the rootfs partition tends to name its own partition for rootfs in a direct load. Alternatively, that extlinux.conf can name another rootfs. If your kernel Image (preferably renamed after the new CONFIG_LOCALVERSION) is on the eMMC, then not much needs to be modified. You’d also typically install all of the new modules (because of the new CONFIG_LOCALVERSION, which in turn is because you don’t want to reuse modules after a modification of the “=y” options) to “/lib/modules/$(uname -r)/kernel”.
One complication to the above is that if you use an initial ramdisk (initrd), then typically a subset of the modules are also added to the initrd since the initrd is usually an adapter for loading the next filesystem. If the kernel has all content needed to load that filesystem, then really no modules are needed in the initrd (and in fact the initrd probably isn’t needed). Examples of cases which might need a subset of modules in the initrd: A logical volume manager; encryption; a filesystem type not directly supported in the Image. It is possible that you are using an initrd to load the NVMe. Not due to drivers, but purely as a way to abstract the pivot_root (you will tend to see references to “Linux_for_Tegra/tools/kernel_flash/l4t_initrd_flash.sh” when working with external media). When you don’t have an actual BIOS an initrd can take over part of the optional behavior.
That chain might mean that the Image, using a modified name, is needed in more than one place. (I am assuming adding a new boot entry and not replacing the old one; this is less complicated if you don’t care about saving the old Image, but if something goes wrong and you have the old image, then it serves to rescue without reflashing; modules don’t generally have that risk.) Let’s say you’ve used “CONFIG_LOCALVERSION=-cilium”, and now your “uname -r” will be “5.10.120-cilium”; then you might name the kernel “Image-cilium”. Your modules will need to be located at “/lib/modules/5.10.120-cilium/kernel”. But on which device? A checklist follows.
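The naming relationship in that example can be sketched in shell. This is only an illustration, assuming the “5.10.120” base version and “-cilium” suffix used above; the function names kernel_release, module_dir, and image_name are hypothetical helpers and not part of any NVIDIA tool.

```shell
#!/bin/sh
# Hypothetical helpers: derive the names that follow from CONFIG_LOCALVERSION.

# kernel_release BASE_VERSION LOCALVERSION -> what "uname -r" would report
kernel_release() {
    printf '%s%s\n' "$1" "$2"
}

# module_dir ROOT RELEASE -> where that kernel searches for its modules
# ROOT is the mount point of the filesystem being inspected ("/" when booted).
module_dir() {
    printf '%s/lib/modules/%s/kernel\n' "${1%/}" "$2"
}

# image_name LOCALVERSION -> a suggested file name for the renamed Image
image_name() {
    printf 'Image%s\n' "$1"
}
```

For example, module_dir / "$(kernel_release 5.10.120 -cilium)" gives the module path named above, and image_name -cilium gives “Image-cilium”.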
It does not hurt to install to too many devices. This can in fact work out well in the future if some device fails and you end up with the ability to boot to a different device, or just want a reference copy. Here are some places where the Image-cilium might be needed:
- The eMMC “/boot”
- The NVMe “/boot”
(note that the Image does not need to be added to an initrd)
Here are places where modules might be needed:
- The eMMC “/lib/modules/5.10.120-cilium/kernel”.
- The NVMe “/lib/modules/5.10.120-cilium/kernel”.
- The initrd “/lib/modules/5.10.120-cilium/kernel”.
(this last one is new; the subset is determined by what is needed to reach the next filesystem, and in your case, it probably won’t be a requirement)
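A quick way to walk the module checklist is to test each candidate root filesystem for the module directory. The following is a rough sketch, not a tested procedure for this carrier board; check_modules is a hypothetical helper, and the mount points you pass in (wherever the eMMC and NVMe root filesystems happen to be mounted) are assumptions you would need to fill in yourself.

```shell
#!/bin/sh
# Hypothetical helper: report whether each root filesystem has the modules
# for a given kernel release. Mount points are supplied by the caller.

# check_modules RELEASE ROOT... ; returns non-zero if any root lacks them
check_modules() {
    release=$1
    shift
    missing=0
    for root in "$@"; do
        dir="${root%/}/lib/modules/$release/kernel"
        if [ -d "$dir" ]; then
            echo "ok:      $dir"
        else
            echo "missing: $dir"
            missing=1
        fi
    done
    return "$missing"
}
```

On a running system this might be called as check_modules 5.10.120-cilium / /mnt/emmc, where /mnt/emmc is just an example mount point for the other device.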
Here are places where extlinux.conf might need to be modified (because I’m assuming adding a boot entry, and not replacing existing content; you could just name this “Image” and install modules at required locations and skip any extlinux.conf edits):
- The eMMC “/boot/extlinux/extlinux.conf”.
- The NVMe “/boot/extlinux/extlinux.conf”.
- The initrd “/boot/extlinux/extlinux.conf”
(or, indirectly, the initrd might edit the next extlinux.conf in the boot chain)
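As a rough illustration of what adding a boot entry could look like, the sketch below prints a new stanza that reuses the FDT, INITRD, and APPEND lines of the first existing entry and changes only the label and the kernel path. It assumes the stock L4T extlinux.conf layout (LABEL, MENU LABEL, LINUX, INITRD, APPEND keywords); a third party carrier board may differ. The extlinux_entry function and the “cilium” label are hypothetical names.

```shell
#!/bin/sh
# Hypothetical helper: print a new boot stanza that reuses the FDT, INITRD
# and APPEND lines of the first existing entry, changing only the label and
# the kernel path. Nothing is written to disk; the caller decides what to do
# with the output.

# extlinux_entry LABEL IMAGE_PATH CONF_FILE
extlinux_entry() {
    label=$1
    image=$2
    conf=$3
    printf 'LABEL %s\n' "$label"
    printf '      MENU LABEL %s kernel\n' "$label"
    printf '      LINUX %s\n' "$image"
    # Copy the first uncommented FDT, INITRD and APPEND lines verbatim.
    awk '$1 == "FDT"    && !f { print; f = 1 }
         $1 == "INITRD" && !i { print; i = 1 }
         $1 == "APPEND" && !a { print; a = 1 }' "$conf"
}
```

Reviewing the output of extlinux_entry cilium /boot/Image-cilium /boot/extlinux/extlinux.conf before appending it to a backed-up copy of the file keeps the original entry intact. Since the DEFAULT line is untouched, the original kernel stays the default until you pick the new entry.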
As to which location actually requires content, the Image (or Image-cilium) must be on the first “/boot” read in the chain. It is advisable for that same Image to be on the final device, but once the Image is loaded, no other location needs the Image. The same is true for device tree, but normally only the first device reads the device tree (it is still advisable to add the tree to the final media).
My advice is to ask the manufacturer how to add a second boot entry (in addition to the original) for the NVMe work flow. Alternately, you’ll need to ask (for your model) how to add kernel Image and modules for the NVMe case on their carrier board.
I’ve successfully tested out the new image/modules!!
I just re-read your original answer, and everything is so much more clear now 😅. Thank you so much for the overabundance of information! I’ve learned a ton about linux during this process.
I started from scratch, kinda, after being pointed to Kernel Customization. It seems as though that’s geared towards external flashing, but what stood out to me was nvbuild.sh. I don’t think I need steps 5-7? I modified nvbuild.sh to run nconfig, append a suffix to LOCALVERSION, and to run modules_install.
I put together the steps I followed in this Gist.
I discovered kexec when asking for help back on the turing pi discord, and after a bit of struggling (forgot to update /lib/modules), I finally got it to boot!
$ uname -a
Linux jetson1 5.10.120-tegra-cilium #1 SMP PREEMPT Mon Aug 28 01:25:02 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux
I’m not sure which are the steps 5 through 7? If you mean the dtbs target during kernel build, then that is not needed unless you are altering some non-plug-n-play device.
The modules not being updated or being in the wrong place has a high rate of occurrence in kernel install issues (not just on Jetsons). The relationship between “uname -r”, module search path, and CONFIG_LOCALVERSION is such a small thing, but it is a critical part which doesn’t “seem” to look important.
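One way to catch this mismatch before rebooting is to read the release string out of the new Image and check that a matching module directory exists. This is a minimal sketch, assuming an uncompressed arm64 Image (where the “Linux version …” banner is stored as plain text) and GNU grep; image_release and has_modules_for are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helpers: compare the release string embedded in a kernel
# Image with the module directories present under a root filesystem.

# image_release IMAGE -> release string from the "Linux version ..." banner
image_release() {
    # -a treats the binary as text; -o prints only the matching part.
    grep -a -o 'Linux version [^ ]*' "$1" | head -n 1 | cut -d ' ' -f 3
}

# has_modules_for IMAGE [ROOT] -> succeeds if ROOT has a matching module dir
has_modules_for() {
    rel=$(image_release "$1")
    [ -n "$rel" ] && [ -d "${2:-}/lib/modules/$rel" ]
}
```

If has_modules_for "$TEGRA_KERNEL_OUT/arch/arm64/boot/Image" fails, the modules for that kernel have not been installed where it will look for them. The arch/arm64/boot path is the usual kernel build output location, but treat it as an assumption for your build tree.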
Glad it works for you now though.