Was this performed on a native build?
If so, then this is perhaps the reason for failure. You should never name the ARCH=arm64
on a native build. The above causes changes (not sure if it should or should not cause changes, but it does). When building natively, can you try again with a clean build which leaves out the ARCH
in all build commands?
It sounds like mostly what you are doing is correct, except that explicitly naming ARCH
when natively compiling is likely to cause a boot failure.
Do beware that if CONFIG_LOCALVERSION
differs, then modules will not be found. You would then have to build and install all modules as well. Unless you have a reason to change CONFIG_LOCALVERSION
(and thus change “uname -r
”) this should be made as a match. If indeed you really want a new “uname -r
” in order to search for modules at a new location, then instead of leaving it blank you should create a new name, e.g., “-tegra_test
” as an example.
About what follows…
Beware that most of the example which follows is unnecessary if every driver is integrated directly into the kernel Image
file, and does not exist in the form of a module. However, there are reasons why modules exist, so I would not normally consider making every feature as an Image-integrated non-module to be practical.
The explanation below is very long, but this is something worth documenting for other people looking for help on kernel module issues. Sorry, this all surrounds module loading, but there are so many questions related to this in the forums it is worth posting this. This probably goes far beyond answering your particular question, and will also likely still leave you with questions. Keep reading at your own risk of exhaustion boredom! This is designed not as a recipe for using new rootfs media, but is instead the kind of detail which can be used to debug or engineer new rootfs media (concentrating on module load requirements…official docs give recipes for alternate rootfs media types).
For your case it is “simplest” to install modules to the eMMC and simultaneously to SSD (then your kernel won’t care if any given module loads from SSD or eMMC…it would load the same exact content from either medium). Below I’ll talk about different filesystem types simply because it is a good way to demonstrate module loading issues. You’re not using an odd filesystem type, but it is easier show why and how module load (a simple topic) is greatly complicated when split into two different pieces of hardware (e.g., eMMC versus SSD, but it could just as easily be ext4
versus the XFS filesystem types).
Understanding module loading is fairly simple, but the different ways people use to mount the rootfs complicate module loading (an SSD as rootfs is one of those cases). If you don’t have an initial ramdisk (initrd
, a kind of special “utility adapter”), then at the moment the kernel loads some modules might also try to load from whatever is mounted at that instant in time (“some” because modules load dynamically upon some dependency being detected). The subtle thing about that is that not all modules will always load and some modules might load at a later time after your new partition is mounted and overriding the original module directory. If both “old” (the eMMC module directory) and “new” (the SSD) have the same content, then you won’t notice a difference. Replacing a module with an exact duplicate at another location is seamless. If you know which modules load immediately, and put those on eMMC, and then place only the later loading modules on SSD, then this too works…but then you must know which loads when.
So a question arises: When would a module load prior to the new SSD filesystem being mounted if the SSD mounts quite early? The answer is that sometimes modules are needed for the kernel to access part of the hardware, including some partitions which might have a different filesystem type. One Linux filesystem type is XFS, so imagine your eMMC is formatted for XFS…then the kernel Image
file itself would fail to load because the bootloader itself cannot understand XFS (the bootloader itself does not have an XFS driver, it only has ext4
and initrd
drivers). In what follows I’m assuming the XFS driver is in the format of a module, and not built into the actual Image
file for the Linux kernel. Ok, so make the eMMC ext4
for the “/boot
” kernel Image
location so the bootloader can read it (the bootloader understands ext4
), and thus the “/boot
” can be used to read the kernel and place it in RAM at the right location. The “/boot
” content using ext4
is pretty much mandatory unless you use an initrd
, but will mention that later. This is a limitation of the bootloader, and not a limitation of the Linux kernel.
The bootloader itself has its own filesystem drivers, and these are required if and only if the bootloader is reading from a formatted partition, e.g., ext4
. ext4
is the default and you are guaranteed the bootloader can read this filesystem type, and thus can read “/boot/Image
” on an ext4
filesystem. This is how the kernel is read unless the kernel is in a partition. If the kernel is read from a partition, then it is binary data with no underlying filesystem, and thus reading partitions by the bootloader only needs a driver for the controller, e.g., a SATA or eMMC driver (the eMMC controller driver exists within the bootloader, and some other external media drivers also exist, e.g., SATA over USB). You’re using an ext4
filesystem though, so you can ignore loading from a partition (someone who has burned security fuses must load through a signed partition, so there is a case when “/boot
” load options go away).
The bootloader never reads modules. This is performed by the kernel which is currently running. The kernel was placed in RAM and execution transfers to the kernel (the bootloader has as its one goal to overwrite itself and die by bringing the kernel to life). If the code for reading the filesystem is integrated into the kernel Image
, then the kernel never has an issue with that filesystem type. If the code for any hardware access or understanding a filesystem type is in the form of a module, then the module must be loaded prior to accessing the hardware or filesystem type. If you have a filesystem driver for reading the filesystem type of the module directory is not already loaded, then the module to read the filesystem type cannot be read. This becomes a “catch 22” or “chicken and the egg” dilemma. The module can’t be loaded because the filesystem type is not understood, and the ability to understand that filesystem type is in the module which cannot be loaded.
You could substitute “drivers for your SSD” and “drivers for eMMC” in the above. There still remains the central point of the timing of when drivers are needed versus where they are found being available (i.e., there isn’t much difference between filesystem drivers and hardware drivers when it comes to “not working because they don’t yet exist”). I’ll continue though with the filesystem type as the example.
Not all drivers are needed for boot. For example, suppose you have an audio driver for some nice 7.1 surround sound audio card. This has nothing to do with the chain of boot to Linux kernel load to ability to load modules. Such a module could exist on just the SSD if you desire.
As an example, if you were to integrate a non-module format sound driver into the Linux kernel, then the kernel size would grow significantly, but you’d never have to worry about having the driver available…one could play sounds and audio even before the rest of the operating system completes loading (all modules could be missing and the integrated driver would still work).
If you happen to be building custom audio appliances and you are selling and servicing these units yourself, then integrating directly (versus module format) would be a good tradeoff since you’d always want that content anyway (there would be no “unnecessary” bloat). Well, until there is a patch. Then you’d have to replace the entire kernel and not just a module (there is a tradeoff if your module will evolve over time…then the module and initrd
start looking more attractive versus integration). If such a driver is being made available in Linux in general, but only 1% of the users have this hardware, then 99% would have a bloated kernel for no reason. There is a similar issue that much of the embedded system hardware is custom for that particular hardware. Should the driver be made available by the mainstream Linux distribution, then almost certainly you’d only want this built as a module to create the driver as an option (the Realtek ethernet drivers are an example…they’re common, but not needed for many people). ext4
is typically not a module format because an extraordinary percentage of Linux systems load this or an initrd
at the start. ext4
is more or less the gold standard of initial filesystem types on any generic Linux install. Other filesystem types tend to be installed as a module.
Some features cannot be a module. This is usually for invasive content, e.g., virtual memory swap is such a case which cannot be a module format. Some features only exist as a module, and don’t have an integration option (which could be for any number of reasons, e.g., not being a GPL license or being experimental, or the author simply did not take the time to write compatibility in integrated format).
Here’s something which might seem to be a complication, but is possibly a simplification in your case: The initrd
. The initrd
is the initial ramdisk, and is just a compressed tar
archive which is unpacked into RAM with a very simple “tree” structure which can be treated as a filesystem. The initrd
is very streamlined and lacks many of the abilities of a real filesystem, but it has no trouble being read as files in directories. When the initrd
is loaded instead of the actual ext4
hardware device the entire filesystem is in RAM instead of on disk; this leads to one very important observation: The kernel Image
is still the file which is read from “/boot
” (if security fuses are not burned and extlinux.conf
says to use the Image
file in “/boot
”), but all of the content surrounding the kernel (including init
and kernel modules) are on on the initrd
filesystem. There is only one kernel Image
, but other content might exist in more than one place. If ext4
or XFS filesystems are not part of the kernel Image
, then those filesystems will still work if those modules are in the initrd
module location. The kernel won’t care that the filesystem is in RAM and not on a disk. Putting the absolute minimum dependencies for modules in the initrd
does the job quite well. Your case might not require an initrd
, but if something triggers the need to load a module in order to use the SSD, then the initrd
having the module is the simplest solution.
Once those modules are loaded from the initrd
(if using one) the init
process will eventually tell the kernel to load any module type drivers needed for any kind of disk (e.g., SSD or NVMe if not integrated into the Image
). It won’t matter if the driver is for SATA over USB, or SSD on an m.2 slot, or any other crazy scheme (e.g., iSCSI over gigabit which would require both ethernet and iSCSI modules to be loaded into the initrd
prior to mounting the filesystem). Should the modules be available as a module format in the initrd
the kernel is guaranteed to be able to find and load these prior to using that hardware or filesystem type requirement occurs. The kernel simply mounts the new rootfs of either the eMMC or SSD on some temporary mount point (the initrd
modules tell the kernel how to do this), and then performs a pivot_root
type operation to transfer the concept of “root of the filesystem” to this new mount point (the temporary mount point is renamed as “/
”). That new device becomes the rootfs, and the life of the initrd
goes away (RAM holding the initrd
is released). Like a bootloader the initrd
has as its only goal to eventually overwrite itself some other file system. Then your SSD becomes the rootfs and the initrd
deallocates. This is your adapter between things required to boot using modules and modules which are not yet available. This adapts module requirements between two points in time during boot (the timing of module availability is altered).
Do note that the old eMMC “/boot
” content still exists since it is on a disk. Whether or not you see that content depends on what gets mounted after the pivot_root
. If the “/etc/fstab
” of the SSD says to mount some part of the eMMC on the new pivot_root
“/boot
” (owned by SSD) mount point, then “/boot
” remains with the content of the eMMC (fstab
told mount to place the eMMC version there and to hide any SSD version of “/boot
”). On the other hand, if you just pivot_root
and never mount the eMMC on “/boot
”, then the “/boot
” content is entirely from the SSD. Should the “extlinux.conf
” of the two partitions differ, then it would have been read from the eMMC version, and changes to “extlinux.conf
” on the SSD will have no effect on boot (there is an exception which I’ll mention later, but this is the “simple” case). This is why you can have two extlinux.conf files and edits might not do what you think. It is common practice on a PC to have “/boot
” and “/
” (the rootfs) on separate partitions, but this is not used (so far as I know) on Jetsons.
Note that sometimes device tree is in “/boot
”, and that device tree content is used for each driver at the moment the driver loads as if the tree is an argument passed to the driver. However, if a device tree is read into RAM (as reflected in “/proc/device-tree
”), consider that what is loaded might differ from the tree in “/boot
” if the tree was loaded with one such tree, followed by changing the “/boot
” between that of eMMC and SSD. The “/proc/device-tree
” content is the definitive source of knowing what was actually loaded, not the one on the filesystem.
When a Jetson is flashed it has non-rootfs/non-filesystem content which is more or less a pointer to finding “extlinux.conf
” (this pointer is early boot stage binary data, and is not part of the rootfs). There are some macros which might also be associated whereby more than one extlinux.conf
location is searched, and one location has a higher priority than other locations (the U-Boot console is useful for examining those macros for models using U-Boot and not just CBoot). I’m not saying the below flash command works for all hardware, or even this hardware, but it is an illustration that the “priority” extlinux.conf
location can be changed via flash parameters:
sudo ./flash.sh jetson-tx1 mmcblk0p1
sudo ./flash.sh jetson-tx1 sda1
The above concept demonstrates how a priority device might be determined at the moment of flash. What it doesn’t make obvious is that the boot content itself (prior to Linux loading, excluding extlinux.conf
) will come from eMMC binary content (or on SD card models of Jetson this is from the QSPI memory on the module storing this “pointer” and macro content). The extlinux.conf
itself might name another device, e.g, one might load the eMMC version of extlinux.conf
, and that extlinux.conf
may have an entry pointing at sda1
and hand off to a new extlinux.conf
, or the pointer might entirely skip the eMMC extlinux.conf
and directly load the sda1
version of extlinux.conf
. It just depends on how flash was set up and the order in which boot media are accessed, plus any twist the extlinux.conf
entry adds. Thus there may be one extlinux.conf
determining the final boot content, or two such extlinux.conf
files, and the one you edit or look at might be right or wrong depending on which part of boot you are speaking of. If editing an extlinux.conf
has no effect, then perhaps you are editing the wrong one. Perhaps one has a different module requirement than the others. If this occurs, then module load media location is probably also in question.
Can you believe all of that is to say “it all depends as to whether modules need to be on the SSD or the eMMC or both”? It isn’t a recipe, but it should help the patient find out what is wrong with their alternate boot media boot process. Don’t forget that the official docs offer recipes for alternate boot media.