Custom kernel build on Jetson AGX Xavier

Hello,
I am trying to build a custom kernel on a Jetson AGX Xavier.
I am following the steps given by NVIDIA
(https://docs.nvidia.com/jetson/l4t/index.html#page/Tegra%20Linux%20Driver%20Package%20Development%20Guide/kernel_custom.html#wwpID0E0CE0HA)

  1. I manually downloaded the L4T Driver Package (BSP) Sources.

  2. Then I followed the steps under “Building the NVIDIA Kernel”.
    2.1 I followed step 1:
    $ TEGRA_KERNEL_OUT=<outdir>
    2.2 I downloaded the L4T toolchain and set:
    $ export CROSS_COMPILE=<cross_prefix>
    $ export LOCALVERSION=-tegra

    2.3 I executed:
    $ cd <kernel_source>
    $ mkdir -p $TEGRA_KERNEL_OUT
    $ make ARCH=arm64 O=$TEGRA_KERNEL_OUT tegra_defconfig
    2.4 I wanted to include the Quectel drivers, so I ran make menuconfig to change the kernel
    configuration and saved it.
    2.5 While performing the 4th step, that is “make ARCH=arm64 O=$TEGRA_KERNEL_OUT -j”, I am
    getting the below error.

Can someone suggest how to fix this?

Are you compiling natively on the AGX? If so, then do not use cross-compile directive “ARCH=arm64”.

Hello,

Thank you for the response.
I was able to build the kernel image. I copied the new kernel image into the /boot folder and tried to reboot the system. The reboot is taking more than 7 minutes, and the message “task nvpmodel:5984 blocked for more than 120 seconds” is displayed.

I won’t guarantee it, but it is highly likely that the kernel’s configuration was incorrect.

Something I always suggest is to set up a serial console so that you can interrupt boot to choose a kernel, and to add an alternate boot entry for your test kernel. Then, if it fails like this, you can simply boot the original kernel, and fixing it is a simple file copy. As it stands, you may need to flash again, since the original kernel appears to have been replaced rather than kept alongside a second, alternate boot entry.

When you did place this content in “/boot”, what was the exact operation? Did you copy the new “Image” to “/boot/Image”? Or was there some other step? Much of what I see you’ve done is correct, but the details are scattered in such a way that I can’t guarantee something wasn’t quite right.

The steps I performed are,

  1. racecar@Jetson:~/Downloads/kernel_src/kernel/kernel-4.9$ make menuconfig
  2. racecar@Jetson:~/Downloads/kernel_src/kernel/kernel-4.9$ make
  3. racecar@Jetson:~/Downloads/kernel_src/kernel/kernel-4.9$ sudo make modules_install install
  4. Then I took a backup of original Image present in /boot/Image and removed it from /boot
  5. Then I copied the newly created Image at /Downloads/kernel_src/kernel/kernel-4.9/arch/arm64/boot/Image to /boot
  6. And tried to reboot

Also, to include Gobinet and QMI_WWAN driver, I have modified the /Downloads/kernel_src/kernel/kernel-4.9/drivers/usb/serial/option.c file according to Quectel’s “LTE&5G Linux USB Driver User Guide”.

An Image.sig file was already present in /boot; I am not sure whether that has any impact.
Did I miss anything?

Step 1 is technically correct, except that it would have had to have been step 2. You are missing this:

racecar@Jetson:~/Downloads/kernel_src/kernel/kernel-4.9$ make LOCALVERSION="-tegra" tegra_defconfig

Please note that if you are building right in the source tree, you should also run “make mrproper” to completely clean out any old configuration before starting. Targets like “tegra_defconfig” expect a clean starting source.

I don’t know about the edits; they might be OK, but unless the initial configuration is correct there is no way for the build to work.

Note that “CONFIG_LOCALVERSION” assumes you want the “uname -r” to be the same as the old kernel’s “uname -r”, and thus you are reusing modules. However, for the purpose of testing, you might want to separate original content from experimental content, so you might alter this to something like “-testing” (would result in a new module directory).
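To make the ordering concrete, here is a minimal sketch of a native build on the Jetson, assuming you are in the kernel-4.9 source directory. The helper name kernel_build_steps is hypothetical; it only prints the commands so they can be reviewed and then run by hand. Exporting LOCALVERSION once (as in the NVIDIA steps quoted earlier in this thread) instead of passing it to a single make call is an assumption on my part about the simplest way to apply the suffix to every step.

```shell
# Hypothetical helper: print the native build sequence for a given suffix.
# It does not run anything; review the output and run the lines by hand.
kernel_build_steps() {
    suffix="${1:--tegra}"   # defaults to the stock "-tegra" suffix
    cat <<EOF
export LOCALVERSION=$suffix
make mrproper
make tegra_defconfig
make menuconfig
make -j\$(nproc)
sudo make modules_install
EOF
}

# Print the sequence for an experimental kernel with its own module directory
kernel_build_steps -testing
```

With “-testing” the modules would be installed under a separate /lib/modules directory, so the original kernel's modules are left untouched. Copying the resulting Image into /boot is a separate step.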

The signed image only matters if you are placing this in a partition and do not use the “/boot” content. Mostly you can ignore this unless you have burned security fuses (in which case “/boot” content will be refused, and only signed partition content will be allowed).

OK, so currently I am planning to boot the Jetson AGX Xavier from an external SSD. But one thing I did not understand: with which command do I have to use “-testing”?

After the custom kernel build, the Jetson AGX Xavier is not booting.

I have an external hard disk which holds a backup of the original kernel Image.

How can I recover the operating system with the original kernel?

The CONFIG_LOCALVERSION is appended to the response from the command “uname -r”. The base kernel release version is the prefix. As an example, if the kernel source is version 4.9.140, and then during build CONFIG_LOCALVERSION is “-tegra”, this causes “uname -r” to respond as “4.9.140-tegra”.
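As a small illustration of that rule, the sketch below just concatenates the two parts. The function name kernel_release is hypothetical and the version numbers are the ones from the example above.

```shell
# Hypothetical helper: compose the string that "uname -r" would report
# from the base kernel version and the CONFIG_LOCALVERSION suffix.
kernel_release() {
    base_version="$1"
    local_version="$2"
    printf '%s%s\n' "$base_version" "$local_version"
}

kernel_release 4.9.140 -tegra      # prints 4.9.140-tegra
kernel_release 4.9.140 -testing    # prints 4.9.140-testing
```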

The location where the kernel searches for loadable modules (possibly including critical drivers) is based on the output of “uname -r”. A kernel will always search for modules at:
/lib/modules/$(uname -r)/kernel

If the modules are there, then the kernel is happy and will load what it needs. If the modules don’t exist there, then everything in module format will fail. The modules at such a location, when “uname -r” is valid, must also be compatible with that kernel (for example you couldn’t mix modules compiled from a different release into that directory).
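A quick way to check this before rebooting is sketched below. The helper names modules_dir and check_modules are hypothetical, and the optional second argument (a root prefix) is my own addition so that the same check can be run against a mounted copy of the rootfs instead of the running system.

```shell
# Hypothetical helper: print the module search directory for a release.
# $1 = kernel release (defaults to the running kernel's "uname -r")
# $2 = optional root prefix, e.g. the mount point of a rootfs image
modules_dir() {
    release="${1:-$(uname -r)}"
    root="${2:-}"
    printf '%s/lib/modules/%s/kernel\n' "$root" "$release"
}

# Hypothetical helper: report whether that directory exists.
check_modules() {
    dir=$(modules_dir "$@")
    if [ -d "$dir" ]; then
        echo "found: $dir"
    else
        echo "missing: $dir"
        return 1
    fi
}
```

For example, “check_modules 4.9.140-testing” would tell you whether modules for a kernel built with the “-testing” suffix have been installed. It only checks that the directory exists, not that the modules in it are compatible with the kernel.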

Whenever you replace the “/boot/Image” you are replacing the kernel. Whenever you add or remove files from “/lib/modules/$(uname -r)/kernel”, then the original kernel will use whatever it finds there. If you don’t replace the Image file, then all you need to do is add files to “/lib/modules/$(uname -r)/kernel” (in the correct subdirectory), and it should work. If you replace “/boot/Image”, and don’t maintain the same “uname -r”, then you must install 100% of all drivers into the correct location.

Quite often the system will boot just fine if some modules cannot be found (you’d see errors in boot logs, but it wouldn’t be unusual for serial console or other login methods to still work). Quite often if the Image file is not correct, then the system will fail to boot.

When you flash an AGX the arguments passed to the flash software can change where the AGX looks for the Image file. Even if an SSD is used for rootfs, it might be that the “/boot” content is still read from the eMMC’s “/boot”. There is a pointer in the boot content which can change which “/boot” content is used.

If you overwrote the “/boot/Image” file, and did not take steps to back this up or keep an “original” boot entry, then you’d have to reflash the system. You could clone, update just the Image file, and then reflash with the fixed clone (takes a lot of time and disk space, but is reliable).

I recommend that anyone updating the kernel Image not replace “/boot/Image”, but instead create an alternate entry, e.g., name it “/boot/Image-testing”, add a second boot entry in “/boot/extlinux/extlinux.conf”, and then pick the alternate during boot using serial console. This way the original is still there if something goes wrong, and all you have to do to fix it is boot to the original entry.
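One way to add such an entry is to duplicate the existing one and change only the label and the kernel path. The sketch below does that; it assumes the existing entry is labeled “primary” and points at “/boot/Image” (which is what I would expect on a stock install, but check your own file). The function name add_testing_entry is hypothetical.

```shell
# Hypothetical helper: append a "testing" boot entry to an extlinux.conf
# by copying the existing "primary" stanza (assumed label) and pointing
# its LINUX line at /boot/Image-testing. Comment lines are not copied.
add_testing_entry() {
    conf="$1"
    stanza=$(awk '
        $1 == "LABEL" { copying = ($2 == "primary") }
        copying && !/^#/ { print }
    ' "$conf")
    if [ -z "$stanza" ]; then
        echo "no primary entry found in $conf" >&2
        return 1
    fi
    printf '\n%s\n' "$stanza" |
        sed -e 's/^LABEL primary/LABEL testing/' \
            -e 's/MENU LABEL .*/MENU LABEL testing kernel/' \
            -e 's|\(LINUX[[:space:]]*\)/boot/Image[[:space:]]*$|\1/boot/Image-testing|' \
        >> "$conf"
}
```

Try it on a copy of the file first and compare the result with the original. On the Jetson itself the file is owned by root, so the function would need to run as root, and keeping a backup copy of extlinux.conf beforehand is a good idea. The DEFAULT line is left alone, so the original kernel still boots unless you pick “testing” from the serial console.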

A typical clone will need the AGX in recovery mode, and on the Linux host PC you’d want at least 50GB of spare space, and then run this command (takes a long time) from the “Linux_for_Tegra/” directory:
sudo ./flash.sh -r -k APP -G my_backup.img jetson-agx-xavier-devkit

This would produce a “sparse” file, “my_backup.img”, and a “raw” file, “my_backup.img.raw”. The raw file can be loopback mounted (e.g., “mount -o loop my_backup.img.raw /mnt”), edited (such as replacing the Image file), or used to flash with (copy “my_backup.img.raw” to “Linux_for_Tegra/bootloader/system.img” and flash with the “-r” option to reuse that partition image), and thus restore your original system. Just make sure the PC’s flash software is the same release which originally created that rootfs.
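Put together, the restore could look roughly like the sketch below, run on the host PC from the “Linux_for_Tegra/” directory. It defaults to a dry run that only prints each command. The names run and restore_from_clone are hypothetical, the path of the known-good Image is whatever backup you have, and the “mmcblk0p1” argument to flash.sh is my assumption for a rootfs on eMMC.

```shell
# Dry run by default: print commands instead of executing them.
# Set DRY_RUN=0 only after reviewing the printed steps.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper. Arguments:
#   $1 = raw clone (e.g. my_backup.img.raw)
#   $2 = known-good kernel Image to put back
#   $3 = mount point (defaults to /mnt)
restore_from_clone() {
    raw="$1"
    good_image="$2"
    mnt="${3:-/mnt}"
    run sudo mount -o loop "$raw" "$mnt"
    run sudo cp "$good_image" "$mnt/boot/Image"
    run sudo umount "$mnt"
    run sudo cp "$raw" bootloader/system.img
    # "-r" reuses bootloader/system.img; the AGX must be in recovery mode.
    # "mmcblk0p1" is an assumption (rootfs on eMMC).
    run sudo ./flash.sh -r jetson-agx-xavier-devkit mmcblk0p1
}
```

For example, “restore_from_clone my_backup.img.raw Image.orig” would print the five steps for review, where Image.orig stands for your backup of the original kernel.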

Hello,

When I try to reboot the Jetson, it stops and displays

‘A stop job is running for nvpmodel service (3min 16sec/4min 30s)’

What is the issue? Has anyone faced the same issue?

I couldn’t say for sure, but know that the nvpmodel service depends on kernel features being exposed in the “/sys” files (which are not real files, they are kernel drivers pretending to be files as a method of interface to the user space world). It is possible that you left out a configuration in your new kernel (which is why it is important to start your kernel build with an exact match to the default kernel before making any adjustments).
