Compile & install customized Kernel directly on Orin Developer Kit

I’m trying to follow the guide here: Kernel Customization — Jetson Linux Developer Guide documentation (nvidia.com), to build a new kernel (in my case with CONFIG_PCIE_PTM enabled).
If I understand the nvbuild.sh script correctly, it always recreates the .config file in the kernel build output directory that I specify with the -o parameter.
If that is the case, what is the suggested way to change kernel build parameters and then build, install, and boot the new kernel directly on the Orin?
I tried splitting up the nvbuild.sh script: first I create the default .config using the command from the script, then I edit the kernel build parameters by issuing make nconfig, and then I run the rest of the commands found in nvbuild.sh.

This basically seems to work, but I fear I am missing something, since my display (attached to the DP connector) always stays off even though the system boots the new kernel and I can log in via the serial console.

After I have unpacked the source code to ~/Linux_for_Tegra/source/public
and created the kernel build output directory: ~/Linux_for_Tegra/source/public/kernel_out

in principle I do the following (a command sketch follows this list):

  1. Use only the part of nvbuild.sh that creates the default kernel config in the kernel_out directory.
  2. Edit the generated default config using …
    make -C kernel/kernel-5.10/ O=$PWD/kernel_out/ nconfig
    The only parameter I changed was: CONFIG_PCIE_PTM=y
  3. Use the other parts from nvbuild.sh (without creating the default .config)
  4. Copy newly built Kernel to /boot/Image
  5. Copy newly built nvgpu.ko to /usr/lib/modules/5.10.104-tegra/kernel/drivers/gpu/nvgpu
  6. Copy everything from kernel_out/arch/arm64/boot/dts/nvidia/ to /boot/dtb/
  7. reboot
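
In terms of commands, the sketch mentioned above is roughly this (assuming the kernel-5.10 source under ~/Linux_for_Tegra/source/public and a native build on the Orin; adjust paths to your setup):

  # Run from ~/Linux_for_Tegra/source/public
  # Step 1: default config (what nvbuild.sh creates first)
  make -C kernel/kernel-5.10/ O=$PWD/kernel_out/ tegra_defconfig
  # Step 2: edit the config (set CONFIG_PCIE_PTM=y)
  make -C kernel/kernel-5.10/ O=$PWD/kernel_out/ nconfig
  # Step 3: build the kernel image, device trees, and modules
  make -C kernel/kernel-5.10/ O=$PWD/kernel_out/ -j"$(nproc)" Image dtbs modules
  # Steps 4 and 6: install kernel and device trees (as root), then reboot
  sudo cp kernel_out/arch/arm64/boot/Image /boot/Image
  sudo cp kernel_out/arch/arm64/boot/dts/nvidia/*.dtb /boot/dtb/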

Result:

  • No signal to external screen attached to DisplayPort connector
  • Tons of error messages on the serial console like:
[   22.303878] nvidia: Unknown symbol pci_disable_device (err -22)
[   22.310288] nvidia: disagrees about version of symbol pci_stop_and_remove_bus_device
[   22.318321] nvidia: Unknown symbol pci_stop_and_remove_bus_device (err -22)
[   22.325553] nvidia: disagrees about version of symbol pci_read_config_byte
[   22.332671] nvidia: Unknown symbol pci_read_config_byte (err -22)
[   22.339018] nvidia: disagrees about version of symbol pci_write_config_word
[   22.346211] nvidia: Unknown symbol pci_write_config_word (err -22)

I don’t know much about the build script, but I’ll comment on some of this.

  • Native compile works well, although you could run out of space if not careful.
  • Your basic build command seems like it should work. However, there are a few details you might have missed (or maybe not):
    • Before you add your edit, you should first configure to match the existing system. If this is a default system, then the “tegra_defconfig” target is good (you’d use the “O=/some/where” with that too).
    • If you are reusing the integrated kernel Image, and only adding a module, then you’d also want to set “CONFIG_LOCALVERSION=-tegra”.
    • If you are not reusing the integrated kernel Image file, and instead replacing it, then you’d want a different CONFIG_LOCALVERSION. Example: “CONFIG_LOCALVERSION=-pcie-ptm”.
    • It is good to always output to an alternate location (such as what you are doing with the kernel_out/ subdirectory). If the original source code is to be used, then before anything else you probably want the existing kernel source to be pristine (so that only changes in the alternate location exist). You can do that like this (notice there is no “O=/some/where” here; see the combined sketch after this list):
      sudo make mrproper
  • Hint: I used sudo for mrproper because I like the source code itself to be owned by root, and I compile as non-root. This means there cannot be accidental configuration issues between two different builds.
  • Hint, as an exception to the previous hint: Sometimes people replace the linux headers download with the actual source code. In that case the above is still mostly true, but after the mrproper the root-owned source would need to be set to the configuration of the existing system.
  • When you have a given kernel version, and you set CONFIG_LOCALVERSION to something like “-test”, then modules are searched for here (this isn’t your kernel source release version, but it is a contrived example):
    • Default config modules at: /lib/modules/4.9.140-tegra/kernel
    • Changing to CONFIG_LOCALVERSION=-test: /lib/modules/4.9.140-test/kernel
  • Be sure that when you add or alter features you use a configuration editor, and normally you would not directly edit the .config file (you can directly edit CONFIG_LOCALVERSION because it has no dependencies).
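
Putting those bullets together, the configure stage might look like this (a sketch using the directory layout from the question; “-pcie-ptm” is only an example suffix):

  # From ~/Linux_for_Tegra/source/public
  sudo make -C kernel/kernel-5.10/ mrproper        # pristine source; note: no O= here
  make -C kernel/kernel-5.10/ O=$PWD/kernel_out/ tegra_defconfig
  # CONFIG_LOCALVERSION has no dependencies, so it can be set directly,
  # e.g. with the kernel's own scripts/config helper (or a menu editor):
  kernel/kernel-5.10/scripts/config --file kernel_out/.config --set-str LOCALVERSION "-pcie-ptm"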

When the only thing you do is add a module, and it is configured against the existing kernel config, you just copy the module to the right place under “/lib/modules/$(uname -r)/kernel”. When you alter the base kernel, that implies replacing the kernel itself and all modules, using a new CONFIG_LOCALVERSION. It is better to leave the old kernel there as a backup and rename something (either rename Image to something like Image-original, or better yet, name the new kernel something like Image-new). The procedure is a bit muddy in the L4T R35.x series due to how the initrd is set up, and apparently the nvidia-kernel package can overwrite manual changes you make to extlinux.conf.
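
For reference, a second boot entry in “/boot/extlinux/extlinux.conf” might look something like this sketch (the label and the Image-pcie-ptm name are examples; copy the APPEND line verbatim from your existing default entry, and keep the default entry in place as a fallback you can select over serial console):

  LABEL pcie-ptm
        MENU LABEL custom kernel with PCIe PTM
        LINUX /boot/Image-pcie-ptm
        INITRD /boot/initrd
        APPEND ${cbootargs} root=/dev/mmcblk0p1 rw rootwait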

Thanks a lot for this thorough explanation, @linuxdev.
So, my case is that I actually have to change something in the kernel (as said, set CONFIG_PCIE_PTM=y in the .config). This means I will end up with a newly built kernel Image file, which I will add as a second kernel image in extlinux.conf.
If I understood you (and the other documentation so far) correctly, I should use a dedicated CONFIG_LOCALVERSION string for this new kernel image.
This will result in another modules location under /lib/modules/.

What about all the modules then? Will they be built automatically if I just issue the make command for the kernel, or do I have to issue some subsequent make commands after the Image is built in order to build and install the modules? (I’ve read something about issuing make with targets like modules_prepare, modules, etc.)
Can you/anyone help with these commands?

Questions:

  • How did you set “CONFIG_PCIE_PTM=y”?
  • Did you use a menu editor?
  • Before doing this, did you copy the existing configuration, or perhaps run the “tegra_defconfig” target?
  • Did you set CONFIG_LOCALVERSION?
  • Did you also install all modules, and if so, where to?

I used a menu editor (once menuconfig, and on later tries nconfig) to set CONFIG_PCIE_PTM=y. However, I diffed the original .config against the changed one, and only that option was changed by the menu editor.

I don’t know which I did last, but I have already tried both ways: copying the existing /proc/config.gz (after unpacking it, of course) as well as generating a config via the tegra_defconfig target.

I think I did set CONFIG_LOCALVERSION to -tegra (I’m not at the Jetson Orin right now to verify).

No, I did not install all modules. That’s exactly what I’m wondering about: how do I install the modules? Can you give me a hint on the one-two-three commands I need after the actual kernel compilation? Thanks!

There is a lot that could be said about “good practices”. In the case where you added a feature which is not in the form of a module, this might invalidate all modules against that new configuration.

Consider the output of the command “uname -r”. The prefix is the source code version, and the suffix is the string from CONFIG_LOCALVERSION. In a contrived example, if your kernel version is 5.15.0, and if your CONFIG_LOCALVERSION is set like this:
CONFIG_LOCALVERSION=-tegra
…then “uname -r” would be 5.15.0-tegra.

A kernel looks for its modules at:
/lib/modules/$(uname -r)/kernel

If your “uname -r” did not change, then it is looking for the same modules in the same location the previous kernel used. Considering the Image file itself changed, there is a strong chance modules will have issues loading. So in that case you are advised to use a different CONFIG_LOCALVERSION (e.g., “CONFIG_LOCALVERSION=-pcie-ptm”). Then the modules would be added to the subdirectories of (adjust for your actual kernel version) “/lib/modules/5.15.0-pcie-ptm”. You could leave the old kernel in place as a backup which can load the old modules. The new kernel could be named something like Image-pcie-ptm as a reminder.
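
A quick sanity check after booting the new kernel (using the example names above):

  uname -r                              # expect: 5.15.0-pcie-ptm
  ls /lib/modules/$(uname -r)/kernel    # the tree this kernel searches for modules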

When you build modules for a kernel they are in subdirectories named after the source for that specific module or driver. A contrived example is that perhaps you have built (relative to the top of the kernel source) “drivers/net/gizmo.ko”. In that case, assuming my example “uname -r”, this would end up located here:
/lib/modules/5.15.0-pcie-ptm/kernel/drivers/net/gizmo.ko

I do not recommend having the kernel build place everything directly in its final location (and in fact you cannot if cross compiling). If you use the “O=/some/where” build option, then intermediate output goes to that location (and the configuration is read from that alternate location). You can combine this with “INSTALL_MOD_PATH=/some/where/else/for/modules”, and the entire tree of only what gets installed goes into that path. You can recursively copy this into the (example) /lib/modules/5.15.0-pcie-ptm/kernel, and the new kernel will find those modules and they will be compatible.
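
As a sketch of that flow (directory names are examples; the O= location must be the same one used during configuration, and the version directory must match what your build actually produced):

  make -C kernel/kernel-5.10/ O=$PWD/kernel_out/ -j"$(nproc)" modules
  make -C kernel/kernel-5.10/ O=$PWD/kernel_out/ INSTALL_MOD_PATH=$PWD/modules_out/ modules_install
  # The staged tree lands in modules_out/lib/modules/<version>/; copy it into place:
  sudo cp -a modules_out/lib/modules/5.15.0-pcie-ptm /lib/modules/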

This is for native compile (if you do this, be careful to have enough disk space); a few other options are added for cross compile. Here is a useful cheat sheet (you can ignore the dtbs and firmware in this case):
kern_setup.sh (2.3 KB)

The target in that script for installing modules gives you a replica of the tree of files as it would go into the “/lib/modules/$(uname -r)/kernel”.

Just set that to executable, and get the notes via “./kern_setup.sh”. That example goes into a lot of separate build locations; it isn’t really necessary to split things up into all of those locations, but its purpose is to illustrate.

Note that one has to start with a configuration which matches the existing kernel, and only then do you edit to change the new configuration. If your Jetson is default to start with, then that can be via the “tegra_defconfig” target. If not, then it can be from the “/proc/config.gz” (after decompress and move to name “.config”). In all cases you’d have to set the CONFIG_LOCALVERSION because this is not set via either config.gz or tegra_defconfig.
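
For the “/proc/config.gz” route, that might look like this (a sketch, run on the Jetson itself, with paths as in the question):

  zcat /proc/config.gz > kernel_out/.config
  # Sync the config with the source tree, taking defaults for any new options:
  make -C kernel/kernel-5.10/ O=$PWD/kernel_out/ olddefconfig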

Incidentally, any modules which are required for boot have to go into the initrd when using this scheme (Orins do boot with an initrd, though most modules are not needed in it). If a module is required for reading the filesystem, and it is not integrated within the Image, then you run into a bit of a “chicken and egg” dilemma; the initrd is how you get around that. For example, if your filesystem type is XFS, but you only have ext4 built into the Image, then you’d need to load the XFS module…and if the XFS module is itself on XFS, you have a problem. The initrd is a very simple minimal root filesystem which more or less contains only modules and firmware; as its last command, it pivots to the actual root filesystem and starts executing init there.
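
If you ever need to check what a given initrd contains, here is a sketch (this assumes the common gzip-compressed cpio format; verify the initrd path against the INITRD line in your extlinux.conf):

  mkdir /tmp/initrd-contents && cd /tmp/initrd-contents
  gunzip -c /boot/initrd | cpio -idm
  find . -name '*.ko'      # modules packed into the initrd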
