How to add a driver module to the kernel in version 36.4?

Hello, I have a problem now. I need to add a driver module in the kernel, but I don’t see the relevant operation process in the document.

I tried modifying the defconfig in “Linux_for_Tegra/source/kernel/kernel-jammy-src/arch/arm64/configs” to add the new driver module, but it ultimately failed. I found that a “.config” is generated in “Linux_for_Tegra/source/kernel/kernel-jammy-src”; the generated file may change and conflict with my edits.

Now I need you to tell me a correct process to add a driver module in the kernel, and at the same time to ensure that my subsequent flashing will not have problems, thank you.

Start by noting the output of “uname -r” on your running system in which the existing kernel will load that module.

There will be a prefix and a suffix to that output. The prefix is the kernel source code version, and the suffix is something set at compile time via the CONFIG_LOCALVERSION config (a string). For NVIDIA, the default will be “-tegra”. Make note of that suffix, especially if it isn’t -tegra. You might mention this in any forum thread when you work on a kernel.
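As a minimal sketch of that split, assuming the release string has the form of a base version followed by a suffix starting at the first “-” (split_release is a hypothetical helper name, used only for illustration):

```shell
#!/bin/sh
# Split a kernel release string (as printed by "uname -r") into the
# source version prefix and the local-version suffix.
# split_release is a hypothetical helper name, not an NVIDIA tool.
split_release() {
    release="$1"
    base="${release%%-*}"        # everything before the first "-"
    suffix="${release#"$base"}"  # the remainder, e.g. "-tegra"
    printf 'base=%s suffix=%s\n' "$base" "$suffix"
}

# Report the values for the kernel that is running right now.
split_release "$(uname -r)"
```

On a stock R36.4 Jetson this would be expected to report a suffix of “-tegra”; on other distributions the part after the first “-” can contain more than CONFIG_LOCALVERSION.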

Start by matching the configuration of the running system. I don’t know which release you are working with (see “head -n 1 /etc/nv_tegra_release”), which you should mention in the forum thread. You must use the same source code for producing the module that was used to create the running kernel. You can get this from the L4T release web page for the release which the nv_tegra_release file shows. Go here to get the specific files:
https://developer.nvidia.com/linux-tegra

For reference, in an L4T R35.x or earlier release, the default target for initial configuration is tegra_defconfig. For L4T R36.x or newer the default target for initial configuration is defconfig. These are “make” targets which create a .config file. If you were to directly edit or start configuration without those initial steps, then you’re building a wildly different kernel since the configuration is undefined until you give config that starting place.

Alternatively, a running kernel on a Jetson produces this file:
/proc/config.gz
(you’d copy it somewhere else, and then use gunzip on it; then rename it as .config if you want to start with this; location to copy to is the kernel source code output location)

Except for the value of CONFIG_LOCALVERSION (that’s the “-tegra” suffix), that file is an exact match of the running kernel (it isn’t a real file, it is the kernel itself pretending to be a file and showing its configuration). Because a “make defconfig” or “make tegra_defconfig” is what NVIDIA uses to start with, followed by setting CONFIG_LOCALVERSION to “-tegra”, then technically you could start with either the default config target or you could copy the config.gz (after gunzip and rename) to get exactly the same thing. However, if your currently running kernel is modified, then only the /proc/config.gz will be accurate. Just remember you must also set CONFIG_LOCALVERSION.
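The copy, gunzip, and rename steps for /proc/config.gz could be sketched like this. The helper name seed_config and the output directory are assumptions for illustration; the second argument only exists so a different source file can be named:

```shell
#!/bin/sh
# Seed a kernel output directory with the running kernel's configuration.
# seed_config is a hypothetical helper; the second argument exists only so
# the source file can be overridden (default: /proc/config.gz).
seed_config() {
    out_dir="$1"
    src="${2:-/proc/config.gz}"
    if [ ! -r "$src" ]; then
        echo "cannot read $src (is this a running Jetson?)" >&2
        return 1
    fi
    mkdir -p "$out_dir" || return 1
    # gunzip -c writes the decompressed text to stdout; the source is untouched.
    gunzip -c "$src" > "$out_dir/.config"
}

# Example (the output path is an assumption; use your own build location):
# seed_config "$HOME/kernel_out"
```

After this, CONFIG_LOCALVERSION still has to be set through a config editor such as nconfig or menuconfig.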

Once you have that basic configuration set up you can use the make target of menuconfig or nconfig to find and update whatever it is you wish to change. Do not edit the .config or defconfig/tegra_defconfig directly since there are dependencies and changing something without tracking dependencies will break the build in many cases (you could get lucky though). I like nconfig over menuconfig due to the symbol search function (every module or driver has a symbol used to name it; CONFIG_LOCALVERSION is an example of a symbol).

Regarding CONFIG_LOCALVERSION: If you add a module with the m keystroke in the editor, and if you start with the configuration method I gave above, then your old kernel will accept this module with no need to flash or add a full kernel. As soon as CONFIG_LOCALVERSION changes, or when a feature which is integrated (using the y key to say yes to a feature is a way to add a feature, but it is not a module) changes, then you probably need to change CONFIG_LOCALVERSION and install a new kernel Image as well. If you install a new kernel Image, always leave the old one present and create a backup boot entry to this in /boot/extlinux/extlinux.conf for safety in case the new kernel fails.

Thank you for your reply. Is this method also applicable on the host side? Because I need to make a bsp package to flash a batch of jetson orin nx 16GB.

My current kernel version is 5.15.148-tegra.

Today, I tried to add the wifi driver module in this way according to you, and I did it on my linux host.

I used menuconfig to add the Intel driver module. Kernel compilation and installation were normal, but flashing failed with the error “Error flashing qspi”.
Here is the detailed flashing log:
wifi_flash.log (321.5 KB)

I found these messages in the log before it produced an error:

mtd_debug: error!: open()

I used menuconfig to add the following configuration items to the default defconfig, which I added with reference to linux35.6 defconfig.


This is my “.config” file:
defconfig.txt (255.9 KB)

I don’t know if it was my misconfiguration that caused the flash failure, but I really can’t figure out why it affected qspi.

I urgently want to solve this problem, because it has been bothering me for many days. I hope you can tell me the correct wifi driver configuration item, and I also hope you can tell me why the current configuration will cause qspi errors. Thank you for your trouble.

FYI, host side cross compile is the same, except you have to set up the export ARCH=arm64, and name the cross tool chain (usually “export CROSS_COMPILE=/usr/bin/aarch64-linux-gnu-”). If used or mentioned, the “/proc/config.gz” of course comes from a running Jetson. The real difference is whether or not you are going to apply this directly to the Jetson via flash, versus installing only modules, versus install modules and the kernel Image; note that if you build Image, then you almost always have to also build every module and install every module in a new location with a new CONFIG_LOCALVERSION, whereas if you just install new modules, and if those modules matched the existing running kernel config at the start, then it is a simple file copy.
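A small sketch of that host-side setup, assuming the cross compiler is installed under the prefix mentioned above (adjust the prefix if yours lives elsewhere):

```shell
#!/bin/sh
# Environment for cross compiling an arm64 kernel on an x86_64 host.
# The toolchain prefix below is the one mentioned in this thread; adjust it
# if your cross compiler is installed elsewhere.
export ARCH=arm64
export CROSS_COMPILE=/usr/bin/aarch64-linux-gnu-

# Warn early if the compiler named by the prefix is missing.
if [ -x "${CROSS_COMPILE}gcc" ]; then
    echo "cross compiler found: ${CROSS_COMPILE}gcc"
else
    echo "warning: ${CROSS_COMPILE}gcc not found" >&2
fi
```

These exports only affect the current shell, so they need to be in place in the same terminal that later runs the kernel make targets.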

Regardless of whether you are flashing this or directly copying it, it is usually useful to test the new modules and/or new Image plus modules before putting it in the flash content.

The screenshot is a bit confusing. Some questions which need answers kept together in one place:

  • Was the defconfig edited directly, or is it the default/original? It is a bad idea to edit defconfig directly, but if it is done correctly, then it will work. Even so, I’d just create a copy of defconfig and put the modified version in with a similar name, e.g., “custom_defconfig”.
  • What is the output of “uname -r” on your running system? If you are building modules, but not the kernel itself, then the suffix would likely be “-tegra”. As soon as you change any integrated (“=y” in config) feature you likely must also change that suffix via changing CONFIG_LOCALVERSION from “-tegra” to something else, e.g., “-custom”.
  • Are you using the exact source code version release of kernel source on the default running system and whatever you are compiling? Normally they need to be exact matches.
  • I don’t know if the posted defconfig has changes or not, it is better to name what symbols were altered, and if the altering was done via a config editor such as menuconfig or nconfig (rarely is it a good idea to directly edit a defconfig or .config due to dependencies). What symbols did you change in the defconfig, or after loading the defconfig and before kernel build?

For reference, L4T is what actually gets flashed, and is what one calls Ubuntu after it has the NVIDIA content added to it. JetPack/SDK Manager is just flash software for putting that content on the Jetson.

I did not yet look at the log, I am about to, but before doing so I want to mention that the NX models of developer kits don’t have eMMC, but they do have QSPI memory. That QSPI memory is used for boot content along with the “equivalent” of what a BIOS would do. Basically, when QSPI is flashed, you are flashing both a new BIOS and a boot chain. The QSPI content must be from the same major version release of L4T. For example, you can flash QSPI once with anything R36.x and use any rootfs from any R36.x release. If you have R35.x on QSPI, then boot to an R36.x rootfs would fail. On SD card models which do not have eMMC (dev kits) the SD card contains only the rootfs.

  • Is this a dev kit? A third party carrier board changes everything since it also means switching from QSPI flash to eMMC flash.

I’m not an expert with the SDKM logs, but failing to read the rcm_state could mean an invalid combination of hardware and software (maybe someone from NVIDIA could explain specifically what rcm_state failing could mean). This comes back to the question of starting with these two questions:

  • What is the exact hardware, e.g., if it is truly a dev kit, or if it is a third party carrier board and module with eMMC?
  • Is your host PC a VM? A VM will often fail similar to this due to losing USB (a VM’s pass through must reacquire USB after it is lost, but often it doesn’t do this correctly; Jetsons disconnect and reconnect during a flash).

My carrier board is custom, but it is the same as the official carrier board. The “rcm_state” error is always there; it does not affect my flashing.

After trying according to what you said, I found a problem: as long as the new configuration item in “.config” contains “y”, it will appear “mtd_debug: error! : open()”, followed by “Error flashing qspi”.

I then set “CONFIG_LOCALVERSION” to “-custom” as you suggested, and a “5.15.148-custom” directory was added after executing “make install”. I compared “5.15.148-custom” to the original “5.15.148-tegra” and found that I was missing the “updates” folder. The “updates” folder in 5.15.148-tegra is generated by running “apply_binaries.sh”.

The “updates” folder contains the following:
[image: contents of the “updates” folder]

From my tests, whenever I added a “y” configuration item, it would cause my flash to fail. If all the new additions are m, the flash is a success. I’m confused.

Is there a problem with my process? I need to add the Wi-Fi driver, which has both “y” and “m” configuration items. I am not clear about the specific configuration items and dependencies of the Wi-Fi driver, and I hope you can tell me how to configure it successfully. Before that, though, I would like to know why I cannot flash when a new configuration item is set to “y”. I hope you can help me solve these problems. Thank you very much.

At the same time, I still have a question, why the 36.4 version does not retain the wifi function of the 35 version? I think this is very unfriendly to users who need this feature.

Someone who knows the details of rcm_state would have to answer that. Can someone from NVIDIA comment on the exact cause of rcm_state failing during a flash? Maybe @WayneWWW or @KevinFFF.

Note that no kernel module change or failure would ever result in the rcm_state failure, but some of the configuration might cause a failure if the target is incorrect. This needs to be solved prior to most talk of adding a kernel and/or module.

We still need to know if the host PC is a VM, because this is notorious for similar flash failures.


Assuming this truly is an exact electrical layout replica to the dev kit carrier board there is still a remaining question: Is the module on this the eMMC model or is it the model of module which has the SD card on the module itself? This latter would have required purchasing a dev kit and removing the module to put it on the different carrier board since dev kit modules are not sold separately. This would also imply no SD card slot on the carrier board itself. Is this truly a carrier board which is an exact match to the dev kit? This would be very useful in debugging, but a module which is an eMMC model would still change some things.

If the module is an eMMC (commercial, not dev kit) type, then flashing the QSPI would be the incorrect way to flash. If the module is an SD card dev kit model, then the QSPI target should succeed.

I don’t know if this will matter, but I’ll point out something related to flash and various models of Jetson in case it is a configuration issue. Go to the Linux_for_Tegra/ directory and run the command “ls -l *.conf”. You will find the “human” named files are all symbolic links to files which are named after the combination of a module model and a carrier board model. In addition, when you read any of these files (they are plain text) you will find they have “include” statements which insert other configuration files. One of the files will be specific to only the module, and the other file will be specific only to the carrier board model.

The human readable “jetson-*.conf” symbolic links point to the technical name of the model instead of the common name; model designations are for the model of module plus model of carrier board plus perhaps revision number.

Module config files can be reused in several “combination” configuration files. The same is true for carrier board config files: This subset of config is modular and can be reused with different modules. It is true that some modules and carrier boards are not valid combinations, but if you examine how those files are combined, then you can build your own combination out of this.

When one flashes on the command line, the .conf file name with the .conf suffix removed is what the flash script uses as a target. For example, “ls -l jetson*.conf” might show the file “jetson-agx-orin-devkit.conf”, which means “jetson-agx-orin-devkit” is a valid flash target for the flash.sh command line tool. Does a similar file exist which is valid for your carrier board and your module type as a combined .conf file?
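To see every candidate target name at once, something like the following sketch could be used. The helper name list_flash_targets is hypothetical; it simply strips the .conf suffix from each file in the directory it is given:

```shell
#!/bin/sh
# Print the flash target names available in a Linux_for_Tegra directory by
# stripping the ".conf" suffix from each configuration file name.
# list_flash_targets is a hypothetical helper name.
list_flash_targets() {
    l4t_dir="${1:-.}"
    for conf in "$l4t_dir"/*.conf; do
        [ -e "$conf" ] || continue   # skip when nothing matches the glob
        basename "$conf" .conf
    done
}

# Example: run from inside Linux_for_Tegra/
# list_flash_targets .
```

Not every name printed is necessarily meant to be a flash target (some .conf files are only included by others), so treat the list as a starting point.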

Next, are you using an initrd? The initrd flash has some similarities, but procedures for flashing with an external device changes flash procedures. The flash target would still need to be defined for your hardware combination.

Hello, although the error of “rcm_state” was reported, it did not affect my flashing. I do not care about this error. What I am concerned about now is why adding the driver whose configuration item is “y” will cause my flashing failure, and why will it affect mtd?

The current host is ubuntu22.04 instead of using a VM. The carrier uses NVME as the storage media.

I suspect that the driver is integrated into the kernel Image because a configuration item was set to “y”, but I don’t know why integrating the driver into the kernel would affect my flashing. I hope you can cooperate with other technical staff to help me solve this problem, thank you very much.

Hello, I don’t want to worry about these problems now. I have no time to find out which of my modifications caused this series of problems. I am sorry to have troubled you all this time.

I only have one request now, I need to add intel wifi driver to the BSP package of version 36.4, and then it can be flashed into my jetson device normally. You just need to tell me this series of procedures, and I will follow the procedures you taught me.

I hope you or other engineers can help me solve this only demand, thank you very much!

I am reading and answering as I go; for an actual answer you could skip to the end.

When you change a feature to be “=y”, or when you change CONFIG_LOCALVERSION, then you have invalidated all loaded modules. Sometimes they will work, but if you change an “=y”, and install a new kernel Image, you must also build and install 100% of the modules in the new location.

The command “uname -r” comes from the kernel telling you the base software version in the prefix, and the CONFIG_LOCALVERSION in the suffix. Your “uname -r” of “5.15.148-custom” says this is where that particular kernel Image is looking for modules:
/lib/modules/5.15.148-custom/kernel/

Had this still been “-tegra”, then the location would have been:
/lib/modules/5.15.148-tegra/kernel/
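A quick way to check which module directory a given release string maps to, and whether it exists on the current system, might look like this (modules_dir is a hypothetical helper name):

```shell
#!/bin/sh
# Print the directory a kernel with the given release string searches for
# its modules. modules_dir is a hypothetical helper name.
modules_dir() {
    printf '/lib/modules/%s/kernel\n' "$1"
}

# Check the directory for the kernel that is running right now.
dir="$(modules_dir "$(uname -r)")"
if [ -d "$dir" ]; then
    echo "modules for the running kernel: $dir"
else
    echo "no module directory at $dir" >&2
fi
```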

It is good that you used a different location because the original kernel can still use the 5.15.148-tegra content as a backup. Had you changed an “=y” feature, and not changed CONFIG_LOCALVERSION, the chances are that the new kernel would have errors and refuse to load modules from the original kernel. You would have had to have destroyed the original kernel’s modules, and you never want to do that until you are certain you don’t need the original (and often it is desirable to keep the original handy anyway in case of something like an OTA update problem).

Did you build and install all modules with the new kernel Image-custom?

Note that on some kernels there is out of tree content available. That content is only accessed if particular kernel configurations are used, so it doesn’t always matter. However, some of the NVIDIA-specific content might need this. When unpacking the source code you will typically see these tarball packages inside of the master public sources package (I’m showing the command to extract just those packages; the kernel_src.tbz2 is the public kernel source, the others might be present for out of tree content):

  • tar xvfj public_sources.tbz2 Linux_for_Tegra/source/kernel_src.tbz2
  • tar xvfj public_sources.tbz2 Linux_for_Tegra/source/kernel_oot_modules_src.tbz2
  • tar xvfj public_sources.tbz2 Linux_for_Tegra/source/nvidia_kernel_display_driver_source.tbz2

For example, it is likely that nvgpu.ko, if it is configured in the kernel build, would require nvidia_kernel_display_driver_source.tbz2 to be unpacked (you’d unpack each of those three packages after extracting them from the main public sources tarball).

It is also possible that the nvgpu.ko from the old build has been built such that it can load into the new -custom build without rebuild. Copy it over, monitor “dmesg --follow”, and see if it shows up as an error from “sudo depmod -a”. If no error, reboot. If an error exists, then remove that module from the -custom tree (building that module against the -custom configuration would then be required).

From an earlier post, I’m adding this comment (plus you are asking about wifi now; I think this applies to solving wifi):
No particular driver is allowed simultaneously as both =y (integrated) and =m (modular). This would be an error. If you check the output of “zcat /proc/config.gz”, and you see the driver as “=y”, then you should delete or remove the driver in the -custom area which is a module format. When you have the “=y” you cannot use “module procedures” to change anything related to that driver.

You would have to ask someone else about adding the kernel and modules via flash. The base documentation does have that in it though. You’d go here for the documentation specific to L4T R36.4:
https://developer.nvidia.com/embedded/jetson-linux-r3640

This would be under kernel customization, but you could skip the part about building the kernel with cross compile (you’ve already done this with native compile, but the procedure for putting this in the flash software is the same).

Hello, what I want is the implementation process of adding a new wifi driver. I cannot practice adding this wifi driver according to your reply.

I am following the 36.4 kernel customization document, but when I run make -C kernel (per the instructions) in the Linux_for_Tegra/source/ directory I can’t seem to get it to run with a custom menuconfig. Where do I put the config from /proc/config.gz, and where do I run the compile commands from?

@19179921356

There is no reason you can’t add a new driver unless you already have that driver. This part of the reply states that you can’t have a driver as both a module and integrated. Once you have the feature, you have it, and the format does not matter. It’s just that you cannot load a module if you answered with the ‘y’ key during configuration. Both the ‘m’ key and the ‘y’ key enable the feature, but it does so in different ways (ways which conflict if you try to do both).

However, what is the exact driver you want? Every driver has a “symbol”. The earlier mentioned CONFIG_LOCALVERSION is a symbol. When you edit with something like “make nconfig” or “make menuconfig” (both edit the same configuration; nconfig has a convenient symbol search, and menuconfig can also search with the “/” key), the description tells you a bit about that feature or driver, along with the symbol. The symbol will start with CONFIG_.

If you have a running Jetson, and you want to know every symbol that kernel knows about, and the current configuration, try this (I’m filtering for “wifi”):
zcat /proc/config.gz | grep -i 'wifi'

I’m guessing you need CONFIG_IWLWIFI. If you see this, then you have the feature as a module:
CONFIG_IWLWIFI=m

If you have this, then there is not, and cannot be, a module because the feature is permanently built into the kernel:
CONFIG_IWLWIFI=y

If you have this, then you do not have the feature, but it is possible to build and install it as a module:
# CONFIG_IWLWIFI is not set
(the line is a comment; this is how the kernel config records a symbol which was not enabled during compile)
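The three cases can be told apart with a short check against a decompressed config file. This is only a sketch: symbol_state is a hypothetical helper, and it relies on the kernel config convention that a disabled symbol appears as a comment line ending in “is not set”:

```shell
#!/bin/sh
# Report how a symbol is set in a plain-text kernel config.
# symbol_state is a hypothetical helper. Usage: symbol_state SYMBOL FILE
symbol_state() {
    sym="$1"
    file="$2"
    if grep -q "^${sym}=y\$" "$file"; then
        echo "built-in"
    elif grep -q "^${sym}=m\$" "$file"; then
        echo "module"
    elif grep -q "^# ${sym} is not set\$" "$file"; then
        echo "disabled"
    else
        echo "unknown"   # symbol absent, e.g. its dependencies are not met
    fi
}

# On a running Jetson, decompress the config first, for example:
# zcat /proc/config.gz > /tmp/running.config
# symbol_state CONFIG_IWLWIFI /tmp/running.config
```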

Wi-Fi is not needed during boot. Thus, you have a simple file copy if you want to install a Wi-Fi module (you do of course have to create the file first), and no initrd is involved (sometimes modules have to be duplicated into an initrd, but only for modules required to boot).

You cannot build a module file without configuring the kernel source to match the running kernel; the module would likely fail to load (there are exceptions). The steps you see are to configure an entire kernel source tree to match the running kernel. Then you use something like nconfig or menuconfig (again, I recommend nconfig because of its search, e.g., search for iwlwifi) to add new module features. As soon as you add an “=y” which was not there before, you’ve probably lost the ability to reuse the other older modules (then install becomes more involved).

Here is a summary:

  • You must know the symbol. You’ll either see this in the description in an editor such as nconfig or menuconfig, or else perhaps by browsing through “zcat /proc/config.gz | grep -i 'wifi'”.
  • If the feature is enabled in the editor with the y key, then you already have the feature and there is no module build step available (you cannot install the same driver twice).
  • If the feature is enabled in the editor with the m key, then this enables as a module. At that point the feature/symbol/driver can be loaded or unloaded dynamically. Adding a module feature means older modules remain valid without installing an entire new kernel and set of modules.

Nothing says you cannot implement this. It just says there are conditions and requirements, that there is a “symbol”, and that if you build as a module using the original kernel’s config as a starting point, then installation is much easier.

@rachase

You should probably start a new thread. However, what do you mean by a custom menuconfig? When you configure with “make -C kernel”, you need to name the target; “kernel”, in the context you have given, is a subdirectory named “kernel” which you could reach with “cd kernel”. You use menuconfig, e.g., “make -C kernel menuconfig”, to run that editor.

The “-C ...somewhere...” is the location of the full source code. Some make targets of interest:

  • Image (this is the full kernel, but not modules).
  • modules_prepare (this propagates the configuration throughout the source; if the target is Image, then you don’t need this, it is automatic when you make Image).
  • modules (if the kernel source was properly configured, then this builds all modules).
  • menuconfig or nconfig run a configuration editor. Unless you make and save changes with these, then they have no effect. If there is an initial configuration, then these will start with that configuration.

Note that there is an option “O=/some/where”. Example:
make O=/some/temp/location -C /some/kernel/source/is/here nconfig

The above:

  • Expects source code at “/some/kernel/source/is/here/”.
  • Expects this to be an empty directory for temporary output, although the file .config can be there as an initial configuration:
    /some/temp/location/
  • Opens the nconfig editor where you can make changes. This is an example of a build target. Image, modules, and modules_prepare are other targets. Sometimes the order of target builds will matter.
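The targets and the O= option above could be tied together in a small wrapper, sketched here with the same placeholder paths. The name kmake and the DRY_RUN switch are assumptions for illustration; with DRY_RUN=1 the command is only printed, which is a safe way to check it before building:

```shell
#!/bin/sh
# Build the make command lines for a kernel build with a separate output
# directory. SRC and OUT are placeholder paths; kmake is a hypothetical
# wrapper. With DRY_RUN=1 the command is printed instead of executed.
SRC="${SRC:-/some/kernel/source/is/here}"
OUT="${OUT:-/some/temp/location}"

kmake() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "make O=$OUT -C $SRC $*"
    else
        make O="$OUT" -C "$SRC" "$@"
    fi
}

# One possible order for an R36.x source tree (uncomment to run):
# kmake defconfig         # create the initial .config in $OUT
# kmake nconfig           # edit; set CONFIG_LOCALVERSION and new symbols
# kmake modules_prepare   # propagate the configuration
# kmake modules           # build all configured modules
```

When cross compiling, ARCH and CROSS_COMPILE would need to be exported before calling the wrapper.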

The file .config at an output location is the “master” configuration. This can be edited with nconfig or menuconfig targets. You don’t start with nconfig or menuconfig, you instead start with something like defconfig or some other options (those options depend on some software versions). In L4T R36.x the usual place to start is with defconfig, and then use nconfig or menuconfig to set CONFIG_LOCALVERSION; if you are reusing this kernel, and only adding modules, then typically this would be -tegra. Type the command “uname -r”; the suffix is the current CONFIG_LOCALVERSION.
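To confirm what a given .config will produce as a suffix, the string can be read back out of the file and compared with “uname -r” on the Jetson. A minimal sketch, where get_localversion is a hypothetical helper and the path in the comment is a placeholder:

```shell
#!/bin/sh
# Print the CONFIG_LOCALVERSION string stored in a .config file.
# get_localversion is a hypothetical helper. Usage: get_localversion FILE
get_localversion() {
    sed -n 's/^CONFIG_LOCALVERSION="\(.*\)"$/\1/p' "$1"
}

# Compare against the running kernel's suffix (the path is a placeholder):
# get_localversion /some/temp/location/.config
# uname -r
```

This only reads the file; changing the value is still best done through nconfig or menuconfig.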

Can a mod move my comment and your reply over to this thread? (Sorry to hijack this one.) Compiling a custom kernel on 36.3, missing some details

Ok I figured out the config part, thank you. Will continue discussion in new thread.
