I am using the manual sources.
Building with nvbuild.sh works and produces a kernel.
Instead of creating a kernel supplements tar as described in step 7, I install the modules as follows:
This all works perfectly fine. We added custom kernel modules and everything runs properly.
Since we noticed that a lot of unnecessary stuff is included in the kernel config, I disabled, for example, the “Network Device Support → Wireless LAN” option in the kernel. Everything else is kept the same.
As soon as I run the generated kernel and modules on the device I get the following warnings from nvgpu.ko:
[ 10.174511] nvidia: loading out-of-tree module taints kernel.
[ 10.177739] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[ 10.178623] nvidia: disagrees about version of symbol nvhost_get_default_device
[ 10.178784] nvidia: Unknown symbol nvhost_get_default_device (err -22)
[ 10.178974] nvidia: disagrees about version of symbol fget
[ 10.179093] nvidia: Unknown symbol fget (err -22)
[ 10.179252] nvidia: disagrees about version of symbol fd_install
[ 10.179397] nvidia: Unknown symbol fd_install (err -22)
[ 10.179674] nvidia: disagrees about version of symbol wake_up_process
[ 10.179815] nvidia: Unknown symbol wake_up_process (err -22)
[ 10.180092] nvidia: disagrees about version of symbol iterate_fd
[ 10.180257] nvidia: Unknown symbol iterate_fd (err -22)
[ 10.180563] nvidia: disagrees about version of symbol __close_fd
[ 10.180745] nvidia: Unknown symbol __close_fd (err -22)
[ 10.181399] nvidia: disagrees about version of symbol nvhost_syncpt_unit_interface_get_aperture
[ 10.181644] nvidia: Unknown symbol nvhost_syncpt_unit_interface_get_aperture (err -22)
Is there anything special about the build of nvgpu.ko that it does not get informed about my kernel config changes?
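For narrowing this down, it may help to pull the affected symbol names out of the log first. As far as I understand, "disagrees about version of symbol" is what the kernel prints when the symbol CRC recorded in a module (CONFIG_MODVERSIONS) differs from the CRC the running kernel exports for that symbol. The following is only a minimal sketch; `mismatched_symbols` is a hypothetical helper name, not part of any NVIDIA tooling.

```shell
# Hypothetical helper: read kernel log text on stdin and print each symbol
# named in a "disagrees about version of symbol" line, once, sorted.
mismatched_symbols() {
    sed -n 's/.*disagrees about version of symbol \([A-Za-z0-9_]*\).*/\1/p' | sort -u
}
```

On the device this could be used as `dmesg | mismatched_symbols`. For any symbol it prints, for example `fget`, the CRC in `Module.symvers` in the kernel build output directory (the directory passed as `O=`) can then be compared with the CRC the module expects, which `modprobe --dump-modversions <path to the .ko file>` should list.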
Yes, it seems to be rebuilt. The date changes, and I noticed that point 5 of the guide mentions this:
Replace Linux_for_Tegra/rootfs/usr/lib/modules/$(uname -r)/kernel/drivers/gpu/nvgpu/nvgpu.ko with a copy of this file:
$kernel_out/drivers/gpu/nvgpu/nvgpu.ko
My script initially had a final step to copy it, but when the file still turned out to be 245MB I noticed that the module install step:
make -C $SOURCE_DIR/kernel/kernel-* INSTALL_MOD_STRIP=1 LOCALVERSION="-tegra" ARCH=arm64 O=$KERNEL_DIR modules_install INSTALL_MOD_PATH=$ROOT_DIR/usr
…copies it too, so I tested with copying and without copying, and also without the INSTALL_MOD_STRIP=1 option. No matter what I do, as soon as I remove the wifi modules the nvgpu.ko complains that there are missing symbols.
Unfortunately still the same result with the plain kernel from the public sources.
I entirely removed the output and source directory for the build to make sure that there are no leftovers.
Here is the part of the script that is responsible for the build; I hope you can figure out the variables:
To answer your previous question: plain kernel means no patches, just the NVIDIA source code, but with my config. I already mentioned that with the default config everything is fine, but as soon as the kernel config is modified the issue appears. But it will be clearer now:
I made an error during my initial analysis.
Since I did not know which kernel module the “nvidia” messages belonged to, I deleted nvgpu.ko because it describes itself as nvidia. That was a mistake: the module is actually called “nvidia.ko”.
It is placed here:
Linux_for_Tegra/rootfs/usr/lib/modules/5.10.104-tegra/extra/opensrc-disp/nvidia.ko
I do not have that module anywhere in my build folder, only in the rootfs. I assume it gets created by apply_binaries.sh? It is already present in my minimal rootfs, which is derived from Ubuntu base + apply_binaries.sh.
Of course, none of the modules in that folder match the kernel I built, since nvbuild.sh does not build them.
Again: Everything works as long as I do not edit the kernel config.
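To spot other modules in the same situation, one option is to list every installed .ko in the rootfs that has no same-named counterpart in the kernel build output. This is only a rough sketch: `modules_not_built` is a hypothetical helper, and matching by file name is an assumption that says nothing about whether a module that does exist in both places is actually up to date.

```shell
# Hypothetical helper: print each .ko under the installed modules directory ($2)
# for which no file of the same name exists under the kernel build output ($1).
modules_not_built() {
    build_dir=$1
    installed_dir=$2
    find "$installed_dir" -type f -name '*.ko' | sort | while IFS= read -r ko; do
        name=$(basename "$ko")
        # An empty result means the build tree never produced this module.
        if [ -z "$(find "$build_dir" -type f -name "$name" | head -n 1)" ]; then
            printf '%s\n' "${ko#"$installed_dir"/}"
        fi
    done
}
```

With the paths from this thread it could be called as `modules_not_built "$KERNEL_DIR" Linux_for_Tegra/rootfs/usr/lib/modules/5.10.104-tegra`, which in my setup would be expected to print at least `extra/opensrc-disp/nvidia.ko`.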
Seems we are back to this older question from the DP 5.0.1:
We already identified the issue. My rootfs includes the extra folder from L4T, but my kernel does not match it.
As long as I use the wrong nvidia.ko I’ll have that issue.
If the source for nvidia.ko is included in the public sources, how can I build the modules in the extra folder? nvbuild.sh doesn’t build them.
Have you identified which line you added in kernel defconfig is causing the problem?
Please follow the guidance here. This part is missing from the documentation; we will add it back.
The toolchain version may need correction.
The NVIDIA-kernel-module-source-<Version>.tar.xz source code is supplied as a .xz file. Untar the file. Its location is <NV_WORKSPACE>/drive-linux_src/.
Export the required variables as per the Linux kernel compilation steps: