During the kernel compile, was the option “CONFIG_LOCALVERSION” set to “-tegra”? If not, the kernel won’t be able to find its modules. The output of “uname -r” is the base kernel version plus CONFIG_LOCALVERSION, and modules are searched for at: /lib/modules/$(uname -r)/kernel
If you connect via ssh or serial console or non-GUI console, can you see the output of “uname -r”?
I have not tried that build script, so I don’t know for sure, but it might set CONFIG_LOCALVERSION. Do you have any access to the command line with that kernel, and if so, what does it say for the output of “uname -r”? Additionally, what do you see from “ls /lib/modules/$(uname -r)/kernel”?
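In case it helps, here is a rough sketch that runs those checks in one go. It assumes a POSIX shell on the Jetson. The name module_dir_for is just a helper I made up, and the /proc/config.gz part only works if the kernel was built with CONFIG_IKCONFIG_PROC, which I am assuming rather than know for this build.

```shell
#!/bin/sh
# Sketch: report the running kernel release and whether its module tree exists.
# module_dir_for is a hypothetical helper name, not part of any NVIDIA tooling.
module_dir_for() {
    # $1: kernel release string, e.g. the output of "uname -r"
    printf '/lib/modules/%s/kernel\n' "$1"
}

release="$(uname -r)"
dir="$(module_dir_for "$release")"
echo "kernel release: $release"

if [ -d "$dir" ]; then
    echo "module directory found: $dir"
else
    echo "module directory MISSING: $dir"
fi

# Assumption: the kernel exposes its config at /proc/config.gz
# (requires CONFIG_IKCONFIG_PROC). If not, this part is skipped.
if [ -r /proc/config.gz ]; then
    zcat /proc/config.gz | grep '^CONFIG_LOCALVERSION=' \
        || echo "CONFIG_LOCALVERSION not found in running config"
fi
```

If the release string does not end in the expected suffix, or the directory is reported missing, that points at the CONFIG_LOCALVERSION mismatch described earlier.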
Looks like there is no issue with CONFIG_LOCALVERSION. I can’t say for sure whether using the R34.1.1 source instead of the R34.0.1 source is an issue, but it is possible. Once the display should have started but has failed, can you provide a copy of the following:
Output of “dmesg”. Example to create a log file of this: dmesg 2>&1 | tee log_dmesg.txt
Log file “/var/log/Xorg.0.log”.
If you have ssh access you should be able to get a copy of that to a different host PC. The logs will likely provide information on whether a module failed to load due to a kernel difference, or for some other reason.
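A minimal sketch for gathering both logs into one archive before copying it off the Jetson. The function name collect_logs and the output directory are my own placeholders; the log file names are the ones mentioned above.

```shell
#!/bin/sh
# collect_logs is a hypothetical helper name; adjust paths as needed.
collect_logs() {
    out_dir="$1"
    mkdir -p "$out_dir" || return 1

    # Kernel ring buffer; may need root if dmesg access is restricted.
    dmesg > "$out_dir/log_dmesg.txt" 2>&1 \
        || echo "dmesg failed, try running with sudo" >&2

    # X server log, if the X server got far enough to write one.
    if [ -r /var/log/Xorg.0.log ]; then
        cp /var/log/Xorg.0.log "$out_dir/"
    fi

    # Bundle everything into a single archive next to the directory.
    tar -czf "$out_dir.tar.gz" -C "$(dirname "$out_dir")" "$(basename "$out_dir")"
}
```

Running collect_logs "$HOME/jetson_logs" on the Jetson leaves jetson_logs.tar.gz in the home directory. From the host PC, something like scp ubuntu@jetson.local:jetson_logs.tar.gz . would then fetch it, where the user and host name are placeholders for whatever your setup uses.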
This is basically just repeating that there is a “version” issue, though I couldn’t say exactly where. A subset of the dmesg log:
[ 9.984872] nvgpu: Unknown symbol nvlink_register_link (err -2)
[ 9.985145] nvgpu: Unknown symbol nvlink_unregister_device (err -2)
[ 9.985379] nvgpu: Unknown symbol nvlink_unregister_link (err -2)
[ 9.986353] nvgpu: Unknown symbol nvlink_enumerate (err -2)
[ 9.986695] systemd-journald[297]: Received client request to flush runtime journal.
[ 9.991819] nvgpu: Unknown symbol nvlink_transition_intranode_conn_off_to_safe (err -2)
[ 9.991943] nvgpu: Unknown symbol nvlink_register_device (err -2)
[ 10.014015] nvgpu: Unknown symbol nvlink_train_intranode_conn_safe_to_hs (err -2)
[ 10.298572] nvgpu: Unknown symbol nvlink_shutdown (err -2)
[ 10.298770] nvgpu: Unknown symbol nvlink_register_link (err -2)
[ 10.299079] nvgpu: Unknown symbol nvlink_unregister_device (err -2)
[ 10.299353] nvgpu: Unknown symbol nvlink_unregister_link (err -2)
[ 10.299803] nvgpu: Unknown symbol nvlink_enumerate (err -2)
[ 10.299971] nvgpu: Unknown symbol nvlink_transition_intranode_conn_off_to_safe (err -2)
[ 10.300329] nvgpu: Unknown symbol nvlink_register_device (err -2)
[ 10.300557] nvgpu: Unknown symbol nvlink_train_intranode_conn_safe_to_hs (err -2)
There were lots of symbol errors, not just those above, but those were the ones for the GPU. A “symbol” is essentially the name by which a function (or variable) is looked up when a module is loaded. When you configure a kernel and build it you are essentially selecting a series of symbols (think of each CONFIG_ item of a configuration as a bookmark into a group of symbols). An “unknown” symbol is one which is missing: the module asked for it, and nothing currently loaded provides it.
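Since the same symbols repeat many times in the log, it can help to boil a saved dmesg log down to the unique module and symbol pairs. This is only a sketch: unknown_symbols is a name I made up, and it assumes the log lines look like the excerpt above.

```shell
#!/bin/sh
# unknown_symbols is a hypothetical helper name.
# $1: path to a saved dmesg log (e.g. log_dmesg.txt)
# Prints "module symbol" pairs, one per line, de-duplicated and sorted.
unknown_symbols() {
    sed -n 's/^.*\] \([^ :]*\): Unknown symbol \([^ ]*\) (err .*$/\1 \2/p' "$1" \
        | sort -u
}

# Example use, if the log was saved under the name suggested earlier.
if [ -f log_dmesg.txt ]; then
    unknown_symbols log_dmesg.txt
fi
```

The resulting list is a quick way to see which modules are affected and which symbol families (here everything starting with nvlink_) are not being provided.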
“Missing” symbols can be provided either by being compiled into the kernel Image or by a module. Keep in mind that if a module is missing, or is in some way incompatible with that kernel Image, then its symbols are still missing even if you built the module. There is a good chance that something is wrong with your self-compiled kernel’s configuration or with the installation of modules. The latter could be invalid reuse of old modules, new modules not being installed, or modules being present without the kernel knowing about them, in which case “sudo depmod -a” is needed.
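One way to check whether the running kernel currently provides a given symbol is to look it up in /proc/kallsyms. The sketch below does that for two of the symbols from the log. The helper name symbol_in_kallsyms is hypothetical, and it assumes /proc/kallsyms is readable on your system.

```shell
#!/bin/sh
# symbol_in_kallsyms is a hypothetical helper name.
# $1: symbol name
# $2: symbol table file (optional, defaults to /proc/kallsyms)
# Returns success if the symbol name appears in the third column.
symbol_in_kallsyms() {
    awk -v sym="$1" '$3 == sym { found = 1 } END { exit !found }' \
        "${2:-/proc/kallsyms}"
}

# Check a couple of the symbols nvgpu complained about.
for sym in nvlink_register_link nvlink_enumerate; do
    if symbol_in_kallsyms "$sym"; then
        echo "$sym: provided by the running kernel or a loaded module"
    else
        echo "$sym: not currently provided"
    fi
done
```

If the symbols are not provided, the next things to look at are whether a module that should supply them exists anywhere under /lib/modules/$(uname -r), and whether running sudo depmod -a followed by a reboot changes the result.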
These modules are installed when the apply_binaries.sh script is run.
They are located inside this package:
./kernel/nvidia-l4t-display-kernel_5.10.65-tegra-34.1.1-20220516211757_arm64.deb
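To see which kernel modules that package actually contains, its file listing can be filtered for .ko files. This is a sketch that assumes dpkg-deb is available on the host and that it is run from the directory the path above is relative to. The helper name ko_paths is made up.

```shell
#!/bin/sh
# ko_paths is a hypothetical helper name.
# Reads a "dpkg-deb -c" style listing on stdin and prints the paths ending in .ko
ko_paths() {
    awk '$NF ~ /\.ko$/ { print $NF }'
}

deb="./kernel/nvidia-l4t-display-kernel_5.10.65-tegra-34.1.1-20220516211757_arm64.deb"

if [ -f "$deb" ]; then
    # dpkg-deb -c lists the package contents without installing anything.
    dpkg-deb -c "$deb" | ko_paths
else
    echo "package not found: $deb" >&2
fi
```

Comparing that list against what is present under /lib/modules/$(uname -r) on the Jetson shows whether the modules from the package ever made it onto the root filesystem.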