Please share the output of “lsmod” and “uname -r”. If lsmod is empty, it is simply because the kernel you built does not match the folder name under /lib/modules.
But I do not think it is a mismatch between the ko files and the Image, because I can manually insmod nvgpu.ko without any problem.
From the status of the service, it seems that the errors are due to some missing files.
● nvpmodel.service - nvpmodel service
Mar 31 09:53:35 gb-nano nvpmodel[4849]: NVPM ERROR: failed to write PARAM GPU_POWER_CONTROL_ENABLE: ARG GPU_PWR_CNTL_EN: PATH: /sys/devices/gpu.0/power/control VAL: on
Mar 31 09:53:35 gb-nano nvpmodel[4849]: NVPM ERROR: Error opening /sys/devices/gpu.0/devfreq/57000000.gpu/available_frequencies: 2
Mar 31 09:53:35 gb-nano nvpmodel[4849]: NVPM ERROR: failed to read PARAM GPU: ARG FREQ_TABLE: PATH /sys/devices/gpu.0/devfreq/57000000.gpu/available_frequencies
Mar 31 09:53:35 gb-nano nvpmodel[4849]: NVPM ERROR: Error opening /sys/devices/gpu.0/power/control: 2
Mar 31 09:53:35 gb-nano nvpmodel[4849]: NVPM ERROR: failed to write PARAM GPU_POWER_CONTROL_DISABLE: ARG GPU_PWR_CNTL_DIS: PATH: /sys/devices/gpu.0/power/control VAL: auto
Mar 31 09:53:35 gb-nano nvpmodel[4849]: NVPM ERROR: failed to set power mode!
Mar 31 09:53:35 gb-nano nvpmodel[4849]: NVPM ERROR: optMask is 2, no request for power mode
Mar 31 09:53:35 gb-nano systemd[1]: nvpmodel.service: Main process exited, code=exited, status=255/n/a
Mar 31 09:53:35 gb-nano systemd[1]: nvpmodel.service: Failed with result 'exit-code'.
Mar 31 09:53:35 gb-nano systemd[1]: Failed to start nvpmodel service.
No, you didn’t get my point. Please be aware that this is not a new issue to me… I have already seen it many times.
No need to check any syslog… they are not related…
When you build the kernel, the output of “uname -r” needs to match the directory name under /lib/modules, i.e. /lib/modules/module_path_name.
Yes, that is the issue… That “+” is the problem here. This is just a string comparison… The two need to match exactly for the ko files in that folder to get loaded automatically…
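To make that string comparison concrete, here is a minimal sketch of a check you could run on the board. The function name modules_dir_matches and its optional second argument (an alternative modules root, only there so the check can be tried outside /lib/modules) are my own invention for illustration, not part of any NVIDIA tool.

```shell
#!/bin/sh
# Sketch: does a kernel release string have an exactly matching module directory?
# modules_dir_matches RELEASE [MODULES_ROOT]
#   RELEASE      - the string printed by `uname -r`
#   MODULES_ROOT - defaults to /lib/modules (hypothetical parameter, for testing)
modules_dir_matches() {
    release=$1
    root=${2:-/lib/modules}

    # An empty release string would trivially "match" the root itself.
    if [ -z "$release" ]; then
        echo "empty release string" >&2
        return 1
    fi

    # Exact name match only: "X" and "X+" are different directories.
    if [ -d "$root/$release" ]; then
        echo "match: $root/$release"
        return 0
    fi

    echo "no directory named '$release' under $root; found instead:"
    ls -1 "$root" 2>/dev/null
    return 1
}

# Usage on the target:
#   modules_dir_matches "$(uname -r)"
```

If the second branch prints a directory that differs from “uname -r” only by a trailing “+”, that is the situation described above.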
Incidentally, that pesky “+” tends to come from some of the non-NVIDIA kernel releases as a way to get people to increment or change the CONFIG_LOCALVERSION. One can disable it via a number of methods, but I’ll recommend editing this file within the kernel source: scripts/setlocalversion
Modify function “scm_version()” as follows to return early:
scm_version()
{
	local short
	short=false
	return    # added line: return early so that no "+" is appended
	# (the rest of the function stays as it is)
The sources which NVIDIA publishes won’t need that. A major point to consider though is that whenever you change a feature which is integrated into the kernel (not in module format), there is a strong chance that the existing modules are no longer compatible, so you’d want to change CONFIG_LOCALVERSION. That way the new kernel looks for its own module directory instead of reusing the old modules.
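As a rough sketch of how CONFIG_LOCALVERSION feeds into the module directory name, the two helpers below read the value from a kernel .config and print the directory the new kernel would be expected to use. The helper names are hypothetical, and this deliberately ignores the other things that can add to the release string (a LOCALVERSION passed to make, localversion* files in the source tree, and the “+” suffix discussed above), so treat the result as a first approximation only.

```shell
#!/bin/sh
# Sketch: derive the expected module directory from a kernel .config.
# Both function names are hypothetical, for illustration only.

# config_localversion CONFIG_FILE
# Prints the value of CONFIG_LOCALVERSION without the surrounding quotes.
# CONFIG_LOCALVERSION_AUTO does not match, because the pattern requires
# ="..." directly after the name.
config_localversion() {
    sed -n 's/^CONFIG_LOCALVERSION="\(.*\)"$/\1/p' "$1"
}

# expected_modules_dir BASE_VERSION CONFIG_FILE
# BASE_VERSION is the plain kernel version, e.g. 4.9.253.
expected_modules_dir() {
    printf '/lib/modules/%s%s\n' "$1" "$(config_localversion "$2")"
}

# Usage, assuming the build output directory holds the .config:
#   expected_modules_dir 4.9.253 /path/to/kernel_out/.config
```

Comparing that output with “uname -r” after booting the new Image is a quick way to confirm the suffix ended up where you intended.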
Also, I recommend adding a second boot entry before the default, making that entry the new default, using a modified Image file name (e.g., Image-4.9.253-modified), and leaving the old content in place until you know it works (and of course you’d need new modules too if you install a modified Image…or at least most of the time).
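For the “leave the old content in place” part, a small sketch of copying the new Image under a different name without ever overwriting what is already there. The function name install_renamed_image is made up for this example, and the paths in the usage comment are assumptions about a typical layout. It only copies the file; the boot entry that points at the new name still has to be added by hand.

```shell
#!/bin/sh
# Sketch: copy a freshly built Image next to the existing one under a new name.
# install_renamed_image SRC_IMAGE BOOT_DIR NEW_NAME   (hypothetical helper)
install_renamed_image() {
    src=$1
    boot_dir=$2
    new_name=$3
    dest="$boot_dir/$new_name"

    if [ ! -f "$src" ]; then
        echo "source image not found: $src" >&2
        return 1
    fi

    # Never replace an existing file, so the known-good Image stays bootable.
    if [ -e "$dest" ]; then
        echo "refusing to overwrite existing $dest" >&2
        return 1
    fi

    cp "$src" "$dest"
}

# Usage (paths are assumptions, adjust to your build and boot directories):
#   sudo install_renamed_image ./arch/arm64/boot/Image /boot Image-4.9.253-modified
```

The refusal to overwrite is intentional: if the new kernel fails to boot, the original Image and its boot entry are still untouched.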