How to fix NVML: Driver/library version mismatch _without_ rebooting

My Ubuntu server performed unattended upgrades, and I can no longer use the NVIDIA libraries. I know that rebooting will fix the problem, but I am not able to reboot the machine remotely, so I’d like to find a way to undo the unattended upgrades instead.

Can anyone suggest a solution?

This is a similar question, but no answer was given:

This seems to offer a solution for downgrading:

But I cannot get it to work. My current running driver is 535.104.05, as found by running dmesg:

[1214031.466153] NVRM: API mismatch: the client has the version 535.104.12, but
                 NVRM: this kernel module has the version 535.104.05.  Please
                 NVRM: make sure that this kernel module and all NVIDIA driver
                 NVRM: components have the same version.
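For anyone who wants to pull the two version numbers out of a message like the one above, here is a minimal sketch. The function name `nvrm_versions` is made up for illustration, and it assumes the message wording is exactly as shown in this post.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads kernel log text on stdin and prints the
# userspace (client) and kernel-module versions from an
# "NVRM: API mismatch" message. Assumes the wording shown in this post.
nvrm_versions() {
    local text client kernel
    text=$(cat)
    # Version reported by the userspace libraries (the "client").
    client=$(printf '%s\n' "$text" |
        sed -n 's/.*client has the version \([0-9.]*[0-9]\).*/\1/p' | head -n 1)
    # Version of the kernel module that is currently loaded.
    kernel=$(printf '%s\n' "$text" |
        sed -n 's/.*kernel module has the version \([0-9.]*[0-9]\).*/\1/p' | head -n 1)
    printf 'client=%s kernel=%s\n' "$client" "$kernel"
}
```

Typical use would be `sudo dmesg | nvrm_versions`. The loaded module version can also be read from `/proc/driver/nvidia/version`.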

Trying to re-install this specific driver does not work:

sudo apt install nvidia-driver-535 cuda-drivers=535.104.05-1
Reading package lists... Done
Building dependency tree
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 cuda-drivers : Depends: cuda-drivers-535 (= 535.104.05-1) but 535.104.12-1 is to be installed
E: Unable to correct problems, you have held broken packages.

What can I do? Any suggestion is very welcome.

Decide which driver package you need. Only one of the two, cuda-drivers or nvidia-driver, can be installed at a time.

sudo apt purge DRIVER_TO_DELETE
sudo apt reinstall DRIVER_OF_CHOICE

sudo systemctl isolate multi-user.target

lsmod | grep nvidia    # identify the modules that need to be reloaded
sudo modprobe --remove ALL_MODULES_IDENTIFIED_ABOVE
sudo insmod ALL_MODULES_IDENTIFIED_ABOVE

sudo systemctl isolate graphical.target
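The reload sequence above could be sketched as a small script. This is only an illustration under assumptions: the module list in `NVIDIA_MODULES` and its unload order are a guess at a typical setup (check `lsmod | grep nvidia` on your own machine), the helper names `run` and `reload_nvidia` are invented here, and it loads the module with `modprobe nvidia` because a later reply in this thread reports that this worked where `insmod` did not. By default it only prints the commands; set `DRY_RUN=0` to actually execute them.

```shell
#!/usr/bin/env bash
# Assumed module list, dependents first so each unload can succeed.
# Verify against `lsmod | grep nvidia` before relying on it.
NVIDIA_MODULES="nvidia_drm nvidia_modeset nvidia_uvm nvidia"

# Print the command in dry-run mode (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

reload_nvidia() {
    local m
    # Leave the graphical session so nothing holds the display driver.
    run sudo systemctl isolate multi-user.target
    # Unload the old kernel modules.
    for m in $NVIDIA_MODULES; do
        run sudo modprobe --remove "$m"
    done
    # Load the newly installed module; dependent modules load on demand.
    run sudo modprobe nvidia
    # Return to the graphical session.
    run sudo systemctl isolate graphical.target
}
```

Calling `reload_nvidia` with no environment changes prints the seven commands without touching the system, which makes it easy to review them first. Any process still using the GPU will keep the modules from unloading, so stop those jobs beforehand.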

Thanks! You saved my life.

I had to stop all running calculations first.

insmod did not work, but “modprobe nvidia” did.
