I am unable to boot into Ubuntu 20.04 after an automatic NVIDIA driver update (similar to some other posts here regarding the recent 510 driver updates).
In short:
I have tried with two different kernels: my original kernel (5.13.0-051300-generic) and a recent Liquorix kernel (5.16.0-11.1-liquorix-amd64). In both cases, when I run “sudo ubuntu-drivers autoinstall”, nvidia-dkms-510 tells me that my kernel headers are not supported. (I have also tried 5.4.0.99, but the system doesn’t boot at all with that kernel.)
More detail:
I have fully uninstalled the NVIDIA drivers (afterwards, “dpkg -l | grep nvidia” lists no packages). Then, when I try to install with
sudo ubuntu-drivers autoinstall
it tries to install the 510 driver, but nvidia-dkms-510 tells me that my kernel headers are not supported. The installation then fails because dkms is not configured.
I have also tried the steps given in related posts (I can only post one here), but nothing has worked for me so far.
The kernel 5.13.0-051300-generic was built for 21.10 so is incompatible. Please remove it.
The liquorix kernel is compatible, but its headers package seems to be either broken or not installed at all. Please boot into the liquorix kernel and run: sudo apt install --reinstall linux-headers-$(uname -r)
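A quick way to confirm whether the headers are actually in place for the running kernel is to look at the build link that DKMS compiles against. The following is only a minimal sketch: the helper names `headers_dir` and `have_headers` are made up for illustration, and it assumes the usual Ubuntu layout where `/lib/modules/<release>/build` points at the headers tree.

```shell
#!/bin/sh
# Sketch only: headers_dir and have_headers are hypothetical helper names.
# Assumption: DKMS builds against /lib/modules/<release>/build (usual Ubuntu layout).

# Print the directory DKMS would build against for a given kernel release.
headers_dir() {
    printf '/lib/modules/%s/build\n' "$1"
}

# Succeed if that directory contains a top-level Makefile,
# which a usable headers tree normally has.
have_headers() {
    [ -f "$(headers_dir "$1")/Makefile" ]
}

release="$(uname -r)"
if have_headers "$release"; then
    echo "headers found for $release"
else
    echo "headers missing for $release"
    echo "try: sudo apt install --reinstall linux-headers-$release"
fi
```

If this reports the headers as missing while the package is listed as installed, the package contents are probably incomplete and a reinstall is the natural next step.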
Thanks for your response. Reinstalling fixed this error. But it now gives this error:
ERROR (dkms apport): binary package for evdi: 5.2.14 not found
Installation seems to continue however. So I tried rebooting, but it still gives a blank screen with (maybe unrelated):
USBC000:00: failed to reset PPM!
USBC000:00: PPM init failed (-110)
Running nvidia-smi in tty gives:
Failed to initialize NVML: Driver/library version mismatch
[ 40.265039] NVRM: API mismatch: the client has the version 510.54, but
NVRM: this kernel module has the version 510.47.03. Please
NVRM: make sure that this kernel module and all NVIDIA driver
NVRM: components have the same version.
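The mismatch above means the loaded kernel module and the user-space libraries come from different driver versions. A rough sketch for comparing the two follows. The helper names are hypothetical, the library directory is the usual Ubuntu multiarch path (an assumption), and the parsing assumes `/proc/driver/nvidia/version` contains a line of the form "... Kernel Module  <version>  <date>".

```shell
#!/bin/sh
# Sketch only: module_version and lib_version are hypothetical helper names.

# Read text in the format of /proc/driver/nvidia/version on stdin
# and print the kernel module version found in it.
module_version() {
    sed -nE 's/.*Kernel Module[[:space:]]+([0-9][0-9.]*).*/\1/p' | head -n 1
}

# Print the version suffix of a library path such as libnvidia-ml.so.510.47.03.
lib_version() {
    printf '%s\n' "${1##*.so.}"
}

if [ -r /proc/driver/nvidia/version ]; then
    kmod="$(module_version < /proc/driver/nvidia/version)"
    echo "loaded kernel module: $kmod"
    # Assumption: user-space libraries live in the Ubuntu multiarch directory.
    for lib in /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.*.*; do
        [ -e "$lib" ] || continue
        user="$(lib_version "$lib")"
        if [ "$user" = "$kmod" ]; then
            echo "match:    $lib"
        else
            echo "MISMATCH: $lib (library $user, module $kmod)"
        fi
    done
else
    echo "nvidia kernel module is not loaded"
fi
```

Any line marked MISMATCH points at a library file left behind by a different driver version than the module that is currently loaded.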
Seems you have installed different driver versions over one another. Please uninstall the runfile version and check if you can purge/reinstall the repo packages.
I ran <runinstall_file> --uninstall (which is the 510.47 version, it seems that that is also the driver being installed by ubuntu-drivers autoinstall).
It tells me there is no nvidia driver installed, then exits.
I cannot find any trace of the 510.54 driver. Is there any other way I can find / remove it?
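One way to hunt for leftovers is to search the library directories for file names containing the stray version, then ask dpkg whether any package owns them. This is a sketch under assumptions: `find_driver_files` is a hypothetical helper, and the directories searched are only guesses at where such files might be.

```shell
#!/bin/sh
# Sketch only: find_driver_files is a hypothetical helper name.

# find_driver_files VERSION DIR...
# List files and directories whose names contain the given driver version.
find_driver_files() {
    version="$1"
    shift
    find "$@" -name "*$version*" 2>/dev/null
}

# Assumption: leftovers, if any, are somewhere under these directories.
if command -v dpkg >/dev/null 2>&1; then
    find_driver_files 510.54 /usr/lib /usr/lib32 | while IFS= read -r f; do
        # dpkg -S fails when no installed package claims the path.
        dpkg -S "$f" >/dev/null 2>&1 || echo "not owned by any package: $f"
    done
else
    find_driver_files 510.54 /usr/lib /usr/lib32
fi
```

Paths reported as not owned by any package are candidates for files written by a runfile installer. Note that dpkg records paths as packaged, so on a system where /lib is a symlink to /usr/lib, a file can show up as unowned under /usr/lib while being owned under /lib; it is worth checking both spellings before deleting anything.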
I wonder where that comes from. Maybe purge the packages first, then use the 510.54 runfile installer to overwrite the .54 version, then use it again to uninstall, check if all files are gone, then install the repo driver again.
I tried your suggestion and it did install (giving an error, “binary package for nvidia: 510.54 not found”, but seemingly installing correctly). Uninstalling then finds the driver and completes successfully, but the files remain in the last two locations… (example of usr/lib/… below). So it does not seem to remove the files related to 510.54 properly.
Thanks, your suggestion worked and the .54 is now removed. I ran sudo ubuntu-drivers autoinstall to install the 510.47 driver.
However, it still doesn’t boot (just a blank screen and the blinking white bar again), although the NVIDIA driver seems to be installed now: nvidia-smi reports the 510.47 driver working correctly.
Hi, thanks for your support! Most parts of my system are up and running again with Nvidia drivers. Thanks!
I am, however, still stuck on one issue with regard to the NVIDIA OpenCL libraries. Some applications are throwing: error while loading shared libraries: libOpenCL.so.1: cannot open shared object file: No such file or directory
It seems that I am missing libOpenCL.so.1.0.0. There is a symlink, but the actual library is missing.
Looking online, it seems that it should be installed with the libnvidia-compute-510 package, but it is missing for me. Is there anything I can do to fix it?
That’s only the OpenCL loader; it doesn’t belong to the NVIDIA driver but to ocl-icd. It’s just missing the compatibility link: sudo ln -s /usr/lib/x86_64-linux-gnu/libOpenCL.so.1.0.0 /usr/lib/x86_64-linux-gnu/libOpenCL.so.1
I have tried your solution. The problem is that there is no file /usr/lib/x86_64-linux-gnu/libOpenCL.so.1.0.0, i.e., the library is missing as in the image below:
I have tried to reinstall ocl-icd-opencl-dev, but that only seems to recreate the symbolic link libOpenCL.so without providing the library (libOpenCL.so.1.0.0) itself. Do you know how I can retrieve the underlying library?
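For what it is worth, the symptom described (a link that exists but points at nothing) can be confirmed with a small check like the one below. It is a sketch, not a fix: `dangling_link` is a hypothetical helper, and the suggestion that the runtime library ships in the `ocl-icd-libopencl1` package rather than in `ocl-icd-opencl-dev` is an assumption worth verifying with `dpkg -S` or `apt-file` on the affected system.

```shell
#!/bin/sh
# Sketch only: dangling_link is a hypothetical helper name.

# Succeed if PATH is a symlink whose target does not exist.
dangling_link() {
    [ -L "$1" ] && [ ! -e "$1" ]
}

loader=/usr/lib/x86_64-linux-gnu/libOpenCL.so.1

if dangling_link "$loader"; then
    echo "$loader is a dangling symlink to $(readlink "$loader")"
    # Assumption: the runtime loader library is shipped by ocl-icd-libopencl1.
    echo "try: sudo apt install --reinstall ocl-icd-libopencl1"
elif [ -e "$loader" ]; then
    echo "$loader resolves to $(readlink -f "$loader")"
else
    echo "$loader does not exist"
fi
```

If the link turns out to be dangling, reinstalling whichever package owns the real library file should restore it; the -dev package normally carries only the unversioned libOpenCL.so link and headers.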