Ubuntu 22.04.1 Nvidia Driver (Open Kernel) Nvidia-Driver-515-Open Issue

sudo ubuntu-drivers autoinstall
[sudo] password for username:
Traceback (most recent call last):
  File "/usr/bin/ubuntu-drivers", line 513, in <module>
    greet()
  File "/usr/lib/python3/dist-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/usr/lib/python3/dist-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/lib/python3/dist-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/python3/dist-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/click/decorators.py", line 84, in new_func
    return ctx.invoke(f, obj, *args, **kwargs)
  File "/usr/lib/python3/dist-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/usr/bin/ubuntu-drivers", line 432, in autoinstall
    command_install(config)
  File "/usr/bin/ubuntu-drivers", line 187, in command_install
    UbuntuDrivers.detect.nvidia_desktop_pre_installation_hook(to_install)
  File "/usr/lib/python3/dist-packages/UbuntuDrivers/detect.py", line 839, in nvidia_desktop_pre_installation_hook
    with_nvidia_kms = version >= 470
UnboundLocalError: local variable 'version' referenced before assignment

1. When I run sudo ubuntu-drivers autoinstall, this error appears.
2. When I choose “Using NVIDIA (open kernel) metapackage from nvidia-driver-515-open (proprietary, tested)”, my PC hangs after restart and I cannot do anything.
3. To recover, I need to reinstall Ubuntu or, with luck, boot into recovery mode and switch back to the default driver.
nvidia-bug-report.log.gz (165.9 KB)

Hello,
I tried to install the NVIDIA driver on 22.04 today as well and ran into the same or a similar error.
What I did was run “sudo apt-get upgrade” and restart. Afterwards the error went away and I could install the driver, though I did not use “autoinstall”; instead I went to “Additional Drivers” and selected the newer package “Nvidia-515” from there.
After the restart, check whether the desktop loads; if it does not, change the session from Wayland to Xorg, which helped in my case (the Unity desktop would not load properly with the NVIDIA driver either).

Hello, I also see the same error on Kubuntu 22.04 when running

sudo ubuntu-drivers install

I ran into the same issue after installing 515 drivers for my RTX A4000.
I was able to make my machine boot again by disabling Secure Boot (not sure if that was necessary) and changing a setting in the “Storage” section of the BIOS: I changed “RAID On” to “AHCI/NVME”. Now it boots again and I can use the GPU for computations and such things. But now my sound card and Bluetooth no longer work, so I would appreciate a better solution.

This is preventing me from upgrading to the 520 driver, which appears to be required for the 5.17.0 Ubuntu kernel. X starts and then hangs when booting the /boot/vmlinuz-5.17.0-1020-oem kernel.

I have the same issue: the system won’t boot with the NVIDIA drivers, but if I use Xorg it boots fine. I am on Kubuntu 22.04 LTS.
I reverted to a RAID 1 backup from two weeks ago; it worked fine until I updated the system.

Found workaround.

Hi there, we came across this issue during an install of a new Ubuntu.

The error comes from the way the installer tries to parse the version (an int) from the driver name. The code assumes the version is the last part of the driver name, which in most cases is fine, e.g. nvidia-driver-520. In my scenario the driver name was nvidia-driver-520-open, so the code ends up trying to parse “open” as an int and fails. Because version is then never assigned, the UnboundLocalError is thrown a few lines later.
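To make the failure mode concrete, here is a minimal reproduction, offered as a sketch only. The function name kms_check_sketch is hypothetical, and the body is an assumption about the shape of the code rather than a copy of detect.py: it assumes the failed int() conversion is caught and ignored, which would leave version unassigned when it is compared against 470.

```python
def kms_check_sketch(package_name):
    """Hypothetical stand-in for the version check that fails in detect.py."""
    try:
        # Takes the LAST dash-separated field, which is "open" for "-open" packages.
        version = int(package_name.split('-')[-1])
    except ValueError:
        # Assumption: the parse error is swallowed, so `version` stays unbound.
        pass
    return version >= 470


# A plain driver name parses without trouble.
print(kms_check_sketch("nvidia-driver-520"))

# An "-open" driver name reproduces the error from the traceback.
try:
    kms_check_sketch("nvidia-driver-515-open")
except UnboundLocalError as exc:
    print("UnboundLocalError:", exc)
```

The exact wording of the exception message differs between Python versions, but the exception type is the same.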

Solution:
This will only work if you have access to /usr/lib/python3/dist-packages/UbuntuDrivers and permission to edit it (root, e.g. via sudo).
Offending line: /usr/lib/python3/dist-packages/UbuntuDrivers/detect.py:835
Find the following line:
version = int(package_name.split('-')[-1])
Modify it to:
version = int(package_name.split('-')[2])
Note that this assumes the version is the 3rd word in the driver name when split by ‘-’.
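For anyone who would rather not depend on the version sitting at a fixed position, a slightly more defensive parse could look like the sketch below. This is not the code shipped in ubuntu-drivers; parse_driver_version is a hypothetical helper, and it assumes the version is the first purely numeric field in the package name.

```python
import re


def parse_driver_version(package_name):
    """Return the first all-digit dash-separated field as an int, or None.

    Hypothetical helper: assumes names such as nvidia-driver-520 or
    nvidia-driver-520-open, where exactly one field is the version.
    """
    for field in package_name.split('-'):
        if re.fullmatch(r"[0-9]+", field):
            return int(field)
    return None


for name in ("nvidia-driver-520", "nvidia-driver-520-open", "nvidia-driver-open"):
    print(name, "->", parse_driver_version(name))
```

Returning None instead of raising leaves it to the caller to decide what to do with a name that has no version in it, which avoids the unassigned-variable situation entirely.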

Understandably, issues like this can crop up from time to time when values are parsed from names alone.
A long-term fix would need insight into the naming conventions for current and future driver packages.

I’m having the exact same issue. I updated when prompted, only to get a black screen, no boot scenario afterwards. I’ve reinstalled Ubuntu literally about 50 times now and tried to install every available driver, with every possible installer method. All result in black screen after reboot. Nvidia drivers cannot currently be installed on Ubuntu Jellyfish.

I might have to install WINDOWS to WORK 😭

This thread has given me peace however. I will no longer chase a solution to an unfixable problem.

I’ll check back next driver update. 🙏

Same over here. Please don’t tell me everyone is giving up on this issue?
I just purchased this laptop for the sole purpose of utilising the GPU to speed up ML model training, and now you’re saying this is not going to happen? I would rather burn this laptop than work on Windows.

Has anyone attempted the driver edit mentioned above? I tried to edit the file “/usr/lib/python3/dist-packages/UbuntuDrivers/detect.py”, but was denied permission, and after a few other attempts at driver installations and workarounds, my whole system is very laggy and my file explorer won’t open at all.

I will attempt a few other things, before possibly having to re-install Ubuntu.

[Update]
Like @levimulkey, I have tried every combination of driver installations I can find. While I did manage to install both the “nvidia-driver-520-open” and “nvidia-driver-520” drivers via “Software & Updates”, running “nvidia-smi” still gives a “No Driver Found” message.

Via the command line, I managed to install the Nvidia driver/package which I opened and registered on, and even found version 11.8 was installed with nvidia-smi. Alas the CUDA toolkit was still not installed. So after installing, the Nvidia driver was once again not found, or at least not recognised by a python test script.

The purpose of this driver installation is not for gaming and HAS TO function with CUDA/CUDNN, which seem to be clashing with each other.

My entire system has black-screened multiple times as well as file-explorer freezing permanently, and I have also re-installed Ubuntu many times now.

Is it my turn to give up on this as well and go over to my nightmare OS? Has Nvidia always been Windows focussed?

I’m running Ubuntu 22.04 with the standard 5.15.0.52 kernel, and both the 515 and the 520 drivers (NOT the 515-open or 520-open) from Ubuntu’s standard repositories work fine on GTX 970 and RTX 3080 hardware. I use Software & Updates / Additional Drivers to select the driver, then reboot. On a fresh Ubuntu install, if you don’t elect to install the Nvidia drivers during the install, the first boot will need the word “nomodeset” added to the kernel parameters in the GRUB boot menu so that the nouveau driver does not produce a black screen. Then run Software & Updates to select a driver, after which “nomodeset” is no longer needed.
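If you are unsure whether “nomodeset” is still active on the running system, the kernel command line can be read from /proc/cmdline. The sketch below is only an illustration: has_kernel_param is a hypothetical helper, and the sample command line is made up for the example.

```python
import os


def has_kernel_param(cmdline, param):
    """Return True if `param` appears as a whole word in a kernel command line."""
    return param in cmdline.split()


# Made-up sample command line, for illustration only.
sample = "BOOT_IMAGE=/boot/vmlinuz root=/dev/sda2 ro quiet splash nomodeset"
print(has_kernel_param(sample, "nomodeset"))

# On a Linux system, check the command line the kernel was actually booted with.
if os.path.exists("/proc/cmdline"):
    with open("/proc/cmdline") as handle:
        print("nomodeset active:", has_kernel_param(handle.read(), "nomodeset"))
```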

With the Nvidia driver installed and working, I’d suggest using the Nvidia …run script to install CUDA. Uncheck any offer of an Nvidia driver, and use the options to skip any system location for libs. Turn off the icon option too. Then you may take ownership of /usr/local and run the …run script as a normal user, not an admin. After installation, everything will be in /usr/local/cuda-11.8, so follow the recommendations to add those locations to your PATH and LD_LIBRARY_PATH.
Restore the permissions on /usr/local, and you should have a working CUDA that is independent of the Nvidia driver in use, and works through kernel updates (which rebuild the driver).
Search the askubuntu.com site for more installation answers.
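As a rough sketch of the PATH and LD_LIBRARY_PATH step, the snippet below builds an environment with the CUDA directories prepended. It assumes the usual bin and lib64 subdirectories under /usr/local/cuda-11.8; cuda_env is a hypothetical helper, and in practice most people simply add the equivalent export lines to their shell profile.

```python
import os


def cuda_env(cuda_home="/usr/local/cuda-11.8", base_env=None):
    """Return a copy of the environment with CUDA's bin and lib64 prepended.

    Assumption: the toolkit uses the bin/ and lib64/ layout under cuda_home.
    """
    env = dict(os.environ if base_env is None else base_env)

    def prepend(var, entry):
        # Drop empty entries and any existing copy of `entry`, then put it first.
        parts = [p for p in env.get(var, "").split(os.pathsep) if p and p != entry]
        env[var] = os.pathsep.join([entry] + parts)

    prepend("PATH", os.path.join(cuda_home, "bin"))
    prepend("LD_LIBRARY_PATH", os.path.join(cuda_home, "lib64"))
    return env


env = cuda_env(base_env={"PATH": "/usr/bin"})
print(env["PATH"])
print(env["LD_LIBRARY_PATH"])
```

The returned dictionary can be passed as env= to subprocess.run to launch a tool with the CUDA paths set, without touching the global shell configuration.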

Okay so just a final follow-up with all the steps taken to fully solve the driver to CUDA issue on my system:
Hope it leads someone in the right direction.

System Specs:

MSI - 11th Gen Intel® Core™ i7-11800H @ 2.30GHz × 16
NVIDIA Corporation GA106M [GeForce RTX 3060 Mobile
16 GB RAM running Ubuntu 22.04

STEPS TAKEN:

Install Anaconda:

sudo apt install curl
mkdir tmp
cd tmp
curl -O https://repo.anaconda.com/archive/Anaconda3-2022.10-Linux-x86_64.sh
bash Anaconda3-2022.10-Linux-x86_64.sh
source ~/.bashrc

conda -V
conda update conda
conda update anaconda

Install Nvidia Driver:

V01 [2021]:

sudo ubuntu-drivers list				# List available drivers
sudo ubuntu-drivers install nvidia:515 	# Install specified driver
reboot

Create and run Conda Virtual with Spyder 5:

INSTALL

conda create -c conda-forge -n spyder-env spyder numpy scipy pandas matplotlib sympy cython

USE CONDA-FORGE:

conda activate spyder-env
conda config --env --add channels conda-forge
conda config --env --set channel_priority strict

UPDATE:

conda update -n base conda
conda activate spyder-env
conda update spyder

CHECK:

conda list Spyder$

Install CUDA Toolkit:

Install conda-forge:

conda config --add channels conda-forge
conda config --set channel_priority strict

Install CUDA & CUDNN:
With CONDA:

# - LIST -
conda search cudatoolkit --channel conda-forge
conda search cudnn --channel conda-forge
# - Install -
conda install cudatoolkit					# 'nvcc --version' still not found
conda install cudnn

Install missing packages:

conda install pytorch
#   Included: cudatoolkit / cudnn / matplotlib / pandas / numpy
#   Not included: torchvision / tensorflow-gpu / sklearn / keras
conda install torchvision
conda install tensorflow-gpu				# GPU: 1
conda install -c conda-forge scikit-learn

================================================================

Check Installations:

================================================================
conda activate spyder-env

Driver Check:

nvidia-smi
cat /proc/driver/nvidia/version

CUDA Check:

conda list cudatoolkit		# 11.7.0
conda list cudnn			# 8.4.1.50
nvcc --version				# not found

TensorFlow Check:

import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))

PyTorch CUDA Check:

>>> import torch
>>> print(torch.version.cuda)	# 11.2

Sub-Packages Check:

import tensorflow as tf
import keras
import torch
import sklearn
import matplotlib
import pandas
from torchvision.models import resnet50
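The import checks above can also be run in one pass so that a single broken package does not stop the rest. This is only a sketch; check_imports is a hypothetical helper built on the standard importlib module.

```python
import importlib


def check_imports(module_names):
    """Try to import each module and return a {name: status} dictionary."""
    status = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            status[name] = "ok"
        except Exception as exc:
            # A broken install can raise more than ImportError (e.g. OSError
            # for a missing shared library), so every exception is recorded.
            status[name] = f"failed: {type(exc).__name__}: {exc}"
    return status


# Demonstration with a standard-library module and a name that does not exist.
for name, result in check_imports(["json", "no_such_module_for_this_sketch"]).items():
    print(f"{name}: {result}")
```

Inside the activated spyder-env, the same function can be called with ["tensorflow", "keras", "torch", "sklearn", "matplotlib", "pandas", "torchvision"] to cover the packages listed above.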

================================================================

Final Results:

================================================================
cat /proc/driver/nvidia/version

NVRM version: NVIDIA UNIX x86_64 Kernel Module  515.65.01  Wed Jul 20 14:00:58 UTC 2022
GCC version:

nvidia-smi

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.65.01    Driver Version: 515.65.01    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   30C    P0    22W /  N/A |      5MiB /  6144MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1922      G   /usr/lib/xorg/Xorg                  4MiB |
+-----------------------------------------------------------------------------+
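One point that is easy to misread in this output: the “CUDA Version” in the nvidia-smi header is the highest CUDA version the installed driver supports, not a sign that a CUDA toolkit is installed. The sketch below just pulls the two numbers out of the header line; parse_smi_header is a hypothetical helper and the regular expression assumes the header layout shown above.

```python
import re

# Assumes the header layout shown in the nvidia-smi output above.
HEADER_RE = re.compile(
    r"Driver Version:\s*(?P<driver>[\d.]+)\s+CUDA Version:\s*(?P<cuda>[\d.]+)"
)


def parse_smi_header(text):
    """Return (driver_version, max_supported_cuda) as strings, or None."""
    match = HEADER_RE.search(text)
    if match is None:
        return None
    return match.group("driver"), match.group("cuda")


sample = "| NVIDIA-SMI 515.65.01    Driver Version: 515.65.01    CUDA Version: 11.7     |"
print(parse_smi_header(sample))  # ('515.65.01', '11.7')
```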

conda list cudatoolkit
conda list cudnn

# Name                    Version                   Build  Channel
cudatoolkit               11.7.0              hd8887f6_10    conda-forge
cudnn                     8.4.1.50             hed8a83a_0    conda-forge

nvcc --version

Command 'nvcc' not found, but can be installed with:
sudo apt install nvidia-cuda-toolkit
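The missing nvcc is expected with this setup as far as I can tell: the conda-forge cudatoolkit package provides the CUDA runtime libraries, not the nvcc compiler, so frameworks can use the GPU even though nvcc is absent. A quick way to check whether nvcc is reachable is sketched below; find_nvcc is a hypothetical helper, and the extra directory is only an example location.

```python
import os
import shutil


def find_nvcc(extra_dirs=()):
    """Return the full path to nvcc if it can be found, else None."""
    found = shutil.which("nvcc")  # search the current PATH first
    if found:
        return found
    for directory in extra_dirs:
        # shutil.which accepts an explicit search path through `path=`.
        candidate = shutil.which("nvcc", path=directory)
        if candidate:
            return candidate
    return None


# Example extra location; adjust to wherever a toolkit was installed, if any.
location = find_nvcc(extra_dirs=[os.path.join("/usr/local/cuda-11.8", "bin")])
print(location if location else "nvcc not found")
```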

In Python, the GPU is now recognised:

import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))

Returns: Num GPUs Available:  1

There are 2 issues here.

  1. The ubuntu-drivers script mistakenly parses “open” as an int (solution discussed above).
  2. Even when the driver is installed through the Additional Drivers tab or after fixing the script, I ended up with a kernel panic: “Out of Memory … press any key to continue”. The only option was to reboot, get to the GRUB menu and choose an older kernel version, then revert to the Xorg nouveau driver.

For issue 2, downgrading to Nvidia driver 470 worked for me.

I solved the problem by moving to a newer distro of Ubuntu.

@levimulkey
Could you please share which distro you moved to?
Does that mean you are able to use the latest 520 drivers on that distro?

There are ongoing threads on Launchpad for these issues in Ubuntu.

After downgrading to 470, I tried again from the GUI (Additional Drivers) to move to 515. This worked!

Note: I followed these steps to handle a suspend issue with 470 before trying 515 again.

sudo systemctl stop nvidia-suspend.service
sudo systemctl stop nvidia-hibernate.service
sudo systemctl stop nvidia-resume.service

sudo systemctl disable nvidia-suspend.service
sudo systemctl disable nvidia-hibernate.service
sudo systemctl disable nvidia-resume.service

sudo rm /lib/systemd/system-sleep/nvidia