Hello! I tried installing CUDA with the runfile, but it failed. I have kept the logs, but I don’t understand what is wrong, other than possibly that 1) it doesn’t work with the OEM kernel, 2) my kernel was compiled with a different version of cc than the one used by the script, or 3) I passed the -m=kernel-open option.
The cuda_installer.log is reasonably short, so I include it here:
[INFO]: Adding driver option -m=kernel-open
[INFO]: Driver not installed.
[INFO]: Checking compiler version...
[INFO]: gcc location: /usr/bin/gcc
[INFO]: gcc version: gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04)
[INFO]: Initializing menu
[INFO]: nvidia-fs.setKOVersion(2.18.3)
[INFO]: Setup complete
[INFO]: Installing: Driver
[INFO]: Installing: 545.23.08
[INFO]: Executing NVIDIA-Linux-x86_64-545.23.08.run --ui=none --no-questions --accept-license --disable-nouveau --no-cc-version-check --install-libglvnd --kernel-module-build-directory=kernel-open 2>&1
[INFO]: Finished with code: 256
[ERROR]: Install of driver component failed. Consult the driver log at /var/log/nvidia-installer.log for more details.
[ERROR]: Install of 545.23.08 failed, quitting
The nvidia-installer.log is over 175k, so I’ll upload it, and hope that’s ok…
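Since the log is large, here is a minimal sketch of how the failing steps could be pulled out of it. It assumes the driver installer prefixes failures with `ERROR:`; the function name `installer_errors` is my own and is not part of any NVIDIA tool.

```shell
#!/bin/sh
# Print every "ERROR:" line from an installer log, with its line number
# and up to three lines of trailing context.
# The log path defaults to the location named in cuda_installer.log.
installer_errors() {
    log=${1:-/var/log/nvidia-installer.log}
    grep -n -A 3 'ERROR:' "$log"
}

# Only run against the real log if it exists and is readable.
if [ -r /var/log/nvidia-installer.log ]; then
    installer_errors || echo "no ERROR lines found"
fi
```

Swapping `'ERROR:'` for `'WARNING:'` in the same function would list the warnings instead, which may help separate the compiler-version warnings from the step that actually aborted the build.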
There are warnings that my cc is a different version from the one the kernel was compiled with, but I can’t tell whether that is the actual error or something else.
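To make the compiler comparison concrete, this is a rough sketch of how the two versions could be read side by side. It assumes the kernel's build compiler is recorded in /proc/version and that the version appears as X.Y.Z after the word "gcc"; the helper name `first_gcc_version` is hypothetical.

```shell
#!/bin/sh
# Read text on stdin and print the first X.Y.Z version number that
# follows the last occurrence of "gcc" on a line containing "gcc".
first_gcc_version() {
    sed -n 's/.*gcc//p' | grep -o -E '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1
}

# Only run the comparison when both sources are available.
if [ -r /proc/version ] && command -v gcc >/dev/null 2>&1; then
    kernel_cc=$(first_gcc_version < /proc/version)
    system_cc=$(gcc --version | first_gcc_version)
    echo "kernel built with gcc: ${kernel_cc:-unknown}"
    echo "gcc on PATH:           ${system_cc:-unknown}"
fi
```

If the two numbers differ, that would at least confirm what the warnings say. Whether the mismatch is what makes the module build fail is a separate question that this check cannot answer.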
Could the error be triggered by the option -m=kernel-open? If I skip this flag, will I still be able to compile CUDA programs, or what is this needed for?
Best regards, David
nvidia-installer.log (171.1 KB)
Some system info, and more detail on what I did:
- ubuntu release *ok*
22.04.3
- kernel *ok?*
6.1.0-1020-oem
- gcc version *ok*
gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
- glibc *ok*
2.35
sudo apt install linux-headers-$(uname -r)
- log
[...]
linux-headers-6.1.0-1020-oem is already the newest version (6.1.0-1020.20).
linux-headers-6.1.0-1020-oem set to manually installed.
0 upgraded, 0 newly installed, 0 to remove and 123 not upgraded.
sudo apt install linux-libc-dev
- log
[...]
The following packages will be upgraded:
linux-libc-dev
[...]
Preparing to unpack .../linux-libc-dev_5.15.0-94.104_amd64.deb ...
Unpacking linux-libc-dev:amd64 (5.15.0-94.104) over (5.15.0-91.101) ...
Setting up linux-libc-dev:amd64 (5.15.0-94.104) ...
wget https://developer.download.nvidia.com/compute/cuda/12.3.2/local_installers/cuda_12.3.2_545.23.08_linux.run
I booted into the equivalent of runlevel 3 (multi-user.target, since runlevels are replaced by systemd targets in modern Ubuntu):
sudo systemctl set-default multi-user.target
reboot
Then I disabled the nouveau driver with:
sudo bash -c "echo blacklist nouveau > /etc/modprobe.d/blacklist-nvidia-nouveau.conf"
sudo bash -c "echo options nouveau modeset=0 >> /etc/modprobe.d/blacklist-nvidia-nouveau.conf"
cat /etc/modprobe.d/blacklist-nvidia-nouveau.conf
blacklist nouveau
options nouveau modeset=0
sudo update-initramfs -u
reboot
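For completeness, a small sketch of how one could confirm after the reboot that the blacklist took effect. It assumes loaded modules are listed one per line in /proc/modules with the module name first; `module_loaded` is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed if a module with exactly the given name is listed in the
# modules file (defaults to /proc/modules, where each line starts with
# the module name followed by a space).
module_loaded() {
    name=$1
    modules_file=${2:-/proc/modules}
    grep -q "^${name} " "$modules_file"
}

# Report the state of nouveau on the running system, if it can be read.
if [ -r /proc/modules ]; then
    if module_loaded nouveau; then
        echo "nouveau is still loaded"
    else
        echo "nouveau is not loaded"
    fi
fi
```

This is the same information `lsmod | grep nouveau` would give, just wrapped so it returns a clean exit status.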
In runlevel 3 I did:
sudo sh cuda_12.3.2_545.23.08_linux.run -m=kernel-open
After this, the installer failed with the errors shown in the log above, and I have no clue how to proceed.