I am using Ubuntu 22.04 LTS with an RTX A5000 GPU. I am having issues with the NVIDIA driver from the apt repository: it restarts my PC even when I am only plotting graphs, and it does not recognise my GPU's name in "Software & Updates".
The same is true for both the 515 and 510 drivers. However, nvidia-smi does recognise my GPU's name:
╭─ ~/Downloads 1 ✘ 55s base
╰─ nvidia-smi
Mon Oct 10 16:03:39 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.85.02    Driver Version: 510.85.02    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA RTX A5000    Off  | 00000000:65:00.0  On |                    0 |
| 30%   35C    P8    26W / 230W |    614MiB / 23028MiB |     14%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1784      G   /usr/lib/xorg/Xorg                 92MiB |
|    0   N/A  N/A      2679      G   /usr/lib/xorg/Xorg                277MiB |
|    0   N/A  N/A      2954      G   /usr/bin/gnome-shell              101MiB |
|    0   N/A  N/A      3402      G   ...AAAAAAAAA= --shared-files       31MiB |
|    0   N/A  N/A      5879      G   ...174681230305428951,131072       84MiB |
+-----------------------------------------------------------------------------+
I decided to install the latest driver from the official website instead, but the installer fails with an error. Here is the log file:
nvidia-installer log file '/var/log/nvidia-installer.log'
creation time: Mon Oct 10 15:52:14 2022
installer version: 515.76
PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin
nvidia-installer command line:
./nvidia-installer
Using: nvidia-installer ncurses v6 user interface
-> Detected 16 CPUs online; setting concurrency level to 16.
ERROR: An NVIDIA kernel module 'nvidia-drm' appears to already be loaded in your kernel. This may be because it is in use (for example, by an X server, a CUDA program, or the NVIDIA Persistence Daemon), but this may also happen if your kernel was configured without support for module unloading. Please be sure to exit any programs that may be using the GPU(s) before attempting to upgrade your driver. If no GPU-based programs are running, you know that your kernel supports module unloading, and you still receive this message, then an error may have occurred that has corrupted an NVIDIA kernel module's usage count, for which the simplest remedy is to reboot your computer.
ERROR: Installation has failed. Please see the file '/var/log/nvidia-installer.log' for details. You may find suggestions on fixing installation problems in the README available on the Linux driver download page at www.nvidia.com.
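The first error in the log says that the 'nvidia-drm' module is still loaded. As a rough way to confirm that before re-running the installer, here is a minimal Python sketch that lists the loaded NVIDIA kernel modules and their use counts. It assumes the usual /proc/modules layout (module name in the first field, use count in the third), and it relies on the kernel reporting module names with underscores, so 'nvidia-drm' would show up as nvidia_drm. The function name loaded_nvidia_modules is made up for this illustration and is not part of any NVIDIA tool.

```python
from pathlib import Path


def loaded_nvidia_modules(proc_modules_text):
    """Return {module_name: use_count} for loaded modules whose name starts with 'nvidia'.

    Assumes the /proc/modules layout: name, size, use count, dependents, state, address.
    """
    found = {}
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, use_count = fields[0], fields[2]
        if name.startswith("nvidia") and use_count.isdigit():
            found[name] = int(use_count)
    return found


if __name__ == "__main__":
    path = Path("/proc/modules")
    # Fall back to an empty listing on systems without /proc/modules.
    text = path.read_text() if path.exists() else ""
    modules = loaded_nvidia_modules(text)
    if not modules:
        print("No NVIDIA kernel modules appear to be loaded.")
    for name, count in sorted(modules.items()):
        print(f"{name}: use count {count}")
```

If nvidia_drm is listed with a non-zero use count, something (most likely the X server shown in the nvidia-smi output above) is still holding the module, which would match the installer's error message.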