Unable to install NVIDIA drivers for 3090 on Ubuntu 20.04

kenrick_fernandes · December 1, 2022, 7:51pm

Hello,

I have a 3090 installed on a machine with Ubuntu 20.04. I have been trying to install Nvidia drivers (both manually, using the .run file, and through “Software and Updates”), but cannot get the drivers to work.

My kernel version is 5.15.0-53-generic.
I have tried the 520, 515 and 510 driver versions (both the open-kernel and metapackage variants), as well as the 515.76 .run file from the official drivers website.
With the 515 and 520 open-kernel versions, the system outputs this when running nvidia-smi:

NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

With the other (non-open-kernel) versions, the system boots to a black screen.

I have tried several things, including the solution from the thread "Cannot get nvidia driver (520, 515, 515-open, or 510) working in Ubuntu 22.10", but to no avail.

I have blacklisted Nouveau and followed all the other steps mentioned in that thread as well, but none of them worked. (I have not yet disabled Secure Boot, though.)
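Since Secure Boot is the one item not yet ruled out, its state can be checked from the running system before changing anything in the firmware. This is a minimal Python sketch, assuming the mokutil tool is installed; the helper names parse_sb_state and secure_boot_enabled are made up for illustration.

```python
import shutil
import subprocess
from typing import Optional


def parse_sb_state(output: str) -> Optional[bool]:
    """Map `mokutil --sb-state` output to True/False, or None if unrecognised."""
    text = output.lower()
    if "secureboot enabled" in text:
        return True
    if "secureboot disabled" in text:
        return False
    return None


def secure_boot_enabled() -> Optional[bool]:
    """Run `mokutil --sb-state`; return None when the state cannot be determined."""
    if shutil.which("mokutil") is None:
        return None  # assumption: mokutil is on PATH; if not, we cannot tell
    try:
        result = subprocess.run(
            ["mokutil", "--sb-state"],
            capture_output=True,
            text=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return parse_sb_state(result.stdout + result.stderr)
```

Calling secure_boot_enabled() returns True, False, or None. None only means the state could not be read (for example on a non-UEFI boot), not that Secure Boot is off.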

What process should I follow to install nvidia-drivers 515 or above on my system?

kenrick_fernandes · December 4, 2022, 3:14pm

Here is some additional information.

Currently, I have the 515 open-kernel drivers installed and the system boots correctly, but running nvidia-smi gives the error below. The same happens with the 520 open-kernel version:

Unable to determine the device handle for GPU 0000:01:00.0: Not Found

There is no /etc/modprobe.d/nvidia-graphics-drivers-kms.conf on my system.

These are the contents of /lib/modprobe.d/nvidia-kms.conf:

# This file was generated by nvidia-prime
# Set value to 1 to enable modesetting
options nvidia-drm modeset=1
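To double-check which module options a modprobe.d file actually sets, the options line can be parsed rather than read by eye. The sketch below is only an illustration: parse_modprobe_options is a hypothetical helper, and it handles just the simple `options <module> key=value` form shown above.

```python
from typing import Dict


def parse_modprobe_options(text: str) -> Dict[str, Dict[str, str]]:
    """Collect `options <module> key=value ...` lines, ignoring comments."""
    found: Dict[str, Dict[str, str]] = {}
    for raw in text.splitlines():
        # Drop anything after '#', then split on whitespace.
        parts = raw.split("#", 1)[0].split()
        if len(parts) < 2 or parts[0] != "options":
            continue
        # modprobe treats '-' and '_' in module names as equivalent.
        module = parts[1].replace("-", "_")
        params = found.setdefault(module, {})
        for token in parts[2:]:
            key, _, value = token.partition("=")
            params[key] = value
    return found


SAMPLE = """\
# comment lines are skipped
options nvidia-drm modeset=1
"""

print(parse_modprobe_options(SAMPLE))
```

For a real check, the text could be read from /lib/modprobe.d/nvidia-kms.conf instead of SAMPLE. Note that this only shows what is requested in the config file, not what the loaded module ended up using.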

The output of nvidia-settings :

ERROR: A query to find an object was unsuccessful
ERROR: Unable to load info from any available system
(nvidia-settings:9774): GLib-GObject-CRITICAL **: 10:06:17.428: g_object_unref: assertion 'G_IS_OBJECT (object)' failed
** Message: 10:06:17.430: PRIME: Requires offloading
** Message: 10:06:17.430: PRIME: is it supported? yes
** Message: 10:06:17.447: PRIME: Usage: /usr/bin/prime-select nvidia|intel|on-demand|query
** Message: 10:06:17.447: PRIME: on-demand mode: "1"
** Message: 10:06:17.447: PRIME: is "on-demand" mode supported? yes

The output of dmesg | grep nvidia :

[    1.759777] nvidia: loading out-of-tree module taints kernel.
[    1.762247] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[    1.804698] nvidia-nvlink: Nvlink Core is being initialized, major device number 510
[    1.805227] nvidia 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[    1.862602] nvidia-modeset: Loading NVIDIA UNIX Open Kernel Mode Setting Driver for x86_64  515.65.01  Release Build  (dvs-builder@U16-T11-05-2)  Wed Jul 20 13:43:59 UTC 2022
[    1.946579] [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
[    4.143063] NVRM: Open nvidia.ko is only ready for use on Data Center GPUs.
[    4.143070] NVRM: To force use of Open nvidia.ko on other GPUs, see the
[    4.424820] [drm:nv_drm_load [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to allocate NvKmsKapiDevice
[    4.424912] [drm:nv_drm_probe_devices [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to register device
[    4.442684] audit: type=1400 audit(1670014530.519:5): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe" pid=897 comm="apparmor_parser"
[    4.442688] audit: type=1400 audit(1670014530.519:6): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe//kmod" pid=897 comm="apparmor_parser"
[    4.503574] nvidia-uvm: Loaded the UVM driver, major device number 507.
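The lines that seem to matter in this log are the NVRM message about Data Center GPUs and the nvidia-drm allocation failure that follows it. As a rough aid for comparing logs between driver versions, here is a small Python sketch that pulls out the driver version and flags those messages. The function names and labels are my own, and the patterns are just substrings copied from the output above.

```python
import re
from typing import List, Optional

# Label -> substring, all taken from the dmesg output quoted in this post.
PATTERNS = {
    "module is unsigned": "module verification failed",
    "open module reports Data Center GPUs only": (
        "Open nvidia.ko is only ready for use on Data Center GPUs"
    ),
    "nvidia-drm could not allocate its KMS device": (
        "Failed to allocate NvKmsKapiDevice"
    ),
}


def driver_version(dmesg_text: str) -> Optional[str]:
    """Extract the version from the nvidia-modeset banner, if present."""
    match = re.search(r"Driver for x86_64\s+(\d+(?:\.\d+)+)", dmesg_text)
    return match.group(1) if match else None


def flag_messages(dmesg_text: str) -> List[str]:
    """Return the labels whose substring appears anywhere in the log."""
    return [label for label, needle in PATTERNS.items() if needle in dmesg_text]


SAMPLE = """\
[    1.762247] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[    1.862602] nvidia-modeset: Loading NVIDIA UNIX Open Kernel Mode Setting Driver for x86_64  515.65.01  Release Build  (dvs-builder@U16-T11-05-2)  Wed Jul 20 13:43:59 UTC 2022
[    4.143063] NVRM: Open nvidia.ko is only ready for use on Data Center GPUs.
[    4.424820] [drm:nv_drm_load [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to allocate NvKmsKapiDevice
"""

print(driver_version(SAMPLE))
print(flag_messages(SAMPLE))
```

In practice the text would come from `dmesg | grep nvidia` saved to a file. The sketch only reports which known messages are present; it does not diagnose the cause.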

Any help would be appreciated!

kenrick_fernandes · December 4, 2022, 5:17pm

I am attaching the nvidia-bug-report and nvidia installer logs here :
nvidia-installer.log (42.5 KB)
nvidia-bug-report.log (3.9 MB)

kenrick_fernandes · December 7, 2022, 3:17pm

I recently purged the drivers from my system and tried to reinstall them. This is the procedure I followed:

1. Purged everything NVIDIA-related using apt-get
2. Uninstalled the Nouveau drivers
3. Installed nvidia-driver-525 using apt

I rebooted the machine and the GUI seems to be broken on boot.

(Please note that this happened with 520 and 510 non open-kernel versions as well)

SSH is enabled on the machine, so I can log in remotely and use the terminal.
nvidia-smi runs properly; this is the output:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.60.11    Driver Version: 525.60.11    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  Off |
|  0%   39C    P8    23W / 450W |      1MiB / 24564MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Unfortunately, any operation I run on the GPU leaves the machine hanging, whether I run it from base or inside an NGC Docker container (even though torch, tensorflow, etc. can see that CUDA and the GPU are available).
For example, a simple torch.rand(1).to("cuda") runs indefinitely.
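To reproduce the hang without losing the SSH session, the one-liner can be launched in a child process with a timeout. This is a minimal sketch, assuming PyTorch is installed in the same interpreter; run_with_timeout and CUDA_SMOKE_TEST are names I made up for it.

```python
import subprocess
import sys
from typing import List, Optional


def run_with_timeout(cmd: List[str], seconds: float) -> Optional[int]:
    """Run cmd; return its exit code, or None if it did not finish in time."""
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait(timeout=seconds)
    except subprocess.TimeoutExpired:
        # Send SIGKILL but do not wait again: a child that is stuck inside
        # the driver may not exit, and waiting would hang this script too.
        proc.kill()
        return None


# The one-liner from this post, run in a separate interpreter.
CUDA_SMOKE_TEST = [
    sys.executable,
    "-c",
    'import torch; print(torch.rand(1).to("cuda"))',
]
```

Calling run_with_timeout(CUDA_SMOKE_TEST, 60) returns None if the tensor copy never completes. If the child is blocked in the kernel it may linger after the kill, so this only keeps the calling shell responsive; it does not fix the hang.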

I am attaching the nvidia-bug-report and installer logs here :
nvidia-bug-report.log (2.3 MB)
nvidia-installer.log (42.5 KB)

Thank you!
