Hello everyone,
I am trying to install a somewhat older GPU in a new PC (solely to take advantage of the CUDA toolkit). Below are the hardware and software specs of the system on which the attached nvidia-bug-report.log.gz was generated:
- Motherboard Model: Gigabyte B650 GAMING X AX V2
- GPU Model: MSI GeForce RTX 3090 24GB GDDR6X Suprim X (GA102)
- Driver Version: 535.230.02
- Kernel Version: 5.15.0-140-generic
- OS: Ubuntu 22.04 LTS Server
However, although the NVIDIA driver installation seems to proceed as expected, the nvidia-smi command keeps returning “No devices were found”.
Drivers I have tried so far, on both Ubuntu Server 24.04 and 22.04:
- nvidia-driver-570
- nvidia-driver-570-server
- nvidia-driver-570-server-open
- nvidia-driver-560
- nvidia-driver-550
- nvidia-driver-535
My BIOS settings are:
- Updated to the latest version
- Secure Boot: Disabled
- Above 4G Decoding: Enabled
- Re-Size BAR Support: Enabled
I tried the solutions suggested in the following posts (blacklisting the nouveau driver and enabling the option for unsupported GPUs in the open-source kernel modules). With these changes Ubuntu boots normally, but the problem persists:
- SOLVED - RmInitAdapter failed! to load 530.41.03 (or any nvidia modules other than 450.236.01) Linux via ESXi 7.0u3 Passthrough PCI GTX 1650
- Nvidia-smi “No devices were found” - VMWare ESXI Ubuntu Server 20.04.03 with RTX3070
The driver is correctly installed. The card is detected on the PCIe bus (Gen 4). Nevertheless, the system reports:
- nvidia-smi: No devices were found
- nvidia-debugdump: Found 0 NVIDIA devices
- Kernel logs: nvidia-drm fails to allocate the KMS interface
- Device node /dev/nvidia0 is present, indicating partial driver initialization
- GPU appears in lspci and lshw with the correct vendor ID
Verified Working Components
Component | Status |
---|---|
PCIe Detection | Detected at 01:00.0 |
NVIDIA Driver Installed | Version 535.230.02, via DKMS |
Kernel Headers | Match installed kernel |
/dev/nvidia* nodes | Present |
nvidia-persistenced | Running normally |
lshw Driver Bind | Shows driver=nvidia |
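For anyone who wants to repeat the driver-bind check without lshw, here is a minimal Python sketch. It assumes the standard Linux sysfs layout, where each PCI device directory under /sys/bus/pci/devices has a `driver` symlink once a kernel driver is bound. The function name `bound_driver` is my own, not part of any NVIDIA tool.

```python
import os


def bound_driver(pci_addr, sysfs_root="/sys/bus/pci/devices"):
    """Return the name of the kernel driver bound to a PCI device, or None.

    pci_addr is the full domain:bus:device.function address,
    e.g. "0000:01:00.0". sysfs_root is a parameter only so the
    function can be pointed at a different tree if needed.
    """
    link = os.path.join(sysfs_root, pci_addr, "driver")
    # The "driver" entry exists only as a symlink when a driver is bound.
    if not os.path.islink(link):
        return None
    # The symlink target ends in the driver name, e.g. .../drivers/nvidia
    return os.path.basename(os.path.realpath(link))
```

On this system, `bound_driver("0000:01:00.0")` would be expected to return `nvidia`, matching what lshw shows. A bound driver only confirms the bind step and says nothing about whether the adapter initialized.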
Failing Components
Component | Status Detail |
---|---|
nvidia-smi | Reports “No devices were found” |
nvidia-debugdump --list | Reports 0 devices |
KMS Device Init | Fails to allocate NvKmsKapiDevice |
GPU Compute/Display Engines | Not responsive to driver |
Key Logs from /var/log/syslog
- NVRM: kgspExtractVbiosFromRom_TU102: did not find valid rom signature
- NVRM: kgspInitRm_IMPL: failed to extract VBIOS images from ROM: 0x25
- NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x62:0x25:1859)
- NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
Key Logs from dmesg
- [drm:nv_drm_load [nvidia_drm]] ERROR Failed to allocate NvKmsKapiDevice
- [drm:nv_drm_probe_devices [nvidia_drm]] ERROR Failed to register device
- nvidia-debugdump --list → Found 0 NVIDIA devices
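In case it helps with triage, the log lines above can be grouped with a small Python sketch like the one below. The category names and regular expressions are my own assumptions, based only on the messages quoted in this post. They are not an official list of NVRM error strings.

```python
import re
import sys

# Hypothetical categories, derived from the messages quoted in this post.
PATTERNS = {
    "vbios": re.compile(r"NVRM: .*(VBIOS|rom signature)", re.IGNORECASE),
    "rm_init": re.compile(r"NVRM: .*(RmInitAdapter failed|rm_init_adapter failed)"),
    "drm": re.compile(r"\[nvidia_drm\]\] .*ERROR"),
}


def classify(lines):
    """Group log lines by the first-level failure category they match."""
    hits = {name: [] for name in PATTERNS}
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits[name].append(line.rstrip("\n"))
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python3 classify_nvrm.py /var/log/syslog
    with open(sys.argv[1], errors="replace") as handle:
        for name, matched in classify(handle).items():
            print(f"{name}: {len(matched)} line(s)")
```

Grouping the lines this way makes the order of failures easier to see: the VBIOS extraction errors come first, then the RmInitAdapter failure, then the nvidia_drm errors.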
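These logs seem to indicate that the driver binds to the device but fails during adapter initialization, at the point where it cannot find a valid VBIOS signature in the card’s ROM, so the GPU’s internal engines (compute/display) never respond.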
Moreover, I installed Windows 10 on this system and tried to install the recommended NVIDIA driver through the NVIDIA App, but there too the driver could not communicate with the GPU.
I would appreciate any help in solving this issue! My bug report and the output of several diagnostic commands are attached.
nvidia_diagnostics_20250522_103045.log (6.1 KB)
nvidia-bug-report.log.gz (140.8 KB)