Hello,
I tried to install CUDA 10 on Ubuntu 18, and now the GPU no longer works:
$nvidia-smi
No devices were found
However, the card itself is detected:
$hwinfo --gfxcard --short
graphics card:
nVidia VGA compatible controller
Primary display adapter: #17
and
$ls -l /dev/nvidia*
crw-rw-rw- 1 root root 195, 0 Jun 10 17:01 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Jun 10 17:01 /dev/nvidiactl
crw-rw-rw- 1 root root 195, 254 Jun 10 17:01 /dev/nvidia-modeset
crw-rw-rw- 1 root root 511, 0 Jun 10 17:01 /dev/nvidia-uvm
crw-rw-rw- 1 root root 511, 1 Jun 10 17:01 /dev/nvidia-uvm-tools
/dev/nvidia-caps:
total 0
cr-------- 1 root root 237, 1 Jun 10 17:05 nvidia-cap1
cr--r--r-- 1 root root 237, 2 Jun 10 17:05 nvidia-cap2
$lspci | grep -i nvidia
01:00.0 VGA compatible controller: NVIDIA Corporation Device 2486 (rev a1)
01:00.1 Audio device: NVIDIA Corporation Device 228b (rev a1)
$lsmod | grep nvidia
nvidia_uvm 995328 0
nvidia_drm 53248 0
nvidia_modeset 1179648 1 nvidia_drm
nvidia 34877440 6 nvidia_uvm,nvidia_modeset
drm_kms_helper 172032 1 nvidia_drm
drm 401408 3 drm_kms_helper,nvidia_drm
nvidiafb 49152 0
vgastate 24576 1 nvidiafb
fb_ddc 16384 1 nvidiafb
i2c_algo_bit 16384 3 nvidiafb,igb,pcan
$modinfo nvidia
filename: /lib/modules/4.15.0-144-generic/updates/dkms/nvidia.ko
alias: char-major-195-*
version: 465.27
supported: external
license: NVIDIA
firmware: nvidia/465.27/gsp.bin
srcversion: 39917A290FD064ADAD7F741
alias: pci:v000010DEd*sv*sd*bc03sc02i00*
alias: pci:v000010DEd*sv*sd*bc03sc00i00*
depends:
retpoline: Y
name: nvidia
vermagic: 4.15.0-144-generic SMP mod_unload modversions
signat: PKCS#7
signer:
sig_key:
sig_hashalgo: md4
parm: NvSwitchRegDwords:NvSwitch regkey (charp)
parm: NvSwitchBlacklist:NvSwitchBlacklist=uuid[,uuid...] (charp)
parm: nv_cap_enable_devfs:Enable (1) or disable (0) nv-caps devfs support. Default: 1 (int)
parm: NVreg_ResmanDebugLevel:int
parm: NVreg_RmLogonRC:int
parm: NVreg_ModifyDeviceFiles:int
parm: NVreg_DeviceFileUID:int
parm: NVreg_DeviceFileGID:int
parm: NVreg_DeviceFileMode:int
parm: NVreg_InitializeSystemMemoryAllocations:int
parm: NVreg_UsePageAttributeTable:int
parm: NVreg_RegisterForACPIEvents:int
parm: NVreg_EnablePCIeGen3:int
parm: NVreg_EnableMSI:int
parm: NVreg_TCEBypassMode:int
parm: NVreg_EnableStreamMemOPs:int
parm: NVreg_RestrictProfilingToAdminUsers:int
parm: NVreg_PreserveVideoMemoryAllocations:int
parm: NVreg_EnableS0ixPowerManagement:int
parm: NVreg_S0ixPowerManagementVideoMemoryThreshold:int
parm: NVreg_DynamicPowerManagement:int
parm: NVreg_DynamicPowerManagementVideoMemoryThreshold:int
parm: NVreg_EnableGpuFirmware:int
parm: NVreg_EnableUserNUMAManagement:int
parm: NVreg_MemoryPoolSize:int
parm: NVreg_KMallocHeapMaxSize:int
parm: NVreg_VMallocHeapMaxSize:int
parm: NVreg_IgnoreMMIOCheck:int
parm: NVreg_NvLinkDisable:int
parm: NVreg_EnablePCIERelaxedOrderingMode:int
parm: NVreg_RegisterPCIDriver:int
parm: NVreg_RegistryDwords:charp
parm: NVreg_RegistryDwordsPerDevice:charp
parm: NVreg_RmMsg:charp
parm: NVreg_GpuBlacklist:charp
parm: NVreg_TemporaryFilePath:charp
parm: NVreg_ExcludedGpus:charp
parm: rm_firmware_active:charp
Also, sudo nvidia-smi gives the same output.
I already purged everything with:
sudo apt-get purge nvidia-*
sudo apt-get purge cuda-*
and then checked for the recommended driver:
ubuntu-drivers devices
== /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0 ==
modalias : pci:v000010DEd00002486sv00001462sd0000C972bc03sc00i00
vendor : NVIDIA Corporation
manual_install: True
driver : nvidia-driver-465 - third-party free recommended
driver : nvidia-driver-460 - third-party free
driver : nvidia-driver-460-server - distro non-free
driver : xserver-xorg-video-nouveau - distro free builtin
and installed nvidia-driver-465 with sudo apt-get install nvidia-driver-465. After rebooting, it still does not work.
I noticed an issue here:
dmesg | grep NVRM
[ 14.844375] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 465.27 Thu Apr 22 23:21:03 UTC 2021
[ 16.317381] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0xffff:1242)
[ 16.317475] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
[ 16.365837] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x24:0xffff:1200)
[ 16.365885] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
[ 16.923661] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x24:0xffff:1200)
[ 16.923695] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
[ 16.950698] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x24:0xffff:1200)
[ 16.950741] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
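To see at a glance which failure codes occur and how often, a small helper along these lines could be used. This is only a sketch: the function name nvrm_failures is made up here, and it assumes the kernel log lines have the same "RmInitAdapter failed! (...)" format as in the output above.

```shell
# Hypothetical helper (name is not part of any NVIDIA tool):
# reads kernel log text on stdin and prints each distinct
# RmInitAdapter failure code followed by its occurrence count.
nvrm_failures() {
    # Keep only the text inside the parentheses after "RmInitAdapter failed!"
    sed -n 's/.*RmInitAdapter failed! (\([^)]*\)).*/\1/p' |
        sort |
        uniq -c |
        # uniq -c prints "count code"; swap to "code count"
        awk '{print $2, $1}' |
        sort
}

# Example on a live system (dmesg may require sudo on some setups):
#   dmesg | nvrm_failures
```

For the log above this would list 0x26:0xffff:1242 once and 0x24:0xffff:1200 three times, i.e. the first initialization attempt fails with a different code than the retries.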
What can I do here?