Fresh laptop, fresh install, ‘nvidia-smi’ gives me ‘no devices were found’.
Please help me out.
Ubuntu 19.04 (pop os)
vendor : NVIDIA Corporation
model : GP107M [GeForce GTX 1050 Mobile]
nvidia-bug-report.log.gz (586 KB)
You’re running into this:
Jul 10 20:24:31 pop-os kernel: NVRM: RmInitAdapter failed! (0x31:0xffff:834)
Jul 10 20:24:31 pop-os kernel: NVRM: rm_init_adapter failed for device bearing minor number 0
Might be defective hardware or a BIOS incompatibility. Check for a BIOS update, and check whether it works in Windows. If both fail, RMA.
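For anyone comparing their own logs against the ones in this thread, here is a minimal sketch that pulls the `RmInitAdapter failed!` lines out of a kernel log and returns the three values in parentheses. The function name `find_rm_init_failures` is made up for illustration, and the values are treated as opaque: what each field means is not explained anywhere in this thread, so the code only extracts them for comparison.

```python
import re

# Matches both formats seen in this thread:
#   NVRM: RmInitAdapter failed! (0x31:0xffff:834)
#   NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0xffff:1226)
RM_INIT_RE = re.compile(
    r"NVRM: (?:GPU (?P<pci>[0-9a-fA-F:.]+): )?"
    r"RmInitAdapter failed! "
    r"\((?P<a>0x[0-9a-fA-F]+):(?P<b>0x[0-9a-fA-F]+):(?P<c>\d+)\)"
)


def find_rm_init_failures(log_text):
    """Return one dict per RmInitAdapter failure line found in log_text."""
    failures = []
    for line in log_text.splitlines():
        match = RM_INIT_RE.search(line)
        if match:
            failures.append({
                # None when the driver did not print a PCI address
                "pci": match.group("pci"),
                # Opaque values, kept only so runs can be compared
                "codes": (match.group("a"), match.group("b"), int(match.group("c"))),
            })
    return failures


sample = """\
Jul 10 20:24:31 pop-os kernel: NVRM: RmInitAdapter failed! (0x31:0xffff:834)
Jul 10 20:24:31 pop-os kernel: NVRM: rm_init_adapter failed for device bearing minor number 0
"""

for failure in find_rm_init_failures(sample):
    print(failure)
```

In practice the input would be the text of `dmesg` or `journalctl -k`, saved to a file or captured however you prefer.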
Same issue here.
Ubuntu 19.10
GeForce 930M
I saw a few errors while accessing the hard drive, which persisted after switching drives, so I concluded the SATA port is at fault. I mounted my SSD in the DVD bay instead.
Now I am not able to use my GPU.
I can upload the full output of nvidia-bug-report.sh, but here are a few extracts I found relevant:
*** /var/log/Xorg.0.log.old
...
[ 34.421] (II) Loading sub module "glxserver_nvidia"
[ 34.421] (II) LoadModule: "glxserver_nvidia"
[ 34.421] (II) Loading /usr/lib/x86_64-linux-gnu/nvidia/xorg/libglxserver_nvidia.so
[ 34.437] (II) Module glxserver_nvidia: vendor="NVIDIA Corporation"
[ 34.437] compiled for 1.6.99.901, module version = 1.0.0
[ 34.437] Module class: X.Org Server Extension
[ 34.437] (II) NVIDIA GLX Module 435.21 Sun Aug 25 08:14:27 CDT 2019
[ 34.437] (II) NVIDIA: The X server supports PRIME Render Offload.
[ 34.695] (EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA GPU at PCI:1:0:0. Please
[ 34.695] (EE) NVIDIA(GPU-0): check your system's kernel log for additional error
[ 34.695] (EE) NVIDIA(GPU-0): messages and refer to Chapter 8: Common Problems in the
[ 34.695] (EE) NVIDIA(GPU-0): README for additional information.
[ 34.695] (EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA graphics device!
[ 34.695] (EE) NVIDIA(0): Failing initialization of X screen
[ 34.695] (II) UnloadModule: "nvidia"
[ 34.695] (II) UnloadSubModule: "glxserver_nvidia"
[ 34.695] (II) Unloading glxserver_nvidia
[ 34.695] (II) UnloadSubModule: "wfb"
[ 34.695] (II) UnloadSubModule: "fb"
/var/log/kern.log:
Mar 22 15:56:17 deshant-X556UF kernel: [ 250.856221] NVRM: Failed to enable MSI; falling back to PCIe virtual-wire interrupts.
Mar 22 15:56:25 deshant-X556UF kernel: [ 258.932767] NVRM: RmInitAdapter failed! (0x26:0x65:1106)
Mar 22 15:56:25 deshant-X556UF kernel: [ 258.932823] NVRM: rm_init_adapter failed for device bearing minor number 0
Mar 22 17:39:07 deshant-X556UF kernel: [ 8.311663] nvidia-nvlink: Nvlink Core is being initialized, major device number 236
Mar 22 17:39:07 deshant-X556UF kernel: [ 8.587065] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 418.56 Fri Mar 15 12:59:26 CDT 2019
Mar 22 17:39:07 deshant-X556UF kernel: [ 8.670630] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 418.56 Fri Mar 15 12:32:40 CDT 2019
Mar 22 17:39:07 deshant-X556UF kernel: [ 8.784108] [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
Mar 22 17:39:07 deshant-X556UF kernel: [ 8.784111] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:01:00.0 on minor 1
journalctl -b -1:
Mar 25 17:13:10 deshant-X556UF kernel: NVRM: GPU 0000:01:00.0: Failed to enable MSI; falling back to PCIe virtual-wire interrupts.
Mar 25 17:13:10 deshant-X556UF kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0xffff:1226)
Mar 25 17:13:10 deshant-X556UF kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
Mar 25 17:13:10 deshant-X556UF systemd-udevd[3446]: Process '/usr/bin/nvidia-smi' failed with exit code 6.
In my GRUB I am setting flag: pci=nomsi
otherwise I am not able to boot properly: GRUB_CMDLINE_LINUX_DEFAULT="quiet splash pci=nomsi"
Thanks in advance.
If you have to set pci=nomsi to boot at all, something’s really broken, either bios or hardware. Modern systems won’t properly work without msi. See if a bios update is available for your notebook. If not, please try to get a complete dmesg of a boot without pci=nomsi set. Maybe the broken component can be identified and disabled then.
“Unable to boot properly” means that once the Ubuntu splash goes, I see a LOT of errors scrolling down in matrix effect, if nomsi is missing. And it stays that way. Those are the same/similar errors I posted in kern.log snippet.
A BIOS update is probably available, but I currently don’t have a Windows installation. I will attempt a boot without the nomsi flag and log the errors. I did try that once, though, just before running the bug report script (snippets in my previous reply).
Since it’s a complete lockdown here I am not willing to experiment with the bios of the only computer available to me currently. However can try experiments with OS.
To add to my previous post, my laptop has a Hard drive controller daughter board with one sata and one mini sata port. This board connects to the motherboard. The incident with the sata port had me switch to DVD Bay for mounting my SSD.
[Nvidia drivers not installed.]
[Update: Interestingly, sometime in the past (probably around the same time I switched SATA ports) I had a few kernel modules disabled (don’t remember why) using /etc/modprobe.d/blacklist.conf: vga16fb, nouveau, rivafb, nvidiafb, rivatv ]
Experiment 1: replaced nomsi with noaer, so the errors are ignored without disabling MSI. Boot succeeded. Login succeeded. Checked dmesg and, as expected, nothing jumped out.
Experiment 2: removed the pci flag from GRUB altogether. Boot succeeded. Login failed: it got stuck after I entered the login password. Had to force a shutdown (long press on the power button).
dmesg.0.log (2.0 MB)
Rebooted using the pci=noaer flag. Above is the dmesg of the last boot [dmesg.0.log].
Seems like I can manage with the pci=noaer flag instead of nomsi.
@generix Thanks again for your time, will be waiting for your inputs.
By setting pci=nomsi you’ve fallen for a fallacy. It doesn’t fix anything; it just hides the errors, since it includes pci=noaer. The same goes for pci=nommconf. Don’t use them, only pci=noaer.
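As a rough illustration of that change, the sketch below rewrites a `GRUB_CMDLINE_LINUX_DEFAULT` line so that `pci=nomsi` becomes `pci=noaer`. The helper name `replace_pci_option` is hypothetical, and it only handles the simple double-quoted form shown earlier in the thread. It works on a string and does not touch `/etc/default/grub` itself; on Ubuntu the edited file still has to be applied with `sudo update-grub` and a reboot.

```python
import re


def replace_pci_option(grub_line, old="nomsi", new="noaer"):
    """Swap the pci=<old> option for pci=<new> in a GRUB_CMDLINE_LINUX_DEFAULT line.

    Lines that are not a double-quoted GRUB_CMDLINE_LINUX_DEFAULT assignment
    are returned unchanged.
    """
    match = re.match(r'^(GRUB_CMDLINE_LINUX_DEFAULT=)"(.*)"\s*$', grub_line)
    if not match:
        return grub_line

    def swap(arg):
        if not arg.startswith("pci="):
            return arg
        # pci= accepts a comma-separated list, e.g. pci=nomsi,nommconf
        opts = [new if opt == old else opt for opt in arg[len("pci="):].split(",")]
        # Drop duplicates while keeping order (e.g. pci=nomsi,noaer -> pci=noaer)
        opts = list(dict.fromkeys(opts))
        return "pci=" + ",".join(opts)

    joined = " ".join(swap(arg) for arg in match.group(2).split())
    return f'{match.group(1)}"{joined}"'


before = 'GRUB_CMDLINE_LINUX_DEFAULT="quiet splash pci=nomsi"'
print(replace_pci_option(before))
```

After rebooting, `cat /proc/cmdline` shows which options the running kernel actually received.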
Looking at the error, it looks like those errors come from a (partially) broken wifi-adapter (RTL8723BE). Does that work at all? Can you disable it for testing and check if the system comes up correctly without pci=noaer?
Searching for that RTL8723BE told me that it is broken by design and always generates that error flood. No way around it but setting pci=noaer.
So since that is settled, please install the nvidia drivers so I can have a look at that.
Yes, WiFi works well. Since you are saying the device is flooding the logs with errors, I have added the module to the blacklist so it doesn’t load on startup: /etc/modprobe.d/blacklist.conf now contains blacklist rtl8723be at the end.
GRUB now only contains pci=noaer. Installed nvidia drivers v390 (felt I should go with something stable instead of the latest).
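To double-check what a blacklist file really contains, a small parser like the one below can list the `blacklist` entries. This is only a sketch: `blacklisted_modules` is a made-up helper name, the sample text just reuses module names mentioned in this thread, and it reads a string rather than the real file. Keep in mind that a `blacklist` line stops a module from being auto-loaded through its aliases; it does not stop an explicit `modprobe <name>`.

```python
def blacklisted_modules(conf_text):
    """Return module names from 'blacklist <module>' lines, in file order."""
    modules = []
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) == 2 and parts[0] == "blacklist":
            modules.append(parts[1])
    return modules


# Illustrative content only, built from module names mentioned in the thread
sample_conf = """\
# older entries
blacklist vga16fb
blacklist nouveau

# added for the AER error flood
blacklist rtl8723be
"""

print(blacklisted_modules(sample_conf))
```

Running `lsmod | grep rtl8723be` after a reboot would confirm whether the module stayed unloaded.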
nvidia-smi says NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
Tried booting with prime profile selected as nvidia. After Ubuntu splash, did not proceed to login screen. Tried gaining alternate tty using ctrl+alt+f2/f3/f4 but it forces back to black screen with following line and blinking underscore. Only option left was to force shutdown. Had to get root shell from recovery mode and switch to intel profile.
The lines on black screen say:
/dev/sda5: clean, 619784/2457600 files, 7058613/9828096 blocks
_
My guess: the lines above are unrelated, while in the background the nvidia driver is trying to start up and render something but fails, hence nothing works. While I was getting the root shell, odd log messages were written over the interface, such as the nvidia persistence daemon being stopped.
Attaching bug report output after successful boot using intel profile.
nvidia-bug-report.log (1.0 MB)
Looks like it’s cleanly booting now, yet:
kernel: NVRM: RmInitAdapter failed! (0x26:0x65:1123)
Most likely, the GPU is simply broken.
Ah, was afraid of the same. Sometime in the future will try to check again after installing Windows and updating BIOS. Thanks a lot though, it was a nice learning experience.