Nvidia Driver 418 not loading on Ubuntu 18.04.3

info3p8ha · September 13, 2019, 8:55am

I had previously installed the NVIDIA drivers on my Alienware laptop (GTX 1070) running Ubuntu 18.04. Suddenly they stopped working. I then did a fresh install of Ubuntu and installed the NVIDIA drivers like this:
sudo apt purge 'nvidia-*'
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
sudo apt install nvidia-driver-418

After that I couldn’t log in. I reinstalled again with the same procedure and additionally did the following:

added the ‘nogpumanager’ kernel parameter
created /etc/X11/xorg.conf with this content:

Section "Device"
    Identifier "intel"
    Driver "modesetting"
    BusID "PCI:0:2:0"
EndSection

Still no luck.

Then I did this:
sudo grep nvidia /etc/modprobe.d/* /lib/modprobe.d/*
and removed a line blacklisting nvidia from blacklist-framebuffer.conf.
Still no luck.
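
The blacklist search above can also be scripted. The following is a minimal sketch, not a definitive tool: the helper names `find_nvidia_blacklists` and `scan_modprobe_dirs` are made up for illustration, and it only looks at `blacklist` lines whose module name contains "nvidia", skipping commented-out lines.

```python
import re
from pathlib import Path

# Matches active "blacklist <module>" lines whose module name contains "nvidia".
BLACKLIST_RE = re.compile(r"^\s*blacklist\s+(\S*nvidia\S*)", re.IGNORECASE)


def find_nvidia_blacklists(text):
    """Return (line_number, stripped_line) pairs for active nvidia blacklist lines."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            continue  # commented-out entries have no effect
        if BLACKLIST_RE.match(line):
            hits.append((number, line.strip()))
    return hits


def scan_modprobe_dirs(dirs=("/etc/modprobe.d", "/lib/modprobe.d")):
    """Scan *.conf files in the directories used by the grep command above."""
    results = {}
    for directory in dirs:
        base = Path(directory)
        if not base.is_dir():
            continue  # directory may be absent on some systems
        for path in sorted(base.glob("*.conf")):
            hits = find_nvidia_blacklists(path.read_text(errors="replace"))
            if hits:
                results[str(path)] = hits
    return results
```

Calling `scan_modprobe_dirs()` returns a dictionary mapping each configuration file to its matching lines, which makes it easier to see which file a blacklist entry comes from before editing it.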

I then generated a bug report, which I’m attaching here: nvidia-bug-report.log.gz (Google Drive)

This is how I’m checking whether the driver works:
nvidia-smi
No devices were found

Please let me know what has happened and how I can fix it. Thanks so much.

generix · September 13, 2019, 9:11am

Not looking good:

[   18.476334] ACPI Error: Field [TMPB] at bit offset/length 1572864/32768 exceeds size of target Buffer (262144 bits) (20181213/dsopcode-201)
[   18.476340] 
               Initialized Local Variables for Method [_ROM]:
[   18.476340]   Local0: 000000007c8f9811 <Obj>           Integer 0000000000030000
[   18.476344]   Local1: 00000000a34eed6d <Obj>           Integer 0000000000001000
[   18.476345]   Local2: 000000005e0402a0 <Obj>           Integer 0000000000180000
[   18.476347]   Local3: 000000007da1c11f <Obj>           Integer 0000000000008000
[   18.476349] Initialized Arguments for Method [_ROM]:  (2 arguments defined for method invocation)
[   18.476349]   Arg0:   00000000ba19d376 <Obj>           Integer 0000000000030000
[   18.476351]   Arg1:   00000000c950ad6d <Obj>           Integer 0000000000001000
[   18.476353] ACPI Error: Method parse/execution failed \_SB.PCI0.PEG0.PEGP._ROM, AE_AML_BUFFER_LIMIT (20181213/psparse-531)
[   18.476365] NVRM: GPU 0000:01:00.0: Failed to copy vbios to system memory.
[   18.476497] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x30:0xffff:707)
[   18.476575] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0

This might be defective hardware or a broken BIOS. Since your BIOS is quite old, please start by updating it.
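
The relevant lines can be pulled out of a long kernel log automatically. This is a minimal sketch under stated assumptions: it expects the text in the default dmesg format with a `[seconds.micros]` timestamp prefix, and the function name `extract_gpu_errors` is illustrative. It keeps only lines that start with "ACPI Error:" and NVRM lines that mention a failure.

```python
import re

# Matches the "[   18.476365] " timestamp prefix printed by dmesg.
TIMESTAMP_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s*")


def extract_gpu_errors(dmesg_text):
    """Return (timestamp, message) pairs for ACPI errors and NVRM failure lines."""
    found = []
    for line in dmesg_text.splitlines():
        match = TIMESTAMP_RE.match(line)
        if not match:
            continue  # continuation lines without a timestamp are ignored
        message = line[match.end():]
        is_acpi_error = message.startswith("ACPI Error:")
        is_nvrm_failure = message.startswith("NVRM:") and "failed" in message.lower()
        if is_acpi_error or is_nvrm_failure:
            found.append((float(match.group(1)), message))
    return found
```

Fed with the output of `dmesg` (or the kernel log section of nvidia-bug-report.log), it would return the two ACPI errors and the three NVRM failures quoted above and skip the variable dump in between.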

info3p8ha · September 13, 2019, 11:28am

Hi, thanks for the answer. I’ll try to update the BIOS. Meanwhile, is there any way to check the integrity of the hardware? When you say the hardware could be defective, do you mean the GPU? I suspect that could be the case, since in my previous installation everything was running smoothly and then one day the GPU was no longer detected. Please let me know how I can check the integrity of the hardware.

Thanks so much for your time and insights.

generix · September 13, 2019, 11:58am

In your specific case, this might be a defective system BIOS flash ROM, or just some discharged cells, which a reflash should fix. For such low-level issues there’s no software to check with, since any such software needs the drivers to be loaded, and that doesn’t work in case of low-level failures.

info3p8ha · September 13, 2019, 1:34pm

Thank you, I understand. I did upgrade the BIOS, but still no luck. I followed the same procedure as before. I’m attaching a new log. If it’s a hardware defect, any idea what I can do?

generix · September 13, 2019, 2:28pm

I’m not really sure anymore; it might also just be some BIOS bug that has now surfaced with a newer kernel. Can you check whether downgrading the kernel to e.g. 4.15 and the driver to 390 yields the same result?
Please also run
sudo acpidump > acpidump.txt
and attach the output file.

gruja90 · September 16, 2019, 3:33pm

Hi,

I am writing here because my problem is related to this topic and I didn’t want to open a duplicate thread. It may also be related to: /dev/sdb1 : clean, 640729/122388848… and Keyboard is not working - Linux - NVIDIA Developer Forums, since the message is the same.

I have a 2080 Ti and just installed a fresh Ubuntu 18.04 (kernel version 5.0.0-27-generic). After I installed the NVIDIA driver that I downloaded from the NVIDIA website (NVIDIA-Linux-x86_64-430.50.run), Ubuntu stopped working. After restarting the computer I get a black screen with the message “/dev/sdb2: clean … files …” and never reach the login screen.

I also tried to follow this tutorial for installing the NVIDIA drivers and CUDA: Install TensorFlow with pip (使用 pip 安装 TensorFlow). But after installing the NVIDIA drivers (418 in this case) and rebooting, Ubuntu gets stuck with the same message as above.

It looks like there is some incompatibility between the newest NVIDIA drivers and a fresh Ubuntu? In any case I cannot find a solution for this problem other than removing the NVIDIA drivers, but what then?

@info3p8ha Did downgrading the kernel to 4.15 help?

Thanks

info3p8ha · September 16, 2019, 4:34pm

@generix,
Thanks for your suggestion and your efforts to help. I actually didn’t try your suggestion to downgrade the kernel and install driver 390, as I needed my machine for other work. So I reinstalled plain Ubuntu and ran acpidump. I’m not sure whether it’s useful for diagnosing anything, but I’m sharing it anyway.

@gruja90,
I can understand the frustration… Did you try this?
From ubuntu 18.04+headless_390+intel iGPU after prime-select intel lost contact to GeFORCE 1050ti - Linux - NVIDIA Developer Forums:

sudo prime-select nvidia
add the ‘nogpumanager’ kernel parameter
create /etc/X11/xorg.conf:

Section "Device"
    Identifier "intel"
    Driver "modesetting"
    BusID "PCI:0:2:0"
EndSection

reboot

@generix,
In a fresh installation of Ubuntu, when I run nvidia-detector, I get
none
But lspci shows
01:00.0 VGA compatible controller: NVIDIA Corporation GP104M [GeForce GTX 1070 Mobile] (rev a1)
01:00.1 Audio device: NVIDIA Corporation GP104 High Definition Audio Controller (rev a1)

Also, when I ran the hardware diagnostics during BIOS startup, it said the “Video Card” is fine. You mentioned there is no way to check whether the hardware is intact, but the hardware is my main worry now. So I’m asking once more: is there any way I can check whether the hardware is intact? It’s not even one year since I purchased this laptop, so it is still under warranty. Any pointers would be helpful.
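
lspci lists a device once it is enumerated on the PCI bus, whether or not a driver has initialized it, which is why the GPU can appear there while nvidia-smi finds nothing. Below is a minimal sketch for picking the NVIDIA entries out of that output. It assumes the default `lspci` line format of `<bus address> <class>: <description>`, and the function name `nvidia_devices` is illustrative.

```python
def nvidia_devices(lspci_text):
    """Return (bus_address, device_class, description) for each NVIDIA line."""
    devices = []
    for line in lspci_text.splitlines():
        # Default lspci format: "<bus address> <class>: <vendor and device>"
        address, _, rest = line.partition(" ")
        device_class, separator, description = rest.partition(": ")
        if separator and "NVIDIA" in description:
            devices.append((address, device_class, description))
    return devices
```

For the two lines quoted above it would report the bus addresses 01:00.0 (VGA compatible controller) and 01:00.1 (Audio device). A non-empty result only shows that the device answers on the bus, not that it is healthy.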

generix · September 16, 2019, 4:46pm

The easiest method is to install Windows to rule out a kernel/driver bug. For an RMA, the manufacturer will probably request that anyway.

info3p8ha · September 17, 2019, 1:10am

Thanks generix. I did try with Ubuntu 16.04 LTS. I did this:
sudo apt purge 'nvidia-*'
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
sudo apt install nvidia-384

After that, nvidia-smi is showing ACPI Error: Field [TMPB] at bit offset/length 1572864/32768 exceeds size of target Buffer…

Can we now conclude that it is a hardware issue?

generix · September 17, 2019, 8:29am

Looking at the related ACPI code, it only expects values of Arg0 that are either <0x30000 or >0x30000. In your case it is called with Arg0=0x30000, so it doesn’t work. The question remains where the value 0x30000 comes from. Either the GPU is broken or the mainboard has to be reset, like here: MX150 graphics clock suddenly stuck at 427Mhz (with drivers 390.87 and 430.26) - Linux - NVIDIA Developer Forums
A kernel/driver bug can now be safely ruled out, since the test with the 16.04 setup shows the same issue.
Ultimately, you will have to install Windows to test, since Alienware doesn’t support Linux and so will not open a support case for it.
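
The numbers in the ACPI error from the first log can be cross-checked against the logged arguments. The sketch below only does that arithmetic. It assumes, as an inference from the logged values rather than from the actual ACPI source, that Local2 and Local3 are Arg0 and Arg1 converted from bytes to bits; the constant and function names are illustrative.

```python
# Values copied from the kernel log earlier in the thread.
ARG0 = 0x30000               # Arg0 passed to _ROM
ARG1 = 0x1000                # Arg1 passed to _ROM
LOCAL2 = 0x180000            # Local2 inside _ROM
LOCAL3 = 0x8000              # Local3 inside _ROM
FIELD_OFFSET_BITS = 1572864  # "bit offset" in the ACPI error
FIELD_LENGTH_BITS = 32768    # "length" in the ACPI error
BUFFER_BITS = 262144         # "size of target Buffer" in the ACPI error


def field_fits(offset_bits, length_bits, buffer_bits):
    """A field fits only if it ends at or before the end of the buffer."""
    return offset_bits + length_bits <= buffer_bits


# Assumption: the locals are the arguments scaled from bytes to bits.
offset_matches = ARG0 * 8 == LOCAL2 == FIELD_OFFSET_BITS
length_matches = ARG1 * 8 == LOCAL3 == FIELD_LENGTH_BITS
fits = field_fits(FIELD_OFFSET_BITS, FIELD_LENGTH_BITS, BUFFER_BITS)
```

Under that assumption both `offset_matches` and `length_matches` hold, and `fits` is false: the requested field starts at bit 1572864, far beyond the end of a 262144-bit (32768-byte) buffer, which is consistent with the AE_AML_BUFFER_LIMIT error.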

gruja90 · September 18, 2019, 7:59am

Hi,

After 1.5 working days spent researching how to fix the problem with Ubuntu and the NVIDIA drivers, and after trying everything I found without success, I finally found a solution. I installed Ubuntu 18.04 LTS instead of Ubuntu 18.04.3 LTS, and now everything is working fine. It looks like this newest version of Ubuntu has some problem with the NVIDIA drivers. I think the reason is that 18.04.3 LTS comes with some kind of open source NVIDIA drivers and the earlier 18.04 LTS does not, so in 18.04 LTS there is nothing to confuse the computer about which driver to use. Or it could be something in the kernel, as I see 18.04.3 LTS uses kernel 5.0.0-27 while 18.04 LTS uses kernel 4.15.0.

I followed several tutorials on how to disable these open source drivers, at these links:

https://medium.com/@antonioszeto/how-to-install-nvidia-driver-on-ubuntu-18-04-7b464bab43e6
Install Tensorflow-GPU to use Nvidia GPU using anaconda on Ubuntu 18.04 / 19.04 do AI! | by Y.C Cheng / 鄭原真 | DataDrivenInvestor
How To Install Nvidia Drivers and CUDA-10.0 for RTX 2080 Ti GPU on Ubuntu-16.04/18.04 | by Achintha Ihalage | Better Programming
NVIDIA GPU, Optimus Prime and Ubuntu 18.04 Woes | by Amitosh Swain Mahapatra | Medium
but nothing helped. I also tried downgrading only the kernel, and that didn’t help either. After a lot of different ideas, in the end I had to go to the Ubuntu archive and get the older version.

Thanks @generix and @info3p8ha anyway!
