Can I manually install kernel 5.8?
How do I avoid creating an xorg.conf file?
You can also revert everything you did:
- uninstall the runfile driver
- delete /etc/X11/xorg.conf
- remove all kernels you manually installed
and then update Ubuntu. You should end up at kernel 5.8 automatically.
It worked perfectly fine, but after 2 or 3 reboots the problem has returned and I am unable to boot again. If I boot with quiet splash removed from the kernel parameters and gfxmode set to text, it displays "failed to start load/save screen backlight brightness of acpi_video0" and then continues. After all the messages, it gets stuck on a black screen with a non-blinking underscore in the top-left corner (Ctrl+Alt+F1 does nothing). Should I share my /var/log files and nvidia-bug-report.log.gz?
Also, thanks for the fast reply.
Try:
sudo apt install --install-recommends linux-generic-hwe-20.04
Hello,
I have the same problem:
"NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running."
I use Ubuntu 20.04 with a GeForce RTX™ 3060 Laptop GPU.
$ uname -a
Linux xxx 5.8.0-44-generic #50~20.04.1-Ubuntu SMP Wed Feb 10 21:07:30 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
I installed nvidia-driver-460.
$ sudo add-apt-repository ppa:graphics-drivers/ppa
$ ubuntu-drivers devices
$ sudo apt install nvidia-driver-460
After reboot,
$ nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
Here is nvidia-bug-report.log.gz (194.7 KB).
Thank you in advance.
Please set the kernel parameter
pci=realloc
I edited GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash pci=realloc"
and then ran:
sudo update-grub
sudo reboot
After that, nvidia-smi worked fine.
Thank you very much!!
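For anyone who prefers to script the edit described above, here is a minimal sketch. The helper name add_kernel_param is hypothetical (it is not part of GRUB), and it assumes the file contains a single double-quoted GRUB_CMDLINE_LINUX_DEFAULT line and that the parameter contains no characters special to sed, which holds for pci=realloc.

```shell
#!/bin/sh
# add_kernel_param FILE PARAM
# Hypothetical helper: appends PARAM to GRUB_CMDLINE_LINUX_DEFAULT in FILE
# unless it is already there. Assumes one double-quoted assignment line.
add_kernel_param() {
    file=$1
    param=$2

    # Read the current value between the double quotes.
    current=$(sed -n 's/^GRUB_CMDLINE_LINUX_DEFAULT="\(.*\)"$/\1/p' "$file")

    # Do nothing if the parameter is already present as a separate word.
    case " $current " in
        *" $param "*) return 0 ;;
    esac

    # Keep a backup, then rewrite only the matching line.
    cp "$file" "$file.bak"
    sed "s|^GRUB_CMDLINE_LINUX_DEFAULT=\"\(.*\)\"\$|GRUB_CMDLINE_LINUX_DEFAULT=\"\1 $param\"|" \
        "$file.bak" > "$file"
}
```

It is probably safest to try this on a copy of /etc/default/grub first. On the real file it needs root, and sudo update-grub plus a reboot are still required afterwards, as in the post above.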
I am atggreg012. It worked perfectly fine, but after 2 or 3 reboots the problem has returned and I am unable to boot again. If I boot with quiet splash removed from the kernel parameters and gfxmode set to text, it displays "failed to start load/save screen backlight brightness of acpi_video0" and then continues. After all the messages, it gets stuck on a black screen with a non-blinking underscore in the top-left corner (Ctrl+Alt+F1 does nothing). Should I share my /var/log files and nvidia-bug-report.log.gz?
Also, thanks for the fast reply.
Please open a new thread so that you can keep posting, and attach a new nvidia-bug-report.log.
I did the following as you said:
- uninstall the runfile driver
- delete /etc/X11/xorg.conf
- remove all kernels you manually installed
and then update Ubuntu. You should end up at kernel 5.8 automatically.
It worked perfectly fine, but after 2 or 3 reboots I am unable to boot again. If I boot with quiet splash removed from the kernel parameters and gfxmode set to text, it displays "failed to start load/save screen backlight brightness of acpi_video0" and then continues. After all the messages, it gets stuck on a black screen with a non-blinking underscore in the top-left corner (Ctrl+Alt+F1 does nothing).
nvidia-bug-report.log.gz (265.9 KB)
On this boot you booted kernel 5.11 with the "nomodeset" kernel parameter set (possibly because of recovery mode). Before that, you booted into the 5.4 kernel (which obviously doesn't work). Please make sure you boot into the 5.11 kernel and don't have "nomodeset" set.
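A quick way to confirm both points after booting is to look at the running kernel release and the kernel command line. The sketch below is only an illustration: has_kernel_param and report_boot are hypothetical helper names, and it assumes /proc/cmdline is readable, as it normally is on Linux.

```shell
#!/bin/sh
# has_kernel_param CMDLINE PARAM
# Hypothetical helper: succeeds if PARAM appears as a separate word
# in the given kernel command line string.
has_kernel_param() {
    cmdline=$1
    param=$2
    case " $cmdline " in
        *" $param "*) return 0 ;;
        *) return 1 ;;
    esac
}

# report_boot
# Prints the running kernel release and whether nomodeset was passed
# to the current boot (read from /proc/cmdline).
report_boot() {
    echo "kernel: $(uname -r)"
    if has_kernel_param "$(cat /proc/cmdline)" nomodeset; then
        echo "nomodeset is set"
    else
        echo "nomodeset is not set"
    fi
}
```

Running report_boot after a normal (non-recovery) boot should show a 5.11 release and report that nomodeset is not set, if the advice above was followed.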
I was in recovery mode only to create nvidia-bug-report.log.gz. Otherwise I do not have the nomodeset kernel parameter set. On a normal boot, the Ubuntu spinning circle appears for a moment and then the screen freezes on the Ubuntu and ASUS logos.
My issue is that kernels 5.11 and 5.8 no longer boot (both were working perfectly until 3 reboots ago). The other kernels I have are 5.4 and 5.7, which boot fine but have problems with nvidia-smi and nvidia-settings.
After this post I probably won't be able to reply due to the post limit. How should I reply? Can I somehow continue from here if I create a new thread?
Here is the nvidia-bug-report.log file generated under kernel 5.4 in normal mode:
nvidia-bug-report.log.gz (132.8 KB)
Please delete /etc/X11/xorg.conf and boot into kernel 5.11.
I have the same issue that was originally reported in this thread: "NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running."
Ubuntu release: 20.10
Kernel: 5.8.0-44-generic
Nvidia driver installed: 460
Computer: Asus ROG Zephyrus with Nvidia 1660 Ti
The problem is that everything worked perfectly and I could switch between the AMD GPU and the Nvidia GPU. Suddenly it stopped working, and now I can't see the Nvidia card when listing PCI devices with lspci.
- I have tried booting the computer with Ubuntu 20.04 and that works as normal. The Nvidia card is listed in lspci.
- I have tried to set pci=realloc
It seems to me that the system somehow fails to detect the Nvidia card. Any suggestions, or is a full reinstall of Linux the only remedy? By the way, I am pretty new to Linux and I'm trying to learn, so please go easy on me.
Please check for a udev rule that removes the nvidia card:
grep 10de /lib/udev/rules.d/*
and remove it.
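To narrow the grep output down to rules that actually remove the device, something like the following sketch may help. 10de is NVIDIA's PCI vendor ID. The helper name find_gpu_remove_rules is hypothetical, and the sketch assumes the offending rule works by setting ATTR{remove}, which may not match every system.

```shell
#!/bin/sh
# find_gpu_remove_rules [DIR]
# Hypothetical helper: prints the udev rule files in DIR (default
# /lib/udev/rules.d) that mention vendor ID 10de and also set ATTR{remove}.
# Assumption: the rule that hides the GPU uses ATTR{remove}.
find_gpu_remove_rules() {
    dir=${1:-/lib/udev/rules.d}
    for f in "$dir"/*.rules; do
        # Skip the unexpanded pattern when the directory has no .rules files.
        [ -f "$f" ] || continue
        if grep -q '10de' "$f" && grep -qF 'ATTR{remove}' "$f"; then
            echo "$f"
        fi
    done
}
```

Calling find_gpu_remove_rules with no argument checks /lib/udev/rules.d. Review any file it prints before deleting it.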
I found a .rules file for nvidia that referred to 10de. I removed that file and rebooted. The rule file is still gone after the reboot, but the problem persists, I'm afraid.
You might have to update the initrd:
sudo update-initramfs -u
and reboot.
If that still doesn't help, please run nvidia-bug-report.sh as root and attach the resulting nvidia-bug-report.log.gz file to your post.
Hi,
That didn't seem to do the trick, I'm afraid. I have uploaded the full bug report.
nvidia-bug-report.log.gz (113.9 KB)
Thanks!
Update:
This is the reply to the request below (sorry, it seems I'm only allowed 3 responses):
Output of dpkg -l |grep ubuntu-drivers-common:
ii ubuntu-drivers-common 1:0.8.6.3~0.20.10.1 amd64 Detect and install additional Ubuntu driver packages
Update:
Output of dpkg -l |grep nvidia-prime:
ii nvidia-prime 0.8.16~0.20.10.1 all Tools to enable NVIDIA's Prime
So the version should be high enough for that fix it seems. I did an update of initrd as well just in case, but the problem persists unfortunately.
The udev rules to remove the nvidia gpu get recreated by Ubuntu's gpu-manager:
[ 6.747140] pci 0000:01:00.2: Removing from iommu group 8
[ 6.747919] pci 0000:01:00.0: Removing from iommu group 8
[ 6.748202] pci 0000:01:00.3: Removing from iommu group 8
[ 6.748629] pci 0000:01:00.1: Removing from iommu group 8
This was a bug in gpu-manager that should have been fixed. Most likely you just need to update your system to get the fixed version. Please post the output of
dpkg -l |grep ubuntu-drivers-common
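To check whether the GPU is still being removed at boot, the kernel log can be filtered for the same "Removing from iommu group" messages quoted above. This is a small sketch: removed_pci_devices is a hypothetical helper name, and it assumes the log lines have the same shape as the ones in this post.

```shell
#!/bin/sh
# removed_pci_devices
# Hypothetical helper: reads kernel log text on stdin and prints the PCI
# address from every "Removing from iommu group" line, one per line.
removed_pci_devices() {
    sed -n 's/.*pci \([0-9a-f:.]*\): Removing from iommu group.*/\1/p'
}
```

For example, sudo dmesg | removed_pci_devices prints nothing if no device was removed. If it lists addresses such as 0000:01:00.0, the removal rule is still active.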
I've checked the package details, and the pm rules belong to the package nvidia-prime. It was fixed in version 8.15.3, and the current version for 20.10 should be 8.16. Please check
dpkg -l |grep nvidia-prime
and update if it's a lower version. Also, updating the initrd for the running kernel might be necessary in case you're not running the latest:
sudo update-initramfs -u -k $(uname -r)
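The version check can also be scripted instead of read by eye. The sketch below assumes that the "8.15.3" mentioned above corresponds to the package version string 0.8.15.3, going by the 0.8.16~0.20.10.1 format that dpkg prints. version_ge and installed_version are hypothetical helper names. When dpkg is not available, version_ge falls back to sort -V, which only approximates Debian version ordering.

```shell
#!/bin/sh
# version_ge A B
# Hypothetical helper: succeeds if version A is greater than or equal to B.
version_ge() {
    if command -v dpkg >/dev/null 2>&1; then
        dpkg --compare-versions "$1" ge "$2"
    else
        # Fallback: B sorts first (or ties) exactly when A >= B.
        [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
    fi
}

# installed_version PACKAGE
# Prints the installed version of PACKAGE, or nothing if it is unknown.
installed_version() {
    dpkg-query -W -f='${Version}' "$1" 2>/dev/null
}
```

For example, version_ge "$(installed_version nvidia-prime)" 0.8.15.3 succeeds when the installed package is at least the assumed fixed version.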