Ubuntu MATE 20.04 with RTX 3070 on Ryzen 5900 - black screen after boot

Please run nvidia-bug-report.sh as root and attach the resulting nvidia-bug-report.log.gz file to your post.

After I replied to your post with nvidia-bug-report.log.gz, I rebooted my system and everything works fine. Thank you so much for the help.
Now I feel confident running Linux, after longing for so long to get off Windows.

I think I have something similar but can’t resolve it. I have a Ryzen 5800 and a laptop RTX 3070. With kernel 5.11.0 or 5.10.x, if I start without the nomodeset option, the system can’t boot and freezes on a black screen (impossible to switch to another console with Ctrl+Alt+F2 …). I tried different drivers, e.g. 460.59, but without success … and followed each step in the solution. Now I can boot the system without the nomodeset option in grub, but I get “NVIDIA-SMI has failed because it couldn’t communicate”. The 460.39 drivers have been installed since the beginning … nvidia-bug-report.log.gz (76.5 KB) Do you have any idea?

Please delete /etc/X11/xorg.conf; it disables the internal screen. Also, you installed the driver using the runfile and without dkms, so it was only built for kernel 5.11. Please uninstall the runfile driver and reinstall the repo driver.

Yes, I realised my system was a bit dirty after all these attempts. I reinstalled a fresh 20.04 and followed all the steps in the post marked as the solution. This time there is no /etc/X11/xorg.conf and no 5.11. The drivers were installed from the repo. The system starts, but nvidia-smi says the driver is not loaded. I’m still not using my 3070 card. nvidia-bug-report.log.gz (81.3 KB). Kernel is 5.10.14. System is the new A15 Asus TUF 566QR with AMD 5800 and 3070.

dpkg -l |grep nvidia-prime*
ii nvidia-prime 0.8.15.3~0.20.04.1 all Tools to enable NVIDIA’s Prime
dpkg -l |grep ubuntu-drivers-common
ii ubuntu-drivers-common 1:0.8.6.5~0.20.04.1 amd64 Detect and install additional Ubuntu driver packages
glxinfo |grep OpenGL
OpenGL vendor string: X.Org
OpenGL renderer string: AMD RENOIR (DRM 3.40.0, 5.10.14-051014-generic, LLVM 11.0.0)
OpenGL core profile version string: 4.6 (Core Profile) Mesa 20.2.6
OpenGL core profile shading language version string: 4.60
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 4.6 (Compatibility Profile) Mesa 20.2.6
OpenGL shading language version string: 4.60
OpenGL context flags: (none)
OpenGL profile mask: compatibility profile
OpenGL extensions:
OpenGL ES profile version string: OpenGL ES 3.2 Mesa 20.2.6
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.20
OpenGL ES profile extensions:

nvidia-smi
NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
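
When nvidia-smi reports that it cannot communicate with the driver, a useful first check is whether the nvidia kernel module is loaded at all. A minimal sketch, assuming the standard /proc/modules layout; the function name nvidia_module_loaded is made up for illustration:

```shell
# Succeeds if a module named exactly "nvidia" is listed.
# Optional argument: the modules file to read (/proc/modules by default).
nvidia_module_loaded() {
    modules_file="${1:-/proc/modules}"
    # The trailing space avoids matching nvidia_drm, nvidia_modeset, etc.
    grep -q '^nvidia ' "$modules_file"
}
```

On the affected machine this could be called as: nvidia_module_loaded && echo "loaded" || echo "not loaded". If the module is not loaded, the problem is likely in the module build or installation rather than in X.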

You either failed to execute step 3) (fetching a file from kernel git) correctly or the driver was installed before that. In the latter case, run
sudo dkms install nvidia/460.39
and reboot. Check the driver status using
dkms status
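
The dkms status output can also be checked mechanically against the running kernel. The following is only a sketch: it assumes lines shaped like "nvidia, 460.39, <kernel>, x86_64: installed" (or "nvidia/460.39, <kernel>, ..." on newer dkms versions), and the function name is hypothetical.

```shell
# Reads `dkms status` output on stdin; succeeds if an nvidia module is
# reported as installed for the given kernel release.
nvidia_dkms_installed_for() {
    kernel="$1"
    grep -E '^nvidia[,/]' | grep -F ", ${kernel}," | grep -q ': installed'
}
```

Possible usage: dkms status | nvidia_dkms_installed_for "$(uname -r)" && echo "module built for this kernel". Passing the kernel release explicitly also makes it easy to check a kernel other than the running one.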

I can’t figure out if I missed something.
EDIT
OMG, I made the same mistake so many times when replacing the kernel version in the path: adding “-generic” when the autocompletion gives …/5.10.14/. My mistake, sorry for that. It seems to work now:

nvidia-smi
Fri Feb 26 12:15:30 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.39       Driver Version: 460.39       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 307...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   52C    P8    11W /  N/A |     70MiB /  7982MiB |      6%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1008      G   /usr/lib/xorg/Xorg                 69MiB |
+-----------------------------------------------------------------------------+

Question: could I install the CUDA toolkit with this procedure (I need 11.2), or will it kill my good driver ^^ :

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.2.1/local_installers/cuda-repo-ubuntu2004-11-2-local_11.2.1-460.32.03-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2004-11-2-local_11.2.1-460.32.03-1_amd64.deb
sudo apt-key add /var/cuda-repo-ubuntu2004-11-2-local/7fa2af80.pub
sudo apt-get update
sudo apt-get -y install cuda

Please check whether this file exists, and post its contents or attach it:
/usr/src/linux-headers-5.10.14-051014-generic/scripts/module.lds
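
Building the path from the kernel release string, instead of typing or tab-completing it, sidesteps mistakes with the version suffix. A rough sketch, assuming headers live under /usr/src/linux-headers-<release> as in the path above; both function names are hypothetical:

```shell
# Print the module.lds path for a kernel release (defaults to the running one).
# Optional second argument: the source root, /usr/src by default.
module_lds_path() {
    kernel="${1:-$(uname -r)}"
    src_root="${2:-/usr/src}"
    printf '%s/linux-headers-%s/scripts/module.lds\n' "$src_root" "$kernel"
}

# Report whether the file exists; print its contents if it does.
check_module_lds() {
    path="$(module_lds_path "$@")"
    if [ -f "$path" ]; then
        printf 'found: %s\n' "$path"
        cat "$path"
    else
        printf 'missing: %s\n' "$path"
        return 1
    fi
}
```

Running check_module_lds with no arguments checks the headers of the currently running kernel, which is the directory the dkms build uses.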

I’m trying to replicate predte4a’s success on my 2021 Asus ROG Zephyrus G15 (Ryzen 5900HS + 3070). I can get through his script and get things installed (seemingly) correctly, but they don’t work once I switch between GPUs.

I am able to install everything on a clean install of Linux Mint 20.1 based on his summarized instructions, using kernel 5.11.4, the green sardine Renoir drivers, and Nvidia’s 460.39 drivers. Nvidia graphics work for the initial session.

However, when I use either the taskbar widget or nvidia-settings to switch away from Nvidia graphics to AMD Renoir graphics, after the required restart my taskbar widget is gone and nvidia-settings refuses to run because no driver is loaded (or, more accurately, it runs but gives an error).

Once switched to integrated (AMD) graphics, things run fine, but I can’t figure out any way to switch back. My impression is that switching away from Nvidia somehow is breaking something for the Nvidia drivers. Subsequent purges+reinstalls of the 5.11.4 kernel and/or the Nvidia drivers fail to ever get the Nvidia drivers back in control – I only get working, active drivers for the initial install on a clean distro.

Please help!

nvidia-bug-report.log.gz (106.9 KB)

Try running
sudo prime-select nvidia

Please post the output of
dpkg -l |grep nvidia-prime
and
dpkg -l |grep ubuntu-drivers-common

sudo prime-select nvidia returns:

Info: the nvidia profile is already set

For the other commands:

ii  nvidia-prime                               0.8.16~0.20.04.1                      all          Tools to enable NVIDIA's Prime
ii  nvidia-prime-applet                        1.2.6                                 all          An applet for NVIDIA Prime
ii  ubuntu-drivers-common                      1:0.8.6.5~0.20.04.1                   amd64        Detect and install additional Ubuntu driver packages

Looks like you have this installed
https://gitlab.com/asus-linux/asus-nb-ctrl
which is disabling Ubuntu’s gpu-manager. Please set “manage_gfx”: false in its config as mentioned in the readme.

OK, by setting "manage_gfx": false in /etc/asusd/asusd.conf, then running sudo prime-select intel (to convince prime I was no longer using Nvidia graphics) and then sudo prime-select nvidia, I am able to get back to using my Nvidia GPU.
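
The sequence described here could be wrapped so that the PRIME profile is only toggled once asusd has actually given up graphics management. This is a sketch under assumptions: it treats /etc/asusd/asusd.conf as JSON containing a "manage_gfx" key, uses the prime-select command shown earlier in the thread, and the function names are invented for illustration.

```shell
# Succeeds if the given asusd config sets "manage_gfx" to false.
# Optional argument: config path (/etc/asusd/asusd.conf by default).
manage_gfx_disabled() {
    conf="${1:-/etc/asusd/asusd.conf}"
    grep -Eq '"manage_gfx"[[:space:]]*:[[:space:]]*false' "$conf"
}

# Only toggle PRIME profiles once asusd no longer manages graphics.
reselect_nvidia() {
    if manage_gfx_disabled "$@"; then
        sudo prime-select intel && sudo prime-select nvidia
    else
        echo 'asusd still has manage_gfx enabled; edit its config first' >&2
        return 1
    fi
}
```

A reboot (or at least a fresh login) is presumably still needed after reselect_nvidia for the new profile to take effect.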

Thank you for your help!

(Now for some reason asus-nb-ctrl has stopped working with my keyboard, but I’ll take that up back on the asus-nb-ctrl Discord.)

I’ve also (big surprise) run into this. I’m running Debian bullseye with kernel 5.10.0-4-amd64. It’s an ASUS ROG Strix 15" machine with Ryzen 7 5800H (amdgpu) and Nvidia RTX 3080. Following this guide, I’ve gotten as far as getting the AMD GPU working (system info now lists it as “Advanced Micro Devices, Inc. [AMD/ATI] Cezanne”) by downloading the green_sardine firmware and removing nomodeset from the kernel params. Unfortunately, when I install the nvidia-driver package, the black screen hits. I’m able to change virtual consoles and run commands, and nvidia-smi seems to be working (it reports the correct GPU and such).

What I did differently from everyone else was that I copied the module.lds script to linux-headers-5.10.0-4-amd64/, which is the headers folder corresponding to my running kernel.

Removing nvidia-* and libnvidia* packages and reinstalling yields the same result.
Any suggestions? Thanks in advance
Adrian
nvidia-bug-report.log.gz (298.5 KB)

Since Debian doesn’t have any GPU switching infrastructure, you need to set this up manually:
https://wiki.debian.org/NVIDIA%20Optimus#Using_NVIDIA_GPU_as_the_primary_GPU
You already set the Nvidia GPU as primary, so I guess the xrandr commands to enable the internal screen are missing. NB: you’re using the amdgpu driver instead of modesetting, so the provider names differ.
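
Because the provider names differ with amdgpu, they have to be read from xrandr --listproviders instead of being copied from the wiki. A small sketch, assuming each provider line ends in "name:<provider name>"; the helper name and the AMD_PROVIDER placeholder are illustrative only:

```shell
# Reads `xrandr --listproviders` output on stdin and prints one provider
# name per line (the text after "name:").
provider_names() {
    sed -n 's/^Provider .*name:\(.*\)$/\1/p'
}

# Possible usage in the display manager's setup script:
#   xrandr --listproviders | provider_names
#   xrandr --setprovideroutputsource "AMD_PROVIDER" NVIDIA-0   # AMD_PROVIDER: placeholder
#   xrandr --auto
```

AMD_PROVIDER stands for whichever non-NVIDIA name the first command prints; it takes the place of "modesetting" in the wiki's example.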

Hi! Thanks for your solution on this topic, I could finally install the Nvidia and AMD drivers with it (I have an Asus Zephyrus G15 with 5900HS and RTX-3070, with Ubuntu 20.04 and kernel 5.10.23). I configured it to work On-Demand (supposedly), but even when just web browsing or working in a terminal, the Nvidia card seems to be working (and consuming quite a bit of power, 20W, keeping the bottom quite warm and reducing battery time). How can I make it use only the iGPU for normal applications and keep the Nvidia card available but unused until it’s needed? (Which is what I originally thought On-Demand was supposed to do…)

This is what I see with nvidia-smi:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.56       Driver Version: 460.56       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 307...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   48C    P0    20W /  N/A |     10MiB /  7982MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1084      G   /usr/lib/xorg/Xorg                  4MiB |
|    0   N/A  N/A      1946      G   /usr/lib/xorg/Xorg                  4MiB |
+-----------------------------------------------------------------------------+

When I check, it seems both graphics cards are detected and properly configured:

$ sudo lshw -c video
  *-display
       description: VGA compatible controller
       product: NVIDIA Corporation
       vendor: NVIDIA Corporation
       physical id: 0
       bus info: pci@0000:01:00.0
       logical name: /dev/fb0
       version: a1
       width: 64 bits
       clock: 33MHz
       capabilities: pm msi pciexpress vga_controller bus_master cap_list rom fb
       configuration: depth=32 driver=nvidia latency=0 mode=2560x1440 visual=truecolor xres=2560 yres=1440
       resources: iomemory:fc0-fbf iomemory:fe0-fdf irq:118 memory:fb000000-fbffffff memory:fc00000000-fdffffffff memory:fe00000000-fe01ffffff ioport:e000(size=128) memory:fc000000-fc07ffff
  *-display
       description: VGA compatible controller
       product: Advanced Micro Devices, Inc. [AMD/ATI]
       vendor: Advanced Micro Devices, Inc. [AMD/ATI]
       physical id: 0
       bus info: pci@0000:07:00.0
       version: c4
       width: 64 bits
       clock: 33MHz
       capabilities: pm pciexpress msi msix vga_controller bus_master cap_list
       configuration: driver=amdgpu latency=0
       resources: iomemory:fe0-fdf iomemory:fe0-fdf irq:55 memory:fe10000000-fe1fffffff memory:fe20000000-fe201fffff ioport:c000(size=256) memory:fc500000-fc57ffff

$ lspci -nn | grep -E 'VGA|Display'
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:249d] (rev a1)
07:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:1638] (rev c4)

Thanks in advance!

AFAIK, full runtime power management doesn’t yet work on AMD/Nvidia combos, but power consumption shouldn’t be that high and it shouldn’t stay at P0. Please use the PowerMizer pane in nvidia-settings to check whether the GPU ever clocks down to performance level 0.

I just checked after leaving it idling the whole night: nvidia-smi says it’s at P8 (not sure what P8 is, given that in Nvidia Settings I only see P0 to P3 as possible levels) with 12W, but PowerMizer says it’s staying at P0.

I understand that if it’s still not working, I’ll have to manually set it to “intel” and reboot to minimize power consumption when I don’t need the Nvidia card, but if you have any ideas that I can try, I’d really appreciate it.

You can’t use nvidia-smi to check idle power; it puts some load on the GPU, so you will get some random value displayed. Just use nvidia-settings to confirm it stays at PL0 and clocks at minimum. To have the Nvidia GPU turn off, you’ll currently have to switch to ‘intel’ mode on AMD/Nvidia.
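
To see whether the kernel has runtime-suspended the GPU without querying the driver, the PCI device's runtime_status file in sysfs can be read; as far as I know, reading it does not wake the device. A minimal sketch, assuming the bus ID 0000:01:00.0 from the lshw output in this thread; the function name is hypothetical:

```shell
# Print the kernel's runtime power state for a PCI device
# ("active", "suspended", ...). Optional second argument: sysfs root.
gpu_runtime_status() {
    bus_id="${1:-0000:01:00.0}"
    sysfs_root="${2:-/sys}"
    cat "$sysfs_root/bus/pci/devices/$bus_id/power/runtime_status"
}
```

On the AMD/Nvidia combination discussed here, the expectation from the reply above is that this stays "active" in nvidia or on-demand mode.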

Oh, I see… Well, yes, it stays at PL0 with clocks at minimum while it’s in idle.

Well, it is what it is, I’ll keep switching manually to ‘intel’ to save power.

Thanks anyways!