Ryzen 7 + GTX 1660Ti: blank screen on external outputs in Hybrid graphics mode

Hi,

I have a Lenovo Legion 5 laptop, with a Ryzen 4800H APU and NVIDIA 1660Ti. In this laptop, the internal screen is connected to both NVIDIA and AMD GPU, but all external outputs (USB-C and HDMI) are connected to the NVIDIA GPU only.

Using kernel 5.9.1 and NVIDIA driver 455.23 on Debian 11 (testing).

When selecting Discrete graphics in the BIOS, everything works perfectly - the NVIDIA GPU drives all screens, internal and external.

When selecting Switchable graphics, the NVIDIA-G0 provider is correctly created in Xorg. After connecting the provider to the AMD GPU with xrandr --setprovideroutputsource 1 0, the external output becomes visible in xrandr. However, every mode results in a blank screen on the external output, over either HDMI or USB-C.
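For anyone trying to reproduce this, the provider setup can be sketched as a small shell function. The indices 1 and 0 are the ones from the command above and can differ per machine (check with xrandr --listproviders); the function name and the HDMI-1-0 output name are placeholders for illustration, not something xrandr defines.

```shell
# Sketch: attach the NVIDIA-driven outputs to the desktop rendered by the AMD GPU.
# $1 = sink provider (owns the external ports, assumed NVIDIA-G0 = 1)
# $2 = source provider (renders the desktop, assumed AMD = 0)
# $3 = optional output name to enable, e.g. HDMI-1-0 (placeholder)
connect_external_output() {
    sink="${1:-1}"
    src="${2:-0}"
    output="$3"

    # Show the providers first so the indices can be verified by eye
    xrandr --listproviders || return 1

    # Let the sink provider scan out what the source provider renders
    xrandr --setprovideroutputsource "$sink" "$src" || return 1

    if [ -n "$output" ]; then
        # Enable the given output at its preferred mode
        xrandr --output "$output" --auto
    else
        # Otherwise just list the outputs that are now visible
        xrandr --query
    fi
}
```

Calling connect_external_output 1 0 without an output name only lists what became visible, which is a safe first step before enabling anything.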

What I have tried:

  • creating multiple xorg configs
  • installing other driver versions, both from the Debian repo and official
  • different monitors on each output

Nothing seems to work. Note that this problem does not exist in Windows: with switchable graphics, Windows works just fine, driving the internal display with AMD and the external ones with NVIDIA.

The nvidia-bug-report.sh output is attached. Can you please give me a hand and let me know how I can solve this?

Thanks!
nvidia-bug-report.log.gz (461.4 KB)

Seems like I’m not the only one with this problem, see https://forum.manjaro.org/t/black-output-on-monitor-hybrid-amd-nvidia-laptop-with-hdmi-on-dgpu-under-nvidia-450-driver/16822

I don’t know if this helps, but when using Switchable graphics in Windows, the NVIDIA GPU can only see the external output, but not the internal.

Using amdgpu as a display offload source is currently not supported due to an incompatibility between how the amdgpu driver creates “transparent huge pages” without the compound page flag set and how the NVIDIA driver tries to map them. You can see that this is happening in the dmesg log:

[ 49.852234] Unhandled error in __nv_drm_gem_user_memory_handle_vma_fault: -22

Unfortunately, the only current workaround is to recompile the kernel without the CONFIG_TRANSPARENT_HUGEPAGE flag enabled. We’re investigating other ways to work around the problem.
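To tell whether a given system is hitting this particular failure, the kernel log can be searched for that message. A minimal sketch; the function name is hypothetical and it only looks for the exact string quoted above:

```shell
# Reads kernel log text on stdin and succeeds if the NVIDIA DRM mapping
# fault quoted above appears in it.
has_nv_vma_fault() {
    grep -q 'Unhandled error in __nv_drm_gem_user_memory_handle_vma_fault'
}

# Typical use (reading dmesg may need root, depending on kernel.dmesg_restrict):
#   dmesg | has_nv_vma_fault && echo "mapping fault present in kernel log"
```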

I compiled 5.9.1 with:

CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD=y
# CONFIG_TRANSPARENT_HUGEPAGE is not set

And after setting the provider output source and attempting to bring up the external display I get a kernel crash.
The computer becomes unresponsive, so the best I can get is a photo, which is attached here:

Oh, I think this might be a result of not having this patch: https://patchwork.kernel.org/project/linux-arm-kernel/patch/20200513133245.6408-5-m.szyprowski@samsung.com/ (commit 0552daac2d18fc92c71c94492476b8eb521227e9).

Looks like this didn’t make it into Linux 5.9.1.

I applied the patch and built the kernel. It boots correctly, and the provider gets added to the xrandr provider list. However, when I use xrandr to change the resolution, I get a hard Xorg crash. I think I will wait a bit until this patch hits mainline and report back.

I think you just need the one patch, but I’m not super familiar with it and Alex is on paternity leave so I can’t ask him at the moment.

Hello, I would like to report I have the same problem.
I am seeing the same error in the journal
__nv_drm_gem_user_memory_handle_vma_fault: -22
even after applying the patch mentioned and disabling the flag CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE

My computer has a Ryzen 7 4800H CPU and an Nvidia RTX 2060 GPU; the distro is Manjaro Linux, kernel v5.9.8.
I have requested help in Manjaro’s forum too, but I am guessing the problem is distro-independent, so I thought reporting here was relevant too.

Here is the full error message in the journal using the custom kernel.

Nov 18 16:55:28 legion5P kernel: ------------[ cut here ]------------
Nov 18 16:55:28 legion5P kernel: Unhandled error in __nv_drm_gem_user_memory_handle_vma_fault: -22
Nov 18 16:55:28 legion5P kernel: WARNING: CPU: 10 PID: 1320 at /storage/manjaro/makepkg/linux59-nvidia-455xx/src/NVIDIA-Linux-x86_64-455.45.01-no-compat32/kernel/nvidia-drm/nvidia-drm-gem-u>
Nov 18 16:55:28 legion5P kernel: Modules linked in: ccm rfcomm fuse cmac algif_hash algif_skcipher af_alg bnep btusb btrtl btbcm btintel bluetooth hid_logitech_hidpp ecdh_generic ecc crc16 >
Nov 18 16:55:28 legion5P kernel:  cec rc_core drm agpgart syscopyarea sysfillrect sysimgblt fb_sys_fops nvidia(POE) crypto_user ip_tables x_tables btrfs blake2b_generic libcrc32c crc32c_gen>
Nov 18 16:55:28 legion5P kernel: CPU: 10 PID: 1320 Comm: Xorg Tainted: P           OE     5.9.8-2-MANJARO #1
Nov 18 16:55:28 legion5P kernel: Hardware name: LENOVO 82GU/LNVNB161216, BIOS FSCN09WW 06/28/2020
Nov 18 16:55:28 legion5P kernel: RIP: 0010:__nv_drm_gem_user_memory_handle_vma_fault+0x8c/0x90 [nvidia_drm]
Nov 18 16:55:28 legion5P kernel: Code: 41 bc 00 01 00 00 44 89 e0 41 5c c3 0f 0b 89 c2 48 c7 c6 80 d6 39 c0 48 c7 c7 cb d8 39 c0 c6 05 8d 7b 00 00 01 e8 31 e7 44 f8 <0f> 0b eb cc 0f 1f 44 0>
Nov 18 16:55:28 legion5P kernel: RSP: 0018:ffffb03900cdbb78 EFLAGS: 00010286
Nov 18 16:55:28 legion5P kernel: RAX: 0000000000000000 RBX: ffffb03900cdbbc8 RCX: 0000000000000000
Nov 18 16:55:28 legion5P kernel: RDX: 0000000000000001 RSI: ffffffffb918941a RDI: 00000000ffffffff
Nov 18 16:55:28 legion5P kernel: RBP: ffff9d8871649838 R08: 000000000000050c R09: 0000000000000001
Nov 18 16:55:28 legion5P kernel: R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000002
Nov 18 16:55:28 legion5P kernel: R13: 0000000000000000 R14: ffff9d8871649838 R15: ffffb03900cdbbc8
Nov 18 16:55:28 legion5P kernel: FS:  00007f338a83c540(0000) GS:ffff9d88af680000(0000) knlGS:0000000000000000
Nov 18 16:55:28 legion5P kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 18 16:55:28 legion5P kernel: CR2: 000055c3b6ac5850 CR3: 00000007c39c8000 CR4: 0000000000350ee0
Nov 18 16:55:28 legion5P kernel: Call Trace:
Nov 18 16:55:28 legion5P kernel:  __do_fault+0x38/0xd0
Nov 18 16:55:28 legion5P kernel:  handle_mm_fault+0x1496/0x1a40
Nov 18 16:55:28 legion5P kernel:  __get_user_pages+0x25f/0x7c0
Nov 18 16:55:28 legion5P kernel:  __gup_longterm_locked+0x61/0x1e0
Nov 18 16:55:28 legion5P kernel:  os_lock_user_pages+0xa5/0x190 [nvidia]
Nov 18 16:55:28 legion5P kernel:  _nv000635rm+0x7a/0xf0 [nvidia]
Nov 18 16:55:28 legion5P kernel:  ? _nv000710rm+0x70c/0x880 [nvidia]
Nov 18 16:55:28 legion5P kernel:  ? _raw_spin_unlock_irqrestore+0x20/0x40
Nov 18 16:55:28 legion5P kernel:  ? rm_ioctl+0x54/0xb0 [nvidia]
Nov 18 16:55:28 legion5P kernel:  ? nvidia_ioctl+0x5b7/0x900 [nvidia]
Nov 18 16:55:28 legion5P kernel:  ? nvidia_frontend_unlocked_ioctl+0x37/0x50 [nvidia]
Nov 18 16:55:28 legion5P kernel:  ? __x64_sys_ioctl+0x83/0xb0
Nov 18 16:55:28 legion5P kernel:  ? do_syscall_64+0x33/0x40
Nov 18 16:55:28 legion5P kernel:  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
Nov 18 16:55:28 legion5P kernel: ---[ end trace 3bb15a554f2c6903 ]---
Nov 18 16:55:28 legion5P kernel: Cannot map memory with base addr 0x7f3325736000 and size of 0x8ca pages

Disregard the previous message, I managed to fix the issue. I will briefly describe what I did.

In the previous message, I stated that the unhandled error persisted with the patched kernel. I was wrong: when I built the kernel, I did not disable the CONFIG_TRANSPARENT_HUGEPAGE flag. Disabling that flag and applying the patch mentioned by @aplattner fixed the Unhandled error in __nv_drm_gem_user_memory_handle_vma_fault: -22.
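Since CONFIG_TRANSPARENT_HUGEPAGE and CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE are easy to mix up, it may help to check the config of the kernel that was actually built. A minimal sketch, assuming the config file is installed as /boot/config-$(uname -r) as on Debian (other distros may only expose it as /proc/config.gz, which would need zcat first); the function name is made up for illustration:

```shell
# Prints whether CONFIG_TRANSPARENT_HUGEPAGE is enabled in a kernel config file.
# $1 = path to the config (assumed default: /boot/config-<running kernel>)
thp_config_state() {
    cfg="${1:-/boot/config-$(uname -r)}"
    if [ ! -r "$cfg" ]; then
        echo "unknown"
        return 1
    fi
    # Anchored match: ignores CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE and
    # sub-options such as CONFIG_TRANSPARENT_HUGEPAGE_MADVISE
    if grep -q '^CONFIG_TRANSPARENT_HUGEPAGE=y' "$cfg"; then
        echo "enabled"
    else
        echo "disabled"
    fi
}
```

On a running kernel, the directory /sys/kernel/mm/transparent_hugepage only exists when the option was compiled in, which gives a second way to confirm the result.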

However, doing so uncovered other errors.
Firstly, the error AMD-Vi: Unable to read/write to IOMMU perf counter, which needs the kernel parameter iommu=soft set via GRUB.
Then, Failed to get backlight or LED device 'backlight:acpi_video0': No such device, which needs the kernel parameter acpi_backlight=vendor, also set via GRUB.
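Both of these end up on the kernel command line. A sketch of how that might look in /etc/default/grub, assuming a stock GRUB setup; "quiet" is only a stand-in for whatever options are already present on the line:

```shell
# /etc/default/grub (fragment)
# Append the two parameters to the existing default kernel command line.
GRUB_CMDLINE_LINUX_DEFAULT="quiet iommu=soft acpi_backlight=vendor"

# Afterwards, regenerate the GRUB config and reboot:
#   sudo update-grub        # or: sudo grub-mkconfig -o /boot/grub/grub.cfg
# Then confirm what the kernel actually received:
#   cat /proc/cmdline
```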

Fixing those revealed an nvidia-gpu i2c timeout error which needed another patch applied to the kernel.
After fixing this last one, my computer booted successfully and HDMI out was working correctly in hybrid mode!

Alright, I hope that is helpful. I tried to stick to the most relevant info here for brevity. I am documenting the full process and will share a link to it once I am done with the write up.

Great news! Looking forward to that write-up @carlosmorales777

I gave up after I ran a few tests on Windows and realised I could get at most 1h to 1h30 of extra battery life in Hybrid graphics mode on my Legion 5 with a 60 Wh battery. This certainly wasn’t worth the extra complexity of having to handle two drivers in Linux, so I changed to Discrete Graphics in the BIOS.

With Discrete graphics I get around 4 hours of battery doing desktop stuff on both Windows and Linux.

How does your Legion 5P behave under Linux? Do you notice lower power consumption in Hybrid graphics mode? Do you get more out of your battery?

Also I appreciate you found a fix for this in Hybrid graphics mode, awesome.

In Discrete graphics mode, create the file /etc/X11/xorg.conf.d/21-nvidia-brightness.conf with the following:
Section "OutputClass"
    Identifier "nvidia"
    MatchDriver "nvidia-drm"
    Driver "nvidia"
    Option "RegistryDwords" "EnableBrightnessControl=1;"
EndSection

This will make backlight control work with just the nvidia card.