External monitor freezes when using dedicated GPU

It’s not a purely hardware issue. I suspect it is caused by a faulty hardware/software interaction that leads to some kind of locking in internal kernel logic, e.g. in DMA-BUF buffer management and/or synchronization. I’m new to Linux GPU driver development, but I’ve found a very comprehensive guide on this topic: Linux GPU Driver Developer’s Guide (The Linux Kernel documentation).
I hope to find a way to trace lockups (freezes) in the PRIME buffer sharing and synchronization described there.

My understanding is that these tools are intended for data center GPUs, not for those used in laptops/desktops. Maybe that is why the PCIe and memory diagnostics are skipped.

I ran into this problem as well when I updated to Ubuntu 22.04 and installed nvidia driver 545. If I plugged in any external monitor, everything would freeze after 5-10 seconds and I would need to do a hard reboot. I tried to switch to Performance mode as suggested in this thread, but that did not help.

Ultimately, I just reverted back to Ubuntu 20.04, CUDA 11.7, and nvidia-driver 515. Everything is working again as it should. I plan to continue to monitor progress here and not update until I see that this is resolved.

Yes, as mentioned here, with 515.105.01 I also can’t reproduce the freeze on my Debian system.
Also, when resizing glxgears on newer drivers, the picture renders more slowly and needs to catch up with my resizing, while with 515 it is visually rendered “on time”.

…anyway, downgrading is an awful solution, and I wish Santa would bring poor Nvidia Corporation one of those cheap old laptops that we’ve mentioned in this topic, so they can reproduce the issue and fix what they broke.

Unfortunately it doesn’t support the RTX 4070 Mobile or any 4070 :(

Actually, we should start a fundraising campaign. This poor, small startup called NVIDIA needs our help. Maybe in a few years, it will grow larger and might even become strong in AI so it could use more cutting-edge tools to tackle problems that are incredibly hard to solve now. That day we can be sure we will pay for its value, and that any small user issue is handled appropriately… oh what a bright future!

@simpson38 Yeah, indeed, 515 has no 4xxx series support.
@Spange reported that he downgraded to 510, but according to his bug report he is on a 4070, like you.
His bug report looks strange to me because he somehow ended up with a mix of 535.113.01 and 525.125.06 after downgrading… and you started with 535.104.05, according to your bug report… so that’s some dark magic… did you try those two versions, or try to reproduce his steps exactly?

Yes, I’m on 4070.
I’m happy-ish now using Performance mode in nvidia-settings.
What makes me not so happy is that my brand new laptop has a battery life of ~45 minutes when I’m in a Teams meeting running absolutely nothing else but my web browser.

Hey,

I had the same problem recently with an RTX 4070. It seems that disabling Chrome’s hardware acceleration fixes the problem, but it causes a big drop in performance.

It might also be worth disabling integrated graphics. This seems to have a positive effect on my side. After that, I was able to reactivate hardware acceleration and no freeze was to be seen.

Graphics should show only “NVIDIA Corporation”.

I’m on 535.129.03-0ubuntu0.22.04.1 amd64

Hope this helps,

Best regards,
David

I have this problem with my 3070 laptop, but only in Xorg running i3; GNOME on Wayland works fine.
But I want to run i3.

If I use any application using PRIME, there is a high chance that the monitor connected to the 3070 will freeze/get stuck. It doesn’t crash i3, though, and the laptop monitor doesn’t freeze. Please fix.

Hi Sam,
I’m on i3.
I’m on an MSI Katana 15 B13V (i9-13900H, RTX 4070) and I’ve replaced the original 16GB RAM with 2x32 (DDR5 SDRAM, SO DIMM 262 5600MHz).
I installed Ubuntu 22.04 on it and then switched to i3.
Super happy, except for the battery life, and no glitches.
Thank you @david.monnom, I’ll try that out ASAP.

I think my freezing is due to my GPU already being at 45 percent utilization when idle with two displays connected to it.

Something is wrong with the copy from primary to secondary; the copying algorithm is slow.

Edit: it doesn’t occur on GNOME Wayland, perhaps because this GPU copying is handled by the compositor or something else, who knows.

Sad that even as more users of non rolling release distros start to get affected, they arrive here, with some appearing to be shit hot experts, but they haven’t been able to help crack this issue yet.

They have not been helped to help crack the issue. THIS is why you open source drivers. What has been open sourced is alpha, worse than slow Nouveau, and my device is one generation before what that code can support anyway.

I’ve been running Nvidia only the last couple of months, but I want to run hybrid so I can have more than just 2 screens. That’s the reason I got this laptop. I went back to hybrid just a few hours ago, I’m on 545.29.06, everything is running max power, but my couch TV monitor just froze because I fullscreened then unfullscreened mpv too quickly.

This is absurd.

I cannot wait to buy the fastest “AMD Advantage” laptop in the next couple of months when I can get the funds through. I WOULD have bought the fastest AMD+Nvidia if it were not for this. It would have been really nice to try some CUDA orientated projects, but having more than two screens is a necessity, and the AMD device will at least be good enough for inference given the newer Windows AI hardware requirements.

@amrits, is there any priority for the ticket this year? Is it assigned to anyone, or is it just hanging in limbo?
Thank you.

I’m experiencing this same issue on a Dell G15 with an RTX 4050 GPU on Ubuntu 23.10. I have a dual boot installation with Windows 11 and the issue occurs only on Ubuntu.
It is very disappointing to see that NVIDIA is taking so long to fix it, as all of us here invested good money in hardware that was supposed to boost our productivity, not the opposite.

Thanks for following up - apologies for my slow reply.

I have an mbeat Elite 7in1 https://www.mbeat.com.au/elite-7-in-1-multifunction-usb-c-3-2-hub.html. I still get the freezes with 535.129.03 (and 535.146.02); resizing glxgears when offloading does the trick.
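
For anyone trying to repeat this trigger, here is a minimal sketch of launching glxgears on the dedicated GPU. It assumes that "offloading" here means NVIDIA's PRIME render offload, which is selected through the `__NV_PRIME_RENDER_OFFLOAD` and `__GLX_VENDOR_LIBRARY_NAME` environment variables. The helper names `offload_env` and `launch_offloaded` are made up for illustration.

```python
import os
import subprocess


def offload_env(base=None):
    """Return a copy of the environment with PRIME render offload enabled.

    Assumption: the NVIDIA proprietary driver is installed and the X server
    was started with render offload support.
    """
    env = dict(os.environ if base is None else base)
    env["__NV_PRIME_RENDER_OFFLOAD"] = "1"       # render on the NVIDIA GPU
    env["__GLX_VENDOR_LIBRARY_NAME"] = "nvidia"  # use NVIDIA's GLX library
    return env


def launch_offloaded(cmd=("glxgears",)):
    """Start `cmd` with the offload environment and return the process."""
    return subprocess.Popen(list(cmd), env=offload_env())
```

Calling `launch_offloaded()` and then resizing the glxgears window on the external monitor would correspond to the steps described above; whether it freezes will depend on the driver version and setup.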

The output of xrandr --listproviders

Providers: number : 2
Provider 0: id: 0x54 cap: 0x9, Source Output, Sink Offload crtcs: 4 outputs: 2 associated providers: 1 name:Unknown AMD Radeon GPU @ pci:0000:05:00.0
Provider 1: id: 0x1da cap: 0x2, Sink Output crtcs: 4 outputs: 2 associated providers: 1 name:NVIDIA-G0
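
When comparing setups across posts, it can help to turn this output into structured data. Below is a minimal sketch that parses the `xrandr --listproviders` text shown above. The regular expression is an assumption based only on the two lines in this post and may not cover every xrandr version; `parse_providers` is a hypothetical helper name.

```python
import re

# Sample copied from the post above. On a live system the text could be
# captured with subprocess.run(["xrandr", "--listproviders"], ...).
SAMPLE_PROVIDERS = """Providers: number : 2
Provider 0: id: 0x54 cap: 0x9, Source Output, Sink Offload crtcs: 4 outputs: 2 associated providers: 1 name:Unknown AMD Radeon GPU @ pci:0000:05:00.0
Provider 1: id: 0x1da cap: 0x2, Sink Output crtcs: 4 outputs: 2 associated providers: 1 name:NVIDIA-G0
"""

PROVIDER_RE = re.compile(
    r"^Provider (?P<index>\d+): id: (?P<id>0x[0-9a-fA-F]+) "
    r"cap: (?P<cap>0x[0-9a-fA-F]+)(?:, (?P<caps>.*?))? "
    r"crtcs: (?P<crtcs>\d+) outputs: (?P<outputs>\d+) "
    r"associated providers: (?P<assoc>\d+) name:(?P<name>.*)$"
)


def parse_providers(text):
    """Return one dict per 'Provider N:' line; other lines are ignored."""
    providers = []
    for line in text.splitlines():
        match = PROVIDER_RE.match(line.strip())
        if not match:
            continue
        caps = match.group("caps")
        providers.append({
            "index": int(match.group("index")),
            "id": int(match.group("id"), 16),
            "cap_mask": int(match.group("cap"), 16),
            "capabilities": caps.split(", ") if caps else [],
            "crtcs": int(match.group("crtcs")),
            "outputs": int(match.group("outputs")),
            "name": match.group("name").strip(),
        })
    return providers
```

On the sample, the AMD provider reports "Source Output" and "Sink Offload" while NVIDIA-G0 reports only "Sink Output", which suggests (without proving it) that the NVIDIA GPU is driving outputs fed by the AMD GPU.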

And xrandr --listactivemonitors

Monitors: 2
 0: +*DP-1-0 1920/527x1080/296+1920+0  DP-1-0
 1: +eDP 1920/344x1080/194+0+0  eDP

0 is the external monitor (set to primary - also happens when non-primary), and 1 is the laptop monitor.
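
The monitor list above can be read the same way. This is a rough sketch of a parser for `xrandr --listactivemonitors`, assuming the `index: flags name width/mm x height/mm + x + y outputs` layout seen in this post, with `*` marking the primary monitor. `parse_active_monitors` is a hypothetical helper name.

```python
import re

# Sample copied from the post above.
SAMPLE_MONITORS = """Monitors: 2
 0: +*DP-1-0 1920/527x1080/296+1920+0  DP-1-0
 1: +eDP 1920/344x1080/194+0+0  eDP
"""

MONITOR_RE = re.compile(
    r"^\s*(?P<index>\d+):\s+(?P<auto>\+?)(?P<primary>\*?)(?P<name>\S+)\s+"
    r"(?P<w>\d+)/(?P<w_mm>\d+)x(?P<h>\d+)/(?P<h_mm>\d+)"
    r"\+(?P<x>-?\d+)\+(?P<y>-?\d+)\s+(?P<outputs>.*)$"
)


def parse_active_monitors(text):
    """Return one dict per monitor line; the 'Monitors: N' header is skipped."""
    monitors = []
    for line in text.splitlines():
        match = MONITOR_RE.match(line)
        if not match:
            continue
        monitors.append({
            "index": int(match.group("index")),
            "name": match.group("name"),
            "primary": match.group("primary") == "*",
            "width_px": int(match.group("w")),
            "height_px": int(match.group("h")),
            "width_mm": int(match.group("w_mm")),
            "height_mm": int(match.group("h_mm")),
            "x": int(match.group("x")),
            "y": int(match.group("y")),
            "outputs": match.group("outputs").split(),
        })
    return monitors
```

For the sample, this gives DP-1-0 as the primary monitor placed at x offset 1920, to the right of the eDP panel at offset 0.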

I’ve also attached the output from xrandr --verbose > xrandr_nvidia_RTX_2600_mbeat_hub_hdmi.txt.

xrandr_nvidia_RTX_2600_mbeat_hub_hdmi.txt (11.6 KB)

I’m rather sympathetic to NVIDIA as I’m sure they have a heap of competing priorities … but I’m almost certainly going to switch to AMD for my next laptop as they’re simply doing a better job (from what I hear).

Hi @stephematician
Could you please share an nvidia bug report so that I can try to match the notebook as closely as possible for a local repro?
Also, please let me know if you are applying any specific display settings in your setup to trigger the issue.

Hi @hjcosta.dev
Please share an nvidia bug report and confirm how you are connecting external displays to the notebook, along with reliable repro steps.

I can finally reproduce the issue on the setup below, where I initially observed the display freezing for a second or two and eventually the external display freezing completely.
Acer Nitro AN515-58 + Debian GNU/Linux 12 + kernel 6.1.0-17-amd64 + NVIDIA GeForce RTX 3060 Laptop GPU + Driver 545.29.02 + Display GBT AORUS F127Q-P with HDMI connection + Resolution 2560 x 1440 + Refresh rate as 60 Hz
This will now help us debug the issue much more easily. Apologies for the delay in local reproduction.

Yes! Sweet! Happy hunting :)