While playing game (Lutris/wine) computer crashes, displays are disconnected

jarek_dudzinski · April 22, 2024, 8:59pm

nVidia RTX 3060Ti, driver 550.67 (latest from the distro).

On Pop!_OS (latest updates installed), while playing Windows games through Lutris (using Wine), the displays get disconnected (“no signal” message) and the computer crashes (I know this because of the sound: the last sound played loops non-stop). It started occurring about 2 days ago; before that everything worked fine.

I have found such errors in logs from the time it crashed:

nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c67e:4:0:0x0000000f
NVRM: Xid (PCI:0000:01:00): 79, pid=‘’, name=, GPU has fallen off the bus.
NVRM: GPU at PCI:0000:01:00: GPU-18b3f147-bc78-3f10-613e-057910c70878
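
To pull the Xid codes out of the kernel log after a crash, something like the sketch below may help. The helper name extract_xids is made up for illustration, and it only assumes the log lines look like the NVRM line quoted above. Reading the previous boot with journalctl also assumes the journal is kept across reboots.

```shell
# Print the Xid code from every "NVRM: Xid (...): <code>," line read on stdin.
# extract_xids is an illustrative helper name, not part of any NVIDIA tool.
extract_xids() {
    sed -n 's/.*NVRM: Xid ([^)]*): \([0-9][0-9]*\),.*/\1/p'
}

# Example: count Xid codes in the kernel messages of the previous boot.
#   journalctl -k -b -1 | extract_xids | sort | uniq -c
```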

Otherwise it works normally in 3D apps (Blender, Plasticity, …).

nvidia-bug-report.log.gz (523.0 KB)

EDIT: I turned on the iGPU and connected a second monitor to it to see what happens when the crash occurs. The GPU was around 50-55 deg C while playing the game. The computer didn’t crash, just the game: the display connected to the NVIDIA card got disconnected and the GPU had “fallen off the bus”. System temperatures were also around 45 deg C.
Under Windows everything works perfectly, even with far more demanding games. 3D apps like Blender and GPU rendering work normally under both Linux and Windows.

jarek_dudzinski · April 23, 2024, 1:23pm

UPDATE: I found that if I switch the PowerMizer setting from “Auto” to “Prefer Maximum Performance”, the crashes no longer occur, at least for now. I found the tip somewhere on the internet; it referred to a different problem, but I tried it anyway.

generix · April 23, 2024, 1:57pm

You’re getting loads of pcie errors from your nvme device, breaking the bus so the nvidia gpu flies off. Please try disabling aspm by setting kernel parameter pcie_aspm=off
If that doesn’t help, check your nvme connection, check for a bios update.
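
On Pop!_OS the boot entries are normally managed by kernelstub rather than GRUB, so the parameter can be added roughly as below. This is only a sketch: it assumes a stock Pop!_OS install with kernelstub present, and the helper names add_kernel_param and has_kernel_param are illustrative. On GRUB-based distros the parameter goes into the GRUB configuration instead.

```shell
# Add a kernel parameter to the boot entry (Pop!_OS with kernelstub assumed).
add_kernel_param() {
    sudo kernelstub -a "$1"
}

# Succeed if the parameter is on the kernel command line.
# $1 = parameter, $2 = file to read (defaults to the running kernel's /proc/cmdline).
has_kernel_param() {
    tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}

# Example:
#   add_kernel_param pcie_aspm=off    # then reboot
#   has_kernel_param pcie_aspm=off && echo "parameter is active"
```

Checking /proc/cmdline after the reboot is worthwhile, because a misspelled parameter name is silently ignored by the kernel.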

jarek_dudzinski · April 23, 2024, 7:10pm

It has nothing to do with that NVMe drive error: the drive has behaved like that forever, and the problem started occurring only three days ago. The BIOS is the latest from Asus (motherboard: Asus ROG Strix Z790-I). I checked the disk connection.

UPDATE: PowerMizer trick worked yesterday for some time. Today back to crashes.

jarek_dudzinski · April 23, 2024, 10:08pm

UPDATE: I have removed all NVIDIA drivers and installed the -server version. Results:

- The main problem still exists.
- I’m not getting NVMe errors anymore. I assume those were caused by the NVIDIA driver (it’s basically the only change I made).
- pcie_aspm=off didn’t fix the problem (nor the NVMe errors).

generix · April 24, 2024, 7:20am

Then you’re down to Xid 79 standard procedures. Monitor temperatures, reseat power connectors/the card in its slot, check/replace PSU. If that doesn’t yield anything, the gpu is on its way out.
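
For the monitoring step, nvidia-smi can log temperature, power draw and the graphics clock once per second, so the last lines written before a crash show what the card was doing. A sketch, where log_gpu_stats is an illustrative name and the output path is up to you:

```shell
# Append timestamp, GPU temperature, power draw and graphics clock to a CSV file,
# one sample per second. $1 = output file. Stop with Ctrl+C.
log_gpu_stats() {
    nvidia-smi --query-gpu=timestamp,temperature.gpu,power.draw,clocks.gr \
               --format=csv -l 1 >> "$1"
}

# Example:
#   log_gpu_stats "$HOME/gpu-stats.csv"
```

If the whole machine hangs rather than just the game, the final second or two may never reach the disk, so treat the last logged values as approximate.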

jarek_dudzinski · April 24, 2024, 10:50am

I would agree with that if there were any issues under Windows. There are none: no crashes, no problems, and the same games run flawlessly.
And I’m using this system mainly for work - big 3D projects and GPU rendering, still without any issues.

generix · April 24, 2024, 10:54am

Gaming is neither graphics nor cuda, Windows is not Linux, but an Xid 79 is still an Xid 79.

jarek_dudzinski · April 24, 2024, 11:04am

What are you talking about? :D If the card works under Windows without any problems, and NOT under Linux, it’s a software issue. Also, under Linux I was playing rather undemanding titles, like WoT or Fallout 4, with GPU temps around 50-60 deg C. Under Windows it runs Cyberpunk without a glitch or crash :)

Xid 79, at least according to NVIDIA’s official docs, can be a hardware issue, right, but it might also be a problem with the driver (source: XID Errors :: GPU Deployment and Management Documentation), and everything points to the second option.

generix · April 24, 2024, 11:08am

No. Forget about that, it’s an illusion you’re talking yourself into.
It can be a driver issue, but only on notebooks in very, very, very rare cases, mostly model specific. You have a desktop. Xid 79 is hardware, always. You’re not the one in a billion case.

jarek_dudzinski · April 24, 2024, 11:17am

So why does everything work correctly under Windows, even in more taxing workloads? There is some problem in Linux: either the driver, or something with support for other components.
Could it be, for example, a problem with motherboard support under Linux? The board is a specific one, as it’s ITX. One of the NVMe slots (the one that was throwing errors) can be connected either directly to the CPU’s PCIe lanes or to the bridge. In the first case it supports Gen5 (but the GPU is limited to x8; the BIOS calls this bifurcation); in the second it runs at Gen4 and the GPU has the full x16 lanes to itself. I’m using the second option.

generix · April 24, 2024, 11:26am

Might be. Judging by the amount of ACPI errors in the log, the system BIOS isn’t of good quality. Checking for an update is always worth a shot.
If you can’t fix the PCIe errors from the NVMe, you should at least silence them by setting pci=noaer as a kernel parameter. The messages themselves also block the bus. To check for PSU issues, you can try limiting clocks to avoid boost spikes (the Linux driver clocks more aggressively than the Windows driver), e.g. nvidia-smi -lgc 300,1400

jarek_dudzinski · April 24, 2024, 11:29am

As I mentioned before, those NVMe errors stopped when I purged the NVIDIA driver from the system and installed the -server version. So there is something going on with the driver. The BIOS is the latest one, 2202 from 9 days ago.

nvidia-smi … - how do I check the current values it’s using? I can’t find that option in --help.
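
One way to read the clocks, as a sketch: nvidia-smi has a query mode that prints the current and maximum graphics clock, and sampling it during a game shows what the card actually reaches. The helper names gpu_clocks and peak_graphics_clock are illustrative, and the loop assumes the first GPU reports a plain number in MHz.

```shell
# Print "current, max" graphics clocks in MHz, one line per GPU.
# (nvidia-smi -q -d CLOCK prints a longer, human-readable report.)
gpu_clocks() {
    nvidia-smi --query-gpu=clocks.gr,clocks.max.gr --format=csv,noheader,nounits
}

# Print the highest current graphics clock of the first GPU seen while sampling.
# $1 = number of samples, $2 = seconds between samples (defaults: 10 and 1).
peak_graphics_clock() {
    samples="${1:-10}"; interval="${2:-1}"; peak=0; i=0
    while [ "$i" -lt "$samples" ]; do
        now=$(gpu_clocks | head -n 1 | cut -d',' -f1)
        if [ "$now" -gt "$peak" ]; then peak=$now; fi
        i=$((i + 1))
        sleep "$interval"
    done
    echo "$peak"
}

# Example: sample once per second for a minute while the game is running.
#   peak_graphics_clock 60 1
```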

EDIT: PSU is quality 850W SFX - overkill for 3060Ti and i7-13 series, should be good even for 3090.

jarek_dudzinski · April 24, 2024, 1:31pm

Wait a second… Your suggestion about nvidia-smi and clocking led me to check something.

I’ve checked what clock speeds are used in NVidia Settings: PowerMizer is showing 270-2160 MHz at level 4 (max). Then I checked specs of the card manufacturer (Gigabyte RTX 3060Ti Eagle): it states that max boost is 1695 MHz…

Is it just me, or is the driver applying WAY too high clocks for my GPU? An overclock from ~1700 to 2160 MHz is huge…

UPDATE: after applying limits (sudo nvidia-smi -lgc 270,1695) no more crashes (for now), and … FPS jumped from 160-190 to ~320-350. There is definitely something wrong with the clock speeds the driver is applying by default…

generix · April 24, 2024, 2:56pm

Starting with Turing, the “max clocks” displayed by nvidia-smi and nvidia-settings are merely theoretical limits, never used or reached.

jarek_dudzinski · April 24, 2024, 3:01pm

Ummm… I don’t fully understand what you mean - the PowerMizer displays actual clock speeds (Graphics clock: …) And according to those my GPU reached this maximum of 2160 MHz when it was crashing - and by the manufacturer specs, it NEVER should.

Now, after applying nvidia-smi -lgc 270,1695 it’s reaching max of 1695 MHz… and no crashes.

How do I make this setting permanent, so it survives restarts?

generix · April 24, 2024, 3:09pm

You’ll have to create a systemd unit for it.
If it’s crashing when boosting to high clocks, the psu is breaking down on power spikes.
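
A minimal sketch of such a unit, written out by a small shell helper. The unit name nvidia-lock-clocks.service, the helper name write_clock_lock_unit and the /usr/bin/nvidia-smi path are assumptions to adjust for your system; the clock values are the ones from the post above.

```shell
# Write a oneshot unit that applies the clock lock at boot and resets it on stop.
# $1 = file to write, $2 = "min,max" graphics clocks in MHz.
write_clock_lock_unit() {
    cat > "$1" <<EOF
[Unit]
Description=Lock NVIDIA GPU graphics clocks

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/bin/nvidia-smi -lgc $2
ExecStop=/usr/bin/nvidia-smi -rgc

[Install]
WantedBy=multi-user.target
EOF
}

# Example (unit name is hypothetical):
#   write_clock_lock_unit /tmp/nvidia-lock-clocks.service 270,1695
#   sudo install -m 644 /tmp/nvidia-lock-clocks.service /etc/systemd/system/
#   sudo systemctl daemon-reload
#   sudo systemctl enable --now nvidia-lock-clocks.service
```

To undo it, disable the unit and run sudo nvidia-smi -rgc, which resets the graphics clocks to the driver defaults.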

jarek_dudzinski · April 24, 2024, 3:17pm

Sorry, but this is bullshit - I was using this PSU before while overclocking the CPU (reached 5.1 on all cores without any issues), so it can withstand much bigger spikes. And at 850W it’s already overkill for an i7 and a 3060Ti.

The real problem lies in the clock speeds: why is the driver applying such high clocks to the GPU? I’ve checked it under Windows, and the GPU clock never exceeded 1750 MHz.
This could end up killing the GPU. We are talking about a HUGE overclock from ~1700 to ~2150 MHz…

generix · April 24, 2024, 3:18pm

Do what you want, I’m out.

hitbuyi · July 23, 2024, 8:07am

I have a similar problem. My environment:
0. Ubuntu version: 20.04
1. NVIDIA driver version: 535.183.01
2. Wine version: 8.0
3. Wine application: WeChat 3.7

Ubuntu crashes occasionally while WeChat is running, when NVIDIA is set to “Performance” mode.

I set NVIDIA to “On-Demand” mode to fix this problem, at the cost that the NVIDIA GPU is not used for other applications, just for computing.
