Also, complete hats off to this Uli1234 guy. You are a hero for doing these tests and identifying this as a potential fix. A little bizarre that, for such a ubiquitous problem, NVIDIA chimes in so infrequently. Certainly has soured my experience.
Have this same issue with Ubuntu 20 (GNOME), Ryzen 3700X, ASUS X470, Cooler Master H100i and two monitors at 2560x1080.
I tried with xfce and it only got worse, way slower/laggy.
Mostly happens when sharing screen with google hangouts or zoom.
I tried newer kernel versions, older drivers, forcing frequencies, disabling Cool'n'Quiet, and disabling hardware acceleration in Firefox and Chrome; none of it worked.
I am forcing powermizer to run at max performance and that seems to be working for me.
What frequency range did you set when locking the GPU?
It seems you just don't want the card to go to the P8 state, because the Xid 61 pops up (randomly) when exiting from that state. Maybe P8 was still accessible with your minimum locked frequency. Just a thought.
The minimum frequency you have to set could differ depending on the model of the card and the chip on it (2060, 2070, etc.).
In my case the datasheet said 300 MHz as the lowest freq, so I just set 1000 as my minimum. If your min freq. is 900 MHz, then you might have to set 1200...
Good to know. I must've missed that in previous posts. I'm not sure what the minimum clock is for my card so I'll have to do some digging, or I'll just slowly increment the low setting until I bump it out of P8.
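If you'd rather look the minimum up than guess, the driver can list the graphics clocks it supports. A minimal sketch, assuming your nvidia-smi and GPU report supported clocks at all (not every card does); `lowest_clock` is a hypothetical helper name, and on a multi-GPU box the values of all cards would be mixed together:

```shell
#!/bin/sh
# lowest_clock: hypothetical helper; reads one clock value (MHz) per line
# on stdin and prints the smallest one.
lowest_clock() {
  sort -n | head -n 1
}

# Only query the driver if nvidia-smi is actually installed.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-supported-clocks=gr --format=csv,noheader,nounits | lowest_clock
fi
```

Whatever number comes back is only the starting point; going by the posts above, the locked minimum has to sit comfortably above it.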
I'm actually running pretty strong right now with SMT and XMP disabled (>24 hours at this point, which is a first in a while), so I want to try and be methodical about it and run until I get the Xid-61 error. After that I'll redo the frequency locking.
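For the "run until it shows up" approach, something like the following could check the kernel log for you. It's a rough sketch: the `NVRM: Xid (...): 61,` pattern is my assumption about how the driver logs the error, and `xid61_lines` is a made-up helper name, so adjust the pattern to whatever your own log actually shows.

```shell
#!/bin/sh
# xid61_lines: hypothetical helper; filters kernel log text on stdin down to
# NVRM lines that report Xid 61. The pattern is an assumption about the
# log format.
xid61_lines() {
  grep -E 'NVRM: Xid \(.*\): 61,' || true
}

# Check the current boot's kernel log, if journalctl is available.
if command -v journalctl >/dev/null 2>&1; then
  journalctl -k -b --no-pager 2>/dev/null | xid61_lines
fi
```

No output means no match in the current boot. On systems without journald, piping `dmesg` into the same helper should work too.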
We have a reproduction of the problem internally, thanks to Uli1234 who provided affected hardware.
We have a root cause. The problem happens when the PCIe Gen switches from 3 to 1, and it is an NVIDIA bug. I'll update this thread when we have a fix.
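To see whether a card is currently sitting in P8 or has dropped its link to Gen 1, the performance state and link generation can be queried from nvidia-smi. A small sketch, assuming a single GPU (only the first line of output is read); `in_p8_or_gen1` is a hypothetical helper name:

```shell
#!/bin/sh
# in_p8_or_gen1: hypothetical helper; takes one "pstate, link gen" CSV line
# (e.g. "P8, 1") and succeeds if it shows P8 or PCIe Gen 1.
in_p8_or_gen1() {
  case "$1" in
    P8,*|*,\ 1|*,1) return 0 ;;
    *) return 1 ;;
  esac
}

# Only query the driver if nvidia-smi is actually installed.
if command -v nvidia-smi >/dev/null 2>&1; then
  state=$(nvidia-smi --query-gpu=pstate,pcie.link.gen.current --format=csv,noheader | head -n 1)
  if in_p8_or_gen1 "$state"; then
    echo "low-power state: $state"
  else
    echo "ok: $state"
  fi
fi
```

Running this on an idle desktop before and after locking the clocks should show whether the lock really keeps the card out of the low-power state.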
Just wanted to add that since locking the frequencies I have gone 7 days without seeing this issue. This is longer than the system has ever run before. Thank you @Uli1234 for helping me avoid having an angry customer!
In my case, Xid 61 seems to have been solved (no issue for 11 days) by disabling SMT in the BIOS. I hope this is not a different bug. @ahuillet, could that be related to the Gen switch bug you are tracking?
Are other segfaults and errors caused by this as well? I've been getting random crashing for a while now and it always seems to be when the GPU ramps down, either because I've exited a game or because the game itself isn't intensive enough.
Contacted Nvidia about it and was told it was a PSU issue.
Dudeee I need this fix soo badly, I'm getting this error like 2 times a day on my working computer! It is so freaking bad, I have to restart my computer with all my working environment running, aaaaaaahhh.
You can put those commands in a start-up script as well to avoid typing them in every time you start up the computer.
In the second command, the lower value should keep your card from going to the P8 state (= PCIe Gen1); the higher value should be your graphics card's boost frequency. Please try it out and report back.
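A start-up script along those lines might look like the sketch below. I'm assuming the two commands are the persistence-mode and clock-lock calls quoted in the workaround post in this thread. The 1000 and 1800 MHz defaults are placeholders, not recommendations, and `run` / `lock_gpu_clocks` are made-up helper names. By default the script only prints what it would do; set `DRY_RUN=0` and run it as root to actually apply the settings.

```shell
#!/bin/sh
# Placeholder values: pick a minimum above your card's P8 clock and a
# maximum matching its boost frequency.
MIN_MHZ=${MIN_MHZ:-1000}
MAX_MHZ=${MAX_MHZ:-1800}

# run: prints the command when DRY_RUN=1 (the default here), otherwise runs it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# lock_gpu_clocks MIN MAX: hypothetical helper wrapping the two commands.
lock_gpu_clocks() {
  run nvidia-smi -pm 1            # enable persistence mode
  run nvidia-smi -lgc "$1,$2"     # lock graphics clocks to MIN,MAX MHz
}

lock_gpu_clocks "$MIN_MHZ" "$MAX_MHZ"
```

To undo the lock, `nvidia-smi -rgc` should reset the graphics clocks to their defaults.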
Workaround:
Set in BIOS:
Suspend to RAM -> DISABLED
Global C states Control -> DISABLED
ACPI_CST C1 Declaration -> DISABLED
PCIE Reset Control -> DISABLED
Then run nvidia-smi -pm 1 and nvidia-smi -lgc 1600,1605 (for a 2070S).
Referring to my post from January 2020, do you guys think that I could get my 1660 Ti to work with such an old computer?
These issues seem to relate to PCIe Gen 1 and I think my motherboard is PCI Gen 1. I can't get my computer to boot with the card unless I have the open-source Nouveau drivers installed on my Arch Linux-based machine. Can these clock and power settings be applied to my GRUB config as kernel parameters somehow?