[BUG] NVIDIA v495.29.05 driver spamming dbus-enabled applications with invalid messages

Sounds odd. Looking at dbus-monitor --system as root with 510.39.01, there’s definitely no more spam for me.

Have you tried removing any remaining workarounds (service, old dbus .conf, etc.) and booting cleanly? Perhaps they are confusing the driver into thinking powerd is running, so it starts using it.

That was a good call. It seems that the mere presence of the workaround files was enough to trigger the spam. Once I completely removed the files (uninstalled the AUR package rather than just disabling the service) and rebooted, the spam was gone.

The 510.39.01 BETA seems to be working well for me and I can confirm that the powerd dbus messages no longer spam.

Manual removal of the workaround (the service disable and status commands probably aren’t required, but it’s better to be thorough):

systemctl stop nvidia-fake-powerd.service 
systemctl disable nvidia-fake-powerd.service 
rm -vf /etc/dbus-1/system.d/nvidia-fake-powerd.conf
rm -vf /etc/systemd/system/nvidia-fake-powerd.service
systemctl daemon-reload
systemctl status nvidia-fake-powerd.service
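
To double-check afterwards that nothing was left behind, something like the sketch below may help. It only looks at the two paths removed above; the function name and the optional root prefix argument are made up for illustration, and a packaged install may have put files elsewhere, which this does not cover.

```shell
#!/bin/sh
# List workaround files that still exist, optionally under a root prefix.
# Only the two paths from the removal commands above are checked.
list_workaround_leftovers() {
    root="${1:-}"
    for f in \
        /etc/dbus-1/system.d/nvidia-fake-powerd.conf \
        /etc/systemd/system/nvidia-fake-powerd.service
    do
        # Print the path only if something still exists there.
        if [ -e "${root}${f}" ]; then
            printf '%s\n' "${root}${f}"
        fi
    done
}
```

Calling list_workaround_leftovers with no argument checks the live system and prints nothing when both files are gone.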

Hello, thanks for your solution.
I tried this on my Ubuntu 20.04 machine, and it may be working.
However, the issue I actually face is the error `Failed to establish dbus connection` when running `sim = gym.create_sim(...)`.
Your solution does not help with that: the error mentioned in this post still occurs, and even a simple PyQt program may fail.
Could you please take a look and see how to fix the issue in this post?

Please read the entire thread; there are many suggestions, including validating that the service is running and monitoring the system message bus for errors. It’s entirely possible that the software you’re trying to use calls a different bus name. If this is the case, it shouldn’t be difficult to adjust the workaround to accommodate.

sudo systemctl status nvidia-fake-powerd.service
sudo dbus-monitor --system
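
If the monitor output scrolls by too fast to read, counting the relevant lines can be easier. This is only a rough sketch: the helper name is invented, and it assumes the messages in question mention the name nvidia.powerd.server (the string quoted elsewhere in this thread). Pass a different name as the first argument if your software uses another bus name.

```shell
#!/bin/sh
# Count lines on stdin that mention a given bus name.
# The default name is an assumption; override it with the first argument.
count_bus_mentions() {
    pattern="${1:-nvidia.powerd.server}"
    # -c prints the number of matching lines, -F matches the name literally.
    # grep exits non-zero when the count is 0, so "|| true" keeps the status clean.
    grep -c -F -- "$pattern" || true
}
```

For example, `sudo timeout 10 dbus-monitor --system | count_bus_mentions` should print how many matching lines showed up in roughly ten seconds (timeout is the coreutils command); a count of 0 would suggest no spam under that name.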

This “workaround” is only useful for specific driver versions within the 495 series in cases where applications use OpenGL/EGL. A much better solution might be to downgrade to 470 or upgrade to the 510 beta.

Just upgraded to 510.47.03 (current production branch release as of 2022-02-01) and everything seems fine. I had experienced the bug on 495.46, and had reverted to the 470 series driver.

I’m on Slackware64-current (no systemd).

Hey, dzy201415!
I’m interested in applying the binary patch, but I’m using a different version: 495.29.05. (Unfortunately, I cannot switch to v510 either, since DaVinci Resolve does not work properly with it at the moment.)
Could you please at least point me in a general direction for creating such a patch? I already have some minimal experience patching binary files, but I need to know at least where to start looking, such as a function’s symbol or something similar.

P.S. I’m so sorry for necroposting on this one :o

If 510 is giving you issues, I’d personally recommend going back to the (still supported) 470 branch for now (currently 470.103.01, which was never affected by the dbus issue) rather than trying to binary patch an old new-feature branch that’s affected by known security vulnerabilities.

Yes, I would also suggest using the old one whenever possible. The binary patch was found by simply searching for references to the string “nvidia.powerd.server” in reverse-engineering tools (which could be in the procedure that generates the dbus message) and changing jz → jmp after that. It will not prevent the driver from trying to send the message, and might also cause unexpected behavior.
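
As a first step before opening a disassembler, you can check which file contains that string at all. The sketch below is just one way to do it with grep; the function name is made up, and the file to search is not named in this thread, so you have to supply the path yourself. It prints file offsets in decimal, which are not the same as the addresses a disassembler shows.

```shell
#!/bin/sh
# Print the byte offset (decimal) of every occurrence of a string in a file.
# -a: treat binary data as text, -b: print byte offsets,
# -o: report each match itself, -F: match the string literally.
find_string_offsets() {
    file="$1"
    needle="${2:-nvidia.powerd.server}"
    # "|| true" keeps the exit status clean when there is no match.
    grep -a -b -o -F -- "$needle" "$file" || true
}
```

Each output line has the form offset:string, and empty output means the string is not in that file. Finding the code that references the string, and the jz after it, still has to be done in the reverse-engineering tool.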

Has this been fixed by now?

Yes, in v510.

Running up-to-date Debian sid:

bhundven@mill:~$ dpkg -l linux-image-6.0.0-2-amd64
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                      Version      Architecture Description
ii  linux-image-6.0.0-2-amd64 6.0.5-1      amd64        Linux 6.0 for 64-bit PCs (signed)

I got the same dbus flooding running the latest 515.76. @user16974’s solution worked for me.

Thinkpad X1 Extreme 4th Gen:
NVIDIA Corporation GA104M [GeForce RTX 3080 Mobile / Max-Q 8GB/16GB] [10de:249c]

EDIT: Also thinkfan seems to be much more stable.
EDIT2: Added link.

I’m on Fedora 37 with the following specs, running driver version 520.56.06.

           /:-------------:\          hlriffel@raider-ge67hx
        :-------------------::        OS: Fedora 
      :-----------/shhOHbmp---:\      Kernel: x86_64 Linux 6.0.10-300.fc37.x86_64
    /-----------omMMMNNNMMD  ---:     Uptime: 17h 15m
   :-----------sMMMMNMNMP.    ---:    Packages: 2389
  :-----------:MMMdP-------    ---\   Shell: zsh 5.9
 ,------------:MMMd--------    ---:   Resolution: 3840x1080
 :------------:MMMd-------    .---:   DE: KDE 5.100.0 / Plasma 5.26.4
 :----    oNMMMMMMMMMNho     .----:   WM: KWin
 :--     .+shhhMMMmhhy++   .------/   GTK Theme: Orchis-Dark [GTK3]
 :-    -------:MMMd--------------:    Disk: 181G / 3.0T (7%)
 :-   --------/MMMd-------------;     CPU: 12th Gen Intel Core i9-12900HX @ 24x 4.9GHz [49.0°C]
 :-    ------/hMMMy------------:      GPU: Mesa Intel(R) UHD Graphics (ADL-S GT1)
 :-- :dMNdhhdNMMNo------------;       RAM: 7214MiB / 31795MiB

I’m also still getting the DBus flooding. I’m gonna use @user16974’s workaround for now until a fix is shipped for this regression.