For the last week or two (since the 550.90.07 update) I’ve noticed that one of my GPUs will lock up. I’ve tried to reset it with nvidia-smi, but because the X session is still running fine on the other GPU, it won’t reset the selected “frozen” GPU, so I’m forced to reboot.
[ 5.813655] nvidia-gpu 0000:26:00.3: i2c timeout error e0000000
[ 5.813661] ucsi_ccg 1-0008: i2c_transfer failed -110
[ 5.813664] ucsi_ccg 1-0008: ucsi_ccg_init failed - -110
[ 5.813668] ucsi_ccg 1-0008: probe with driver ucsi_ccg failed with error -110
[14489.017492] NVRM: GPU at PCI:0000:25:00: GPU-2dbe0533-7ed3-d4b8-c4dd-be5ae7f5fe53
[14489.017497] NVRM: Xid (PCI:0000:25:00): 32, pid='<unknown>', name=<unknown>, Channel ID 0000003e intr 00040000
[15964.584999] NVRM: Xid (PCI:0000:25:00): 32, pid='<unknown>', name=<unknown>, Channel ID 0000003e intr 00040000
[54367.958000] NVRM: Xid (PCI:0000:25:00): 11, pid='<unknown>', name=<unknown>, Ch 00000010 Cl 0000c197 Off 0000029c Data 00001011
As an addendum: after rolling back to 550.78 the GPU has the same issue (as it has for at least a year), but it can recover, whereas with 550.90.07, if it times out or freezes, recovery never happens.
Testing 555.58-2 ATM.
With 555.58 I no longer have the timeouts or the lockup/recovery of 550.78, but everything drops frames and is herky-jerky, from video playback in VLC to games.
0 GTX 1060 ← drivers for the last year have seen this GPU lock up / recover a few times a day. 550.90 saw this GPU lock up with no recovery. With 555.58 the GPU no longer locks up, but performance on both GPUs is degraded.
1 GTX 1660
The 550.78 “hiccups” spit out:
kernel: NVRM: Xid (PCI:0000:26:00): 69, pid=‘’, name=, Class Error: ChId 0033, Class 0000c597, Offset 000017d8, Data 00000021, ErrorCode 0000000c
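For anyone comparing driver versions, it may help to count how often each Xid code shows up per GPU rather than eyeballing the log. The following is only a minimal sketch: the `tally_xids` helper and the regular expression are assumptions based on the line format quoted in this thread, not anything provided by the NVIDIA driver or its tools.

```python
import re
from collections import Counter

# Assumed line shape, taken from the excerpts in this thread:
#   NVRM: Xid (PCI:0000:25:00): 32, pid='<unknown>', ...
XID_RE = re.compile(r"NVRM: Xid \(PCI:([0-9a-fA-F:.]+)\): (\d+),")


def tally_xids(log_text):
    """Count Xid events per (PCI address, Xid code) in kernel log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = XID_RE.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


# Sample lines copied from the kernel log quoted earlier in the thread.
SAMPLE = """\
[14489.017497] NVRM: Xid (PCI:0000:25:00): 32, pid='<unknown>', name=<unknown>, Channel ID 0000003e intr 00040000
[15964.584999] NVRM: Xid (PCI:0000:25:00): 32, pid='<unknown>', name=<unknown>, Channel ID 0000003e intr 00040000
[54367.958000] NVRM: Xid (PCI:0000:25:00): 11, pid='<unknown>', name=<unknown>, Ch 00000010 Cl 0000c197 Off 0000029c Data 00001011
"""

for (pci, xid), count in sorted(tally_xids(SAMPLE).items()):
    print(f"{pci}  Xid {xid:>3}  x{count}")
```

To run it against a real log, replace `SAMPLE` with the text of `dmesg` or `journalctl -k` output captured on each driver version, then compare the tallies.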
I have the Timeouts too.
System:
Kernel: 6.9.6-zen1-1-zen arch: x86_64 bits: 64 compiler: gcc v: 14.1.1
clocksource: tsc avail: hpet,acpi_pm
parameters: BOOT_IMAGE=/@/boot/vmlinuz-linux-zen
root=UUID=9bdc1bb8-710e-4a2b-a1fc-d771d79d9fb1 rw rootflags=subvol=@
quiet resume=UUID=3de67b4e-0081-4321-8b87-d6348302d4da loglevel=3 ibt=off
Desktop: Cinnamon v: 6.2.2 tk: GTK v: 3.24.42 wm: Muffin v: 6.2.0 tools:
avail: cinnamon-screensaver,xautolock vt: 7 dm: LightDM v: 1.32.0
Distro: Garuda base: Arch Linux
Machine:
Type: Desktop Mobo: ASUSTeK model: CROSSHAIR VI HERO v: Rev 1.xx
serial: <superuser required> part-nu: SKU UEFI: American Megatrends v: 7704
date: 12/16/2019
CPU:
Info: model: AMD Ryzen 9 3900X bits: 64 type: MT MCP arch: Zen 2 gen: 3
level: v3 note: check built: 2020-22 process: TSMC n7 (7nm)
family: 0x17 (23) model-id: 0x71 (113) stepping: 0 microcode: 0x8701013
Topology: cpus: 1x cores: 12 tpc: 2 threads: 24 smt: enabled cache:
L1: 768 KiB desc: d-12x32 KiB; i-12x32 KiB L2: 6 MiB desc: 12x512 KiB
L3: 64 MiB desc: 4x16 MiB
Speed (MHz): avg: 2265 high: 3767 min/max: 2200/4672 boost: enabled
scaling: driver: acpi-cpufreq governor: schedutil cores: 1: 2190 2: 3767
3: 1987 4: 2200 5: 2223 6: 2200 7: 2200 8: 2200 9: 2200 10: 2200 11: 2200
12: 2196 13: 1989 14: 2458 15: 2025 16: 2200 17: 2200 18: 2200 19: 2200
20: 2200 21: 2200 22: 2200 23: 2540 24: 2200 bogomips: 182404
Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
Vulnerabilities: <filter>
Graphics:
Device-1: NVIDIA GP104 [GeForce GTX 1070] vendor: ASUSTeK driver: nvidia
v: 550.90.07 alternate: nouveau,nvidia_drm non-free: 545.xx+ status: current
(as of 2024-06; EOL~2026-12-xx) arch: Pascal code: GP10x
process: TSMC 16nm built: 2016-2021 pcie: gen: 3 speed: 8 GT/s lanes: 8
link-max: lanes: 16 ports: active: none off: DVI-D-1,HDMI-A-1,HDMI-A-2
empty: DP-1,DP-2 bus-ID: 0b:00.0 chip-ID: 10de:1b81 class-ID: 0300
Device-2: NVIDIA GP104 [GeForce GTX 1070] vendor: CardExpert
driver: nvidia v: 550.90.07 alternate: nouveau,nvidia_drm non-free: 545.xx+
status: current (as of 2024-06; EOL~2026-12-xx) arch: Pascal code: GP10x
process: TSMC 16nm built: 2016-2021 pcie: gen: 1 speed: 2.5 GT/s lanes: 8
link-max: gen: 3 speed: 8 GT/s lanes: 16 ports: active: none empty: DP-3,
DP-4, DP-5, DVI-D-2, HDMI-A-3 bus-ID: 0c:00.0 chip-ID: 10de:1b81
class-ID: 0300
Device-3: Logitech StreamCam
driver: hid-generic,snd-usb-audio,usbhid,uvcvideo type: USB rev: 3.2
speed: 5 Gb/s lanes: 1 mode: 3.2 gen-1x1 bus-ID: 4-1:2 chip-ID: 046d:0893
class-ID: 0300 serial: <filter>
Display: x11 server: X.Org v: 21.1.13 with: Xwayland v: 24.1.0 driver: X:
loaded: nvidia gpu: nvidia,nvidia-nvswitch display-ID: :0 screens: 1
Screen-1: 0 s-res: 5760x1080 s-dpi: 92 s-size: 1590x301mm (62.60x11.85")
s-diag: 1618mm (63.71")
Monitor-1: DVI-D-1 mapped: DVI-D-0 note: disabled pos: primary,center
model: BenQ GL2450H serial: <filter> built: 2015 res: 1920x1080 hz: 60
dpi: 92 gamma: 1.2 size: 531x298mm (20.91x11.73") diag: 609mm (24")
ratio: 16:9 modes: max: 1920x1080 min: 640x480
Monitor-2: HDMI-A-1 mapped: HDMI-0 note: disabled pos: right
model: Dell SE2422H serial: <filter> built: 2022 res: 1920x1080 hz: 60
dpi: 93 gamma: 1.2 size: 527x296mm (20.75x11.65") diag: 604mm (23.8")
ratio: 16:9 modes: max: 1920x1080 min: 640x480
Monitor-3: HDMI-A-2 mapped: HDMI-1 note: disabled pos: left
model: Dell SE2422H serial: <filter> built: 2022 res: 1920x1080 hz: 60
dpi: 93 gamma: 1.2 size: 527x296mm (20.75x11.65") diag: 604mm (23.8")
ratio: 16:9 modes: max: 1920x1080 min: 640x480
API: EGL v: 1.5 hw: drv: nvidia platforms: device: 0 drv: nvidia device: 1
drv: nvidia device: 4 drv: swrast gbm: drv: nvidia surfaceless: drv: nvidia
x11: drv: nvidia inactive: wayland,device-2,device-3
API: OpenGL v: 4.6.0 compat-v: 4.5 vendor: nvidia mesa v: 550.90.07
glx-v: 1.4 direct-render: yes renderer: NVIDIA GeForce GTX 1070/PCIe/SSE2
memory: 7.81 GiB
API: Vulkan v: 1.3.279 layers: 8 device: 0 type: discrete-gpu
name: NVIDIA GeForce GTX 1070 driver: nvidia v: 550.90.07
device-ID: 10de:1b81 surfaces: xcb,xlib device: 1 type: discrete-gpu
name: NVIDIA GeForce GTX 1070 driver: nvidia v: 550.90.07
device-ID: 10de:1b81 surfaces: N/A
Audio:
Device-1: NVIDIA GP104 High Definition Audio vendor: ASUSTeK
driver: snd_hda_intel v: kernel pcie: gen: 3 speed: 8 GT/s lanes: 8
link-max: lanes: 16 bus-ID: 0b:00.1 chip-ID: 10de:10f0 class-ID: 0403
Device-2: NVIDIA GP104 High Definition Audio vendor: CardExpert
driver: snd_hda_intel v: kernel pcie: gen: 3 speed: 8 GT/s lanes: 8
link-max: lanes: 16 bus-ID: 0c:00.1 chip-ID: 10de:10f0 class-ID: 0403
Device-3: AMD Starship/Matisse HD Audio vendor: ASUSTeK
driver: snd_hda_intel v: kernel pcie: gen: 4 speed: 16 GT/s lanes: 16
bus-ID: 0e:00.4 chip-ID: 1022:1487 class-ID: 0403
Device-4: Logitech StreamCam
driver: hid-generic,snd-usb-audio,usbhid,uvcvideo type: USB rev: 3.2
speed: 5 Gb/s lanes: 1 mode: 3.2 gen-1x1 bus-ID: 4-1:2 chip-ID: 046d:0893
class-ID: 0300 serial: <filter>
API: ALSA v: k6.9.6-zen1-1-zen status: kernel-api with: aoss
type: oss-emulator tools: N/A
Server-1: sndiod v: N/A status: off tools: aucat,midicat,sndioctl
Server-2: PipeWire v: 1.0.7 status: active with: 1: pipewire-pulse
status: active 2: wireplumber status: active 3: pipewire-alsa type: plugin
4: pw-jack type: plugin tools: pactl,pw-cat,pw-cli,wpctl
Network:
Device-1: Intel I211 Gigabit Network vendor: ASUSTeK driver: igb v: kernel
pcie: gen: 1 speed: 2.5 GT/s lanes: 1 port: e000 bus-ID: 05:00.0
chip-ID: 8086:1539 class-ID: 0200
IF: enp5s0 state: up speed: 1000 Mbps duplex: full mac: <filter>
IF-ID-1: virbr0 state: down mac: <filter>
Info: services: NetworkManager,systemd-timesyncd
Drives:
Local Storage: total: 6.37 TiB used: 727.85 GiB (11.2%)
SMART Message: Unable to run smartctl. Root privileges required.
ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Samsung model: SSD 980 PRO with
Heatsink 1TB size: 931.51 GiB block-size: physical: 512 B logical: 512 B
speed: 63.2 Gb/s lanes: 4 tech: SSD serial: <filter> fw-rev: 4B2QGXA7
temp: 36.9 C scheme: GPT
ID-2: /dev/sda maj-min: 8:0 vendor: Samsung model: SSD 870 QVO 1TB
size: 931.51 GiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s
tech: SSD serial: <filter> fw-rev: 2B6Q scheme: GPT
ID-3: /dev/sdb maj-min: 8:16 vendor: Hitachi model: HUA723020ALA641
size: 1.82 TiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s
tech: HDD rpm: 7200 serial: <filter> fw-rev: A840 scheme: GPT
ID-4: /dev/sdc maj-min: 8:32 vendor: Western Digital
model: WD10EFRX-68PJCN0 size: 931.51 GiB block-size: physical: 4096 B
logical: 512 B speed: 6.0 Gb/s tech: HDD rpm: 5400 serial: <filter>
fw-rev: 0A82 scheme: MBR
ID-5: /dev/sdd maj-min: 8:48 vendor: Western Digital
model: WD10EADS-00M2B0 size: 931.51 GiB block-size: physical: 512 B
logical: 512 B speed: 3.0 Gb/s tech: N/A serial: <filter> fw-rev: 0A01
scheme: MBR
ID-6: /dev/sde maj-min: 8:64 vendor: Samsung model: SSD 860 EVO 1TB
size: 931.51 GiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s
tech: SSD serial: <filter> fw-rev: 1B6Q scheme: GPT
Partition:
ID-1: / raw-size: 896.82 GiB size: 896.82 GiB (100.00%)
used: 727.85 GiB (81.2%) fs: btrfs dev: /dev/sda2 maj-min: 8:2
ID-2: /boot/efi raw-size: 300 MiB size: 299.4 MiB (99.80%)
used: 584 KiB (0.2%) fs: vfat dev: /dev/sda1 maj-min: 8:1
ID-3: /home raw-size: 896.82 GiB size: 896.82 GiB (100.00%)
used: 727.85 GiB (81.2%) fs: btrfs dev: /dev/sda2 maj-min: 8:2
ID-4: /var/log raw-size: 896.82 GiB size: 896.82 GiB (100.00%)
used: 727.85 GiB (81.2%) fs: btrfs dev: /dev/sda2 maj-min: 8:2
ID-5: /var/tmp raw-size: 896.82 GiB size: 896.82 GiB (100.00%)
used: 727.85 GiB (81.2%) fs: btrfs dev: /dev/sda2 maj-min: 8:2
Swap:
Kernel: swappiness: 133 (default 60) cache-pressure: 100 (default) zswap: no
ID-1: swap-1 type: zram size: 31.27 GiB used: 0 KiB (0.0%) priority: 100
comp: zstd avail: lzo,lzo-rle,lz4,lz4hc,842 max-streams: 24 dev: /dev/zram0
ID-2: swap-2 type: partition size: 34.39 GiB used: 0 KiB (0.0%)
priority: -2 dev: /dev/sda3 maj-min: 8:3
Sensors:
System Temperatures: cpu: 62.0 C mobo: 27.0 C gpu: nvidia temp: 55 C
Fan Speeds (rpm): cpu: 1371 case-1: 0 case-2: 1157 case-3: 0 gpu: nvidia
fan: 0%
Power: 12v: 12.23 5v: N/A 3.3v: N/A vbat: 3.21
Info:
Memory: total: 32 GiB available: 31.27 GiB used: 14.23 GiB (45.5%)
Processes: 581 Power: uptime: 1h 43m states: freeze,mem,disk suspend: deep
avail: s2idle wakeups: 0 hibernate: platform avail: shutdown, reboot,
suspend, test_resume image: 12.46 GiB services: csd-power,upowerd
Init: systemd v: 256 default: graphical tool: systemctl
Packages: pm: pacman pkgs: 2141 libs: 618 tools: paru,yay Compilers:
clang: 17.0.6 gcc: 14.1.1 Shell: garuda-inxi default: fish v: 3.7.1
running-in: gnome-terminal inxi: 3.3.35
Garuda (2.6.26-1):
System install date: 2024-01-20
Last full system update: 2024-06-26
Is partially upgraded: Yes
Relevant software: snapper NetworkManager dracut nvidia-dkms
Windows dual boot: Probably (Run as root to verify)
Failed units: vmware.service
nvidia-bug-report.log.gz (3.7 MB)
For the last year or so every time I get the hang/timeout it’s on my 1060. Today for the first time ever my 1660 was the one that hung while the 1060 kept going…off a fresh boot no less lol.
Jul 05 07:40:04 kernel: NVRM: GPU at PCI:0000:26:00: GPU-c1292f17-fa5f-a43c-1f12-c6036dd190cc
Jul 05 07:40:04 kernel: NVRM: Xid (PCI:0000:26:00): 32, pid=‘’, name=, Channel ID 00000010 intr0 00040000
Jul 05 07:40:04 kernel: NVRM: Xid (PCI:0000:26:00): 32, pid=‘’, name=, Channel ID 00000010 intr0 00040000
While I’m back on the 550.78 (and I’m sure they won’t care about looking into this any longer) these are the only drivers that semi work right now.
What is concerning is that normally on 550.78 the GPU that hangs will recover in, say, 10-20 seconds. My 1660 did not and was completely hung, like on 550.90.07. Meanwhile I can’t use 555.58-2 because it makes my GPUs stutter, making things very unpleasant if not unusable.
nvidia-bug-report.log.gz (7.1 MB)
19topgun93 I’m not sure if it will help you but you could try blacklisting the nvidia i2c driver. I am trying that though it’s not stopping the hangs.
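# "sudo echo ... > file" fails because the redirect runs as your own user; pipe into tee instead (substitute run0 for sudo if that is what you use)
echo 'blacklist i2c_nvidia_gpu' | sudo tee /etc/modprobe.d/blacklist_i2c-nvidia-gpu.conf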
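A blacklist entry only stops modprobe from auto-loading the module through its aliases, so it is worth confirming after a reboot that the module really is gone. If it is also packed into the initramfs, the image may need regenerating first. Here is a rough sketch of such a check: the helper names are made up for illustration, and the only assumptions are the standard `/proc/modules` layout (module name in the first column) and the `blacklist <module>` line format.

```python
from pathlib import Path


def normalize(name):
    # Module names treat '-' and '_' as interchangeable.
    return name.strip().replace("-", "_")


def loaded_modules(proc_modules_text):
    """Return the module names listed in /proc/modules-style text."""
    return {
        normalize(line.split()[0])
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def blacklisted_modules(conf_text):
    """Return module names from 'blacklist <module>' lines, ignoring comments."""
    names = set()
    for line in conf_text.splitlines():
        parts = line.split("#", 1)[0].split()
        if len(parts) == 2 and parts[0] == "blacklist":
            names.add(normalize(parts[1]))
    return names


def still_loaded(proc_modules_text, conf_text):
    """Blacklisted modules that are nevertheless currently loaded."""
    return loaded_modules(proc_modules_text) & blacklisted_modules(conf_text)


if __name__ == "__main__":
    # Path taken from the command in this thread; adjust if yours differs.
    conf = Path("/etc/modprobe.d/blacklist_i2c-nvidia-gpu.conf")
    proc = Path("/proc/modules")
    if conf.exists() and proc.exists():
        print(still_loaded(proc.read_text(), conf.read_text()) or "blacklist is effective")
```

If `i2c_nvidia_gpu` still shows up, something is loading it explicitly or from the initramfs, and the blacklist file alone will not change that.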
I’m waiting for a new driver. I also had the DVI problem and hope it will be fixed. But I’m seriously considering switching to all AMD for my next setup; AMD on Linux is a lot better, I think. I want to use my second GPU for a Windows VM.
Not to clutter and derail but I’ll be moving back to AMD myself as well. If nVidia kept the GTX line going and took their drivers seriously I’d stay nVidia but neither seems to be the case these days. Too much money in ML for them to care about anything else.
The caveat with AMD’s drivers is they are broken by design for a multiGPU setup. Everything/one is pushing for Wayland which cripples proper multiGPU. So going AMD means xrandr “providers” and zero performance. OR you have to break each GPU into a separate machine and network things.
Back on topic sure we can wait for a new driver but this is nearly 2 years of broken drivers for me. Next one will be something else. The nVidia drivers feel like a 1940/50’s sitcom gag where the plumber fixes a leak to have it spring up elsewhere, so he sticks his finger on it while he thinks of what to patch it with, then another new leak. Pretty soon all his appendages are plugging something, the audience is laughing at the physical comedy and then it explodes -scene. Only for me I can’t afford to laugh off two GPU’s when replacing them these days costs as much as a car.
I’m thinking if they manage to get 555 working I will lock them and stay there even if there is a security issue…which there have been plenty of those lately. Seems that’s the rub. nVidia, insecure and works(ish) or secure to where not even the owner can use it.
Money is a problem for me too. But I got a second GTX 1070 cheap, second-hand. The original setup was a second-hand PC built for Win 10; the only new parts were the PSU, CPU and RAM.
I bought the second GPU for the VM, so this is a bit disappointing. I don’t want to use it as a multi-GPU setup for normal use. I assume GPU passthrough works better under Linux with AMD?
The reset on AMD is better for Looking Glass/passthrough. For me Wine does the trick, but I run 12 screens. I used to have more GPUs running, but my VESA mount died, so now I’m down to 8 screens and 2 GPUs heh. (and a pile o’ screens in the corner)
As a side note it appears we’re both Garuda users. Though my base is Garuda (XFCE) I run CTWM these days since GTK 3+ has broken by design XScreen enumeration.
Yeah, I’ve been on Garuda Linux for 7-8 months. My path was Kubuntu, Linux Mint, Garuda :D And now I want a VM under Linux because of my wheel, which I cannot use otherwise. It is a TMX from Thrustmaster. There is nothing else I would play on Windows at the moment :D
I’m an old BSD guy but I moved to Linux in the early 2000s for “convenience,” heh. Gentoo, Slack, eventually deb/Ubuntu…until the slow releases and GTK broke everything. (long story) Now most of the network is pure Arch, Garuda or Rhino.
Comically I remember when nVidia partnered with Gentoo on drivers and an Unreal port…and how it was nearly 5 years ago that I had stable drivers…
I was always on Windows; since 2019 I’ve been on Linux, dual-booting with Windows. But my Windows install is broken and cannot update xD
I serviced Win/Mac for years, even if I was OK with Windows before, Win 7 was the last version I’d ever consider as usable/functional. Sadly sounds like you are half in half out. It’s always kinda painful when someone has previously bought stuff (like your driving wheel), mixers, stream decks (other trendy BS) for Windows and when you transition you find out things don’t work because it’s all closed off. Well welcome to your journey of future buying things that are not locked down proprietary crap.
That said, it may work eventually. Back in my BeOS days it was almost unimaginable to game on an “altOS” but here we are…if only our GPUs would work. (Gotta keep it semi on topic and glaring at nVidia.) These days my audio interfaces and even proprietary DAW all run in *nix.
For reference, I got fed up with all the issues, including the little lockup/recoveries that I’ve been putting up with for nearly a year now, so I’ve gone all the way back to 525.147.05.
While everything is (gasp) actually working video-wise, the settings still suffer from an issue I’ve had zero responses or fixes for in a few years now, wherein they can’t parse their own settings syntax. When you save, they denote the GPU for the setting; when they load, they have no clue what that syntax means. While it might seem unrelated, it simply illustrates how long nVidia drivers/tools have been going downhill. Fixes just trade one problem for another these days, as illustrated earlier with the sitcom/plumber reference.
https://forums.developer.nvidia.com/t/nvidia-settings-rc-never-loaded/256989