When we downgrade to 545.23.08, everything starts working again. Both PCs have NVIDIA RTX A6000 cards, which appear to still be supported. Other systems we have here with different NVIDIA GPUs are not affected; only machines with one or more NVIDIA RTX A6000 cards are. On one of the affected machines I ran nvidia-bug-report.sh under both 545.23.08 and 550.54.14. Running it under 550.54.14 crashed the machine and it restarted, but it appears to have produced a partial log before the crash. nvidia-bug-report-545.23.08.log.gz (604.8 KB) nvidia-bug-report-550.54.14.log.gz (172.7 KB)
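For anyone who needs to check a fleet of machines for the combination described above, here is a minimal sketch, not an official tool. It assumes `nvidia-smi` is on the PATH and that the failing combination is exactly an RTX A6000 on driver 550.54.14, as reported in this post. The function names are made up for illustration.

```python
import subprocess

# Assumption: the driver version and GPU model reported as failing in this thread.
AFFECTED_DRIVER = "550.54.14"
AFFECTED_GPU = "RTX A6000"


def parse_gpu_list(csv_text):
    """Parse 'name, driver_version' CSV rows into (name, driver) tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        if "," not in line:
            continue  # skip blank or malformed lines
        # Split on the last comma so a GPU name containing commas stays intact.
        name, driver = (field.strip() for field in line.rsplit(",", 1))
        rows.append((name, driver))
    return rows


def affected_gpus(rows):
    """Return the rows that match the GPU/driver combination reported here."""
    return [(n, d) for n, d in rows if AFFECTED_GPU in n and d == AFFECTED_DRIVER]


def query_gpus():
    """Ask nvidia-smi for the name and driver version of every GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_list(result.stdout)
```

On a machine with the driver loaded, `affected_gpus(query_gpus())` returns a non-empty list if any card matches. This only identifies the hardware and driver pairing; it does not detect the crash itself.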
I can also attest to issues with the latest drivers and A6000s. The problems show up on machines with both Intel and AMD processors; we tested both, and both use A6000s.
In addition to the original poster’s issues, we’re also seeing problems extending into the OS display: icons not showing up and words only half rendered. In KDE Plasma with Wayland, which has implemented sync, we see constant backtracking and flickering in many apps. nvidia-bug-report.log.gz (1.0 MB) ← this report was taken under X11, which has fewer flickering issues but still shows the main driver issues mentioned by the original poster.
Is there any ETA for a fix in the 550 driver?
We’re currently in a really bad situation with these enterprise cards, since the 535 driver only supports CUDA 12.2 and our distribution ships CUDA 12.4 by default.
The 545 driver (the last working one) currently lacks support for the 6.8 kernel and needs to be patched.
This really should be fixed as quickly as possible, since these cards are mainly intended for CUDA work.