Xorg (1.19-5, 1.20.1-5.3, 1.20.4-7) segfaults but does not hang; user is returned to the login screen

Hello,

I have users who have been reporting for some time that their workstations unexpectedly log them out, returning them to the gdm login screen. I believe this is related to a previously reported bug in which Xorg does not handle animated cursors properly and something with the AnimCurNotifyTimer causes the Xorg server to crash. I filed a support case with Red Hat saying as much. After some back and forth, Red Hat directed me to contact NVIDIA because they do not have the means to debug the NVIDIA drivers and as such cannot read the full backtrace of the Xorg crash.

The issue occurs semi-regularly, but I have been unable to reproduce it on demand so far. The crash has occurred on several different display setups, but all of them run RHEL 7 and KDE in a multihead configuration with each head set up as a separate X screen (e.g. 3 heads: :0.0, :0.1, :0.2), with as many as 8 heads running across 2 NVS 510s.

Unfortunately, I am not able to submit a full nvidia-bug-report.log.gz, as I have to strip any sensitive data from anything I submit before uploading it to a support ticket. If there are specific outputs/logs that are needed, I can likely get them sanitized and uploaded here (as I’ve done so far with my Red Hat support case). Hopefully this is not a problem.
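
A minimal sketch of what such a sanitizing pass over a log excerpt could look like, assuming the sensitive data is limited to IPv4 addresses, MAC addresses, and a known list of hostnames or usernames. The function name, the placeholder tokens, and the patterns are hypothetical illustrations, not the procedure actually used for these logs:

```python
import re

# Hypothetical patterns: adjust to whatever counts as sensitive on your site.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
MAC_RE = re.compile(r"\b(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}\b")


def sanitize(text, secrets=()):
    """Return text with MAC/IPv4 addresses and listed secrets redacted.

    `secrets` is an iterable of literal strings (hostnames, usernames, ...)
    that are matched case-insensitively.
    """
    text = MAC_RE.sub("<MAC>", text)
    # Note: this also redacts any four-part dotted number, e.g. a version
    # string such as 1.2.3.4, so review the output before uploading.
    text = IPV4_RE.sub("<IP>", text)
    # Replace longer secrets first so a short secret cannot split a long one.
    for secret in sorted(secrets, key=len, reverse=True):
        if secret:  # skip empty strings, which would match everywhere
            text = re.sub(re.escape(secret), "<REDACTED>", text,
                          flags=re.IGNORECASE)
    return text


if __name__ == "__main__":
    sample = "host ws01 at 10.1.2.3 mac 00:1A:2b:3c:4D:5e"
    print(sanitize(sample, ["ws01"]))
```

Three-part version strings such as the Xorg version 1.20.4 are left untouched by the IPv4 pattern, so the server and driver versions that matter for debugging survive the pass.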

Thanks,
Nate Calonder
xorg_backtrace.txt (1.02 KB)

Which driver version are you running?

Currently running Driver Version 410.78 installed via the kmod RPM.

Did you check if the driver series v390 also exhibits that bug, i.e. if it is a regression? Also, did you try with a current v440 driver?

Yes, the bug was experienced on driver versions 390.25 and 390.48. It is not simple to try a current driver, as the users who experience this issue have been unable to explain how to replicate it. Attempts to replicate the bug in a test environment have been unsuccessful. The user base will receive version 430 of the driver in January, but not v440. If I could figure out how to force the bug to occur, I could test against v440 immediately, though.

I will get the v430 driver onto one of the workstations this week, ahead of the normal schedule. If there are any ideas, or if I can get any specific logs or command outputs to help troubleshoot this issue, please let me know. Red Hat support said they would need NVIDIA's analysis and supporting data if Xorg is believed to be the issue, because they (Red Hat) cannot debug the NVIDIA graphics libraries. Thanks!