[BUG] crash in libnvidia-glcore.so.387.34 thread

dinosaur · December 6, 2017, 2:41pm

I just got the following crash while running a Second Life viewer (with the NVIDIA drivers in threaded mode):

0   com.secondlife.indra.viewer	0x115cae1 LLAppViewerLinux::handleSyncCrashTrace() + 209
1   com.secondlife.indra.viewer	0x1929cee default_unix_signal_handler(int, siginfo_t*, void*) + 1198
2   unknown	0x7f60f5b84ed0 /lib64/libpthread.so.0(+0x11ed0) [0x7f60f5b84ed0]
3   unknown	0x7f60d5524fae /usr/lib64/libnvidia-glcore.so.387.34(+0xd46fae) [0x7f60d5524fae]
4   unknown	0x7f60d5543e19 /usr/lib64/libnvidia-glcore.so.387.34(+0xd65e19) [0x7f60d5543e19]
5   unknown	0x7f60d5666d36 /usr/lib64/libnvidia-glcore.so.387.34(+0xe88d36) [0x7f60d5666d36]
6   unknown	0x7f60d566ce5d /usr/lib64/libnvidia-glcore.so.387.34(+0xe8ee5d) [0x7f60d566ce5d]
7   unknown	0x7f60d69b950c /usr/lib64/libGLX_nvidia.so.0(+0xaf50c) [0x7f60d69b950c]

I’m also attaching the nvidia-bug-report.log file.
nvidia-bug-report.log.gz (219 KB)

dinosaur · December 9, 2017, 3:40pm

I just encountered the same bug again (exact same stack trace), something I never ran into with earlier NVIDIA driver versions in the decade I have been using SL viewers… I think I’m going to downgrade to v384…

ahuillet · December 11, 2017, 6:19pm

Can you reproduce the problem reliably? If so, can you provide the exact steps to follow for us to observe it?
As a workaround, can you try setting __GL_THREADED_OPTIMIZATIONS=0?
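
For reference, a minimal sketch of applying that workaround from a small Python launcher, assuming the viewer is started from a script. The variable name comes from the line above; the helper names and the viewer path in the comment are hypothetical.

```python
import os
import subprocess


def env_without_threaded_optimizations(base_env=None):
    """Return a copy of the environment with the workaround variable set."""
    env = dict(os.environ if base_env is None else base_env)
    # Variable name taken from the post above; "0" disables the feature.
    env["__GL_THREADED_OPTIMIZATIONS"] = "0"
    return env


def launch(command):
    """Run `command` (a list of arguments) with the workaround applied."""
    return subprocess.call(command, env=env_without_threaded_optimizations())


# Hypothetical usage: launch(["./viewer"])
```

Exporting the variable in the shell before starting the viewer should have the same effect; the wrapper only keeps the setting local to that one process.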

dinosaur · December 11, 2017, 6:52pm

Alas, I did not find any common ground between the two crashes I got, and consequently could not infer a way to reproduce this bug (otherwise I’d already have reported it, together with a gdb session log).

No way! I’m not going to lose 30% of my frame rate… I simply downgraded to the v384 driver for now…

ahuillet · December 12, 2017, 2:50pm

If there is a bug, we would like to fix it, so it would be nice if you could provide us with more information.
Ideally, a reliable way to reproduce the problem would be great. Failing that,
this frame:
2 unknown 0x7f60f5b84ed0 /lib64/libpthread.so.0(+0x11ed0) [0x7f60f5b84ed0]
is unexpected. What’s your glibc version? Can you provide the output of nm -D /lib64/libpthread.so.0 and, if possible, install a glibc with debug symbols and run addr2line -e /lib64/libpthread.so.0 0x11ed0?
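
As an illustration of where the 0x11ed0 argument comes from, here is a rough sketch that pulls the module path and offset out of a backtrace line of the form shown above and builds the matching addr2line invocation. The regular expression and function name are assumptions made for this example, and the result is only meaningful when the library has debug symbols installed.

```python
import re

# Matches the "/path/to/module(+0xOFFSET) [0xADDRESS]" part of a backtrace line.
FRAME_RE = re.compile(
    r"(?P<module>/\S+?)\(\+(?P<offset>0x[0-9a-fA-F]+)\)\s+\[(?P<address>0x[0-9a-fA-F]+)\]"
)


def addr2line_command(frame_line):
    """Return the addr2line argument list for a frame, or None if it does not match."""
    match = FRAME_RE.search(frame_line)
    if match is None:
        return None
    # -f also prints the function name, -e selects the file to inspect.
    return ["addr2line", "-f", "-e", match.group("module"), match.group("offset")]


frame = "2 unknown 0x7f60f5b84ed0 /lib64/libpthread.so.0(+0x11ed0) [0x7f60f5b84ed0]"
print(addr2line_command(frame))
```

The offset in parentheses is relative to the start of the shared object, which is why it is passed to addr2line rather than the absolute address in square brackets.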

In addition, is there any way to disable SecondLife’s SIGSEGV handler (I’m assuming it’s a SIGSEGV)? Can you please run under gdb, catching the crash when it happens? I’m interested in the stack trace (to see if it’s similar), but more importantly in the register values (“info registers” in gdb) at the crash point.

Thanks

EDIT: remove useless instruction, and fix broken sentence

dinosaur · December 12, 2017, 5:08pm

I always provide all the information I can gather, however you must understand that I cannot afford to lose time on such issues. What matters for me is that I don’t get a crash, so I’d rather (and actually did) downgrade to the last (proven) stable driver version than run hours-long sessions under gdb in the hope of getting the crash to reproduce under it…

Indeed… But like I explained, I did not find any common ground for the two crashes I got and since it can take hours for them to occur in a session (and they won’t even occur in every hours-long session), it could take months before I would figure out what situation, action, or 3D object is causing them…

Your best bet is to look at your sources and, based on the stack trace I provided, find what part of your code is racy, or fails to lock or test whatever mutex… Since the bug occurred only in v387.34 for me (and I’m running every release or beta driver, whichever was last released), I would expect it to have been introduced in a recent commit…

I’m running glibc v2.26. My Linux distro is PCLinuxOS (a rolling release distro and I keep my system up to date with it).

Sure thing. I’m attaching the result to this message.

Alas, PCLinuxOS does not provide debug symbols packages… :-/

Nope, not without modifying the sources and recompiling the viewer…

I could run the viewer under gdb (with the viewer debug symbols loaded), but since the crash happens in a thread initiated from your driver (and without the debug symbols for pthread & glibc), it would provide no additional info, I’m afraid…

phtread_dyn_symbols.txt (12.1 KB)

ahuillet · December 13, 2017, 12:46pm

Please clarify: you see the problem in 387.34, but not in 387.12?
If so, that does narrow down the range on our side indeed.

Yes, running under gdb and catching the crash would actually provide a little more information. (You don’t actually need to disable the signal handler of the application, gdb will see the signal first.) That information is the state of the registers at the crash point.
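
A minimal sketch of such a session, assuming gdb is installed and the viewer can be started directly from a command line. It builds a batch-mode gdb invocation that runs the program and, once gdb stops on the signal, prints the registers and backtraces. The function name and the viewer path are hypothetical; the gdb options and commands used (-batch, -ex, --args, info registers, bt, thread apply all bt) are standard ones.

```python
def gdb_batch_command(program, program_args=()):
    """Build a gdb command line that dumps registers and backtraces at the crash."""
    commands = [
        "set pagination off",   # do not pause long output
        "run",                  # gdb stops here when the signal is raised
        "info registers",       # register state of the faulting thread
        "bt",                   # backtrace of the faulting thread
        "thread apply all bt",  # backtraces of every thread
    ]
    argv = ["gdb", "-batch"]
    for command in commands:
        argv += ["-ex", command]
    argv += ["--args", program, *program_args]
    return argv


# Hypothetical usage: subprocess.call(gdb_batch_command("./viewer"))
print(" ".join(gdb_batch_command("./viewer")))
```

Because gdb intercepts the signal before it is delivered to the process, the registers printed this way reflect the faulting instruction rather than the state after the application's own handler has run.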

dinosaur · December 13, 2017, 5:58pm

Correct…

BUT: I did not run v387.12 for a very long time (perhaps a week or two, IIRC) and since it is a bug that does not show up in every session, I may just have had some good luck…
On the other hand, I ran the various (beta and stable) releases of v384 for a long time (and am running it again right now) and know for sure it is bug-free in this respect.

AFAIK, the viewer crash handler will kick in in the main thread and the registers shown by gdb after the viewer stops will be unrelated to the crash (their values will have been altered by the crash handler, which is actually designed to dump the crash log, cleanly shut down the viewer and just exit the program with a non-zero value)…

If you know a way to do that under gdb while the viewer crash handler is active, let me know and I might give it a try, time permitting.
