Java crashes when the debugger library is loaded

Dear forum,

I just discovered the Linux Graphics Debugger and was quite stoked to try it out, but my OpenGL-using program is written in Java, and it appears that the JVM predictably crashes whenever it is run with the debugger library loaded.

The reason for the crash seems to be that code inside the debugger library calls a NULL function pointer – this is the stacktrace at the time of the crash:

#6  <signal handler called>
#7  0x0000000000000000 in ?? ()
#8  0x00007ffff716a052 in ?? () from /home/fredrik/.tgd/libs/libNvidia_gfx_debugger.so
#9  0x00007ffff716abdd in ?? () from /home/fredrik/.tgd/libs/libNvidia_gfx_debugger.so
#10 0x00007ffff7165a6a in ?? () from /home/fredrik/.tgd/libs/libNvidia_gfx_debugger.so
#11 0x00007ffff6abd431 in ?? () from /home/fredrik/.tgd/libs/libNvidia_gfx_debugger.so
#12 0x00007ffff61f8c09 in ?? () from /home/fredrik/.tgd/libs/libNvidia_gfx_debugger.so
#13 0x00007ffff6c4428b in ?? () from /home/fredrik/.tgd/libs/libNvidia_gfx_debugger.so
#14 0x00007ffff6c3dc92 in ?? () from /home/fredrik/.tgd/libs/libNvidia_gfx_debugger.so
#15 0x00007ffff6c49c27 in ?? () from /home/fredrik/.tgd/libs/libNvidia_gfx_debugger.so
#16 0x00007ffff6c50013 in ?? () from /home/fredrik/.tgd/libs/libNvidia_gfx_debugger.so

(The base load address of the library in this case was 0x7ffff5ef9000.)
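Since the library has no symbols, the frames above are easier to compare across runs as offsets from the base load address. Here is a minimal sketch of that subtraction. The helper names `module_offset` and `print_frame_offset` are made up for illustration and are not part of any NVIDIA tool.

```c
#include <stdint.h>
#include <stdio.h>

/* Convert an absolute backtrace address into an offset relative to the
 * library's base load address. Returns 0 if the address lies below the
 * base, i.e. it cannot belong to this mapping. */
uint64_t module_offset(uint64_t frame_addr, uint64_t base_addr)
{
    return frame_addr >= base_addr ? frame_addr - base_addr : 0;
}

/* Print one frame in "library+0xoffset" form (hypothetical helper). */
void print_frame_offset(const char *lib, uint64_t frame_addr, uint64_t base_addr)
{
    printf("%s+0x%llx\n", lib,
           (unsigned long long)module_offset(frame_addr, base_addr));
}
```

For frame #8 above, module_offset(0x00007ffff716a052, 0x7ffff5ef9000) gives 0x1271052, which should stay the same from run to run even when the load address changes.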

Of course, it’s quite difficult for me to debug this further myself, since the debugger library lacks symbols and all.

Is there anything I can do to further the debugging of this problem, or will I have to give up on profiling Java programs?

If it turns out that I do have to give up, I figure that all I really need from the Debugger is the access it provides to the performance counters listed at <http://docs.nvidia.com/linux-graphics-debugger/content/developertools/desktop/linux_graphics_debugger/lgd_perf_counters.htm>, so I wonder how these can be accessed without the help of the Debugger program.

For the record, I have now also tried this with version 2 of the debugger, with similar, though not identical, results. This time, the debugger library appears to crash in an strlen() call:

#0  strlen () at ../sysdeps/x86_64/strlen.S:137
#1  0x00007ffff7149ac0 in ?? () from /home/fredrik/.tgd/libs/libNvidia_gfx_debugger.so
#2  0x00007ffff6ecaf53 in ?? () from /home/fredrik/.tgd/libs/libNvidia_gfx_debugger.so
#3  0x00007ffff70e4d94 in ?? () from /home/fredrik/.tgd/libs/libNvidia_gfx_debugger.so
#4  0x00007ffff6ed1b74 in ?? () from /home/fredrik/.tgd/libs/libNvidia_gfx_debugger.so
#5  0x00007ffff6ed1e4d in ?? () from /home/fredrik/.tgd/libs/libNvidia_gfx_debugger.so
#6  0x00007ffff6ecfa3a in ?? () from /home/fredrik/.tgd/libs/libNvidia_gfx_debugger.so
#7  0x00007ffff5fb0f90 in ?? () from /home/fredrik/.tgd/libs/libNvidia_gfx_debugger.so
#8  0x00007ffff5e973ef in ?? () from /home/fredrik/.tgd/libs/libNvidia_gfx_debugger.so
#9  0x00007ffff5fc1670 in ?? () from /home/fredrik/.tgd/libs/libNvidia_gfx_debugger.so
#10 0x00007ffff5e92696 in ?? () from /home/fredrik/.tgd/libs/libNvidia_gfx_debugger.so
#11 0x00007ffff5e93f92 in ?? () from /home/fredrik/.tgd/libs/libNvidia_gfx_debugger.so
#12 0x00007ffff5e941cc in ?? () from /home/fredrik/.tgd/libs/libNvidia_gfx_debugger.so
#13 0x00007ffff5e90f4b in ?? () from /home/fredrik/.tgd/libs/libNvidia_gfx_debugger.so
#14 0x00007ffff5e912d7 in ?? () from /home/fredrik/.tgd/libs/libNvidia_gfx_debugger.so
#15 0x00007ffff6bae2b9 in ?? () from /home/fredrik/.tgd/libs/libNvidia_gfx_debugger.so
#16 0x00007ffff6ba25d9 in glXGetProcAddress () from /home/fredrik/.tgd/libs/libNvidia_gfx_debugger.so
#17 0x00007fff862a6c8f in Java_jogamp_opengl_x11_glx_GLX_dispatch_1glXGetProcAddress0__Ljava_lang_String_2J () from /tmp/jogamp_0000/file_cache/jln2157706239543529918/jln5801226942555331840/libjogl_desktop.so

The base load address of the library in this case was 0x7ffff5bd5000.

Also, the argument passed to strlen() causing it to crash was 0xffffffffffffffff.
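For anyone wanting to reproduce the base load addresses quoted above: on Linux they can be read from /proc/<pid>/maps, where each line starts with the "start-end" address range of a mapping and ends with its path. The following is a rough sketch of that lookup. The function names are invented for illustration, and it assumes the lowest mapping whose path contains the library name is the base.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Parse one line of a maps file. If the line mentions `needle`, store the
 * mapping's start address in *start and return 1; otherwise return 0. */
int maps_line_start(const char *line, const char *needle, uint64_t *start)
{
    uint64_t lo, hi;

    if (strstr(line, needle) == NULL)
        return 0;
    if (sscanf(line, "%" SCNx64 "-%" SCNx64, &lo, &hi) != 2)
        return 0;
    *start = lo;
    return 1;
}

/* Scan a maps file and return the lowest start address of any mapping whose
 * line contains `needle`, or 0 if none is found or the file cannot be read. */
uint64_t find_base_address(const char *maps_path, const char *needle)
{
    FILE *f = fopen(maps_path, "r");
    char line[4096];
    uint64_t base = 0, start;

    if (f == NULL)
        return 0;
    while (fgets(line, sizeof line, f) != NULL) {
        if (maps_line_start(line, needle, &start) && (base == 0 || start < base))
            base = start;
    }
    fclose(f);
    return base;
}
```

Calling find_base_address with the maps file of the crashed JVM process and "libNvidia_gfx_debugger.so" as the needle would be the intended use. From inside gdb, "info proc mappings" shows the same information.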

Hi Dolda2000,

Could you please provide more information?

  • Linux OS distribution and version?
  • GPU?
  • GPU Driver version?
  • Window system?
  • LGD version?

Could you please also provide a sample of your OpenGL program written in Java?

Thanks!

Cody

Ah yes, sorry.

I’m using Debian (stable, wheezy, 7.0, pick your name), a GTX 750 (not Ti), the driver from Debian’s non-free APT archive (which is 340.96), bog-standard X11 setup (using StumpWM, in the odd event that the WM would matter). The exact LGD version is 2.0.21208850. My JVM is Oracle’s JDK distribution, version 1.8.0_102-b14.

The program is available via JNLP: <http://www.havenandhearth.com/java/hafen.jnlp>

When launching it over JNLP, the JNLP launcher works fine, but the JVM will crash as soon as the actual program tries to start. That is, before it has managed to open a window.

Hi Dolda2000,

I don’t have a Debian environment at the moment, but I tried on Ubuntu 16.04 with a GTX 1080 and driver 375.20, and the program launches successfully with

LD_PRELOAD=<libNvidia_gfx_debugger.so path> javaws hafen.jnlp

LGD can then successfully attach to the process.

I noticed your driver version is a bit old. Could you please update your driver and try again?

Hi,

I’ve been trying to get the graphics debugger running, but I’m getting the same error as above (the strlen one).

I’m running Fedora 25, with driver version 378.13 on a GeForce 1060 (6GB), using LGD v2.0.21208850.

The only thing I can really add to the bug report is that the previous strlen call was for the string “/tmp”, and the strlen call that crashes is passed the pointer value 0xb, as seen in the RAX register. This all seems to occur after the library looks at the TMP and TEMP environment variables.

#0  strlen () at ../sysdeps/x86_64/strlen.S:137
#1  0x00007ffff7144ac0 in ?? () from /home/stephen/.tgd/libs/libNvidia_gfx_debugger.so
#2  0x00007ffff6ec5f53 in ?? () from /home/stephen/.tgd/libs/libNvidia_gfx_debugger.so
#3  0x00007ffff70dfd94 in ?? () from /home/stephen/.tgd/libs/libNvidia_gfx_debugger.so
#4  0x00007ffff6eccb74 in ?? () from /home/stephen/.tgd/libs/libNvidia_gfx_debugger.so
#5  0x00007ffff6ecce4d in ?? () from /home/stephen/.tgd/libs/libNvidia_gfx_debugger.so
#6  0x00007ffff6ecaa3a in ?? () from /home/stephen/.tgd/libs/libNvidia_gfx_debugger.so
#7  0x00007ffff5fabf90 in ?? () from /home/stephen/.tgd/libs/libNvidia_gfx_debugger.so
#8  0x00007ffff5e923ef in ?? () from /home/stephen/.tgd/libs/libNvidia_gfx_debugger.so
#9  0x00007ffff5fbc670 in ?? () from /home/stephen/.tgd/libs/libNvidia_gfx_debugger.so
#10 0x00007ffff5e8d696 in ?? () from /home/stephen/.tgd/libs/libNvidia_gfx_debugger.so
#11 0x00007ffff5e8ef92 in ?? () from /home/stephen/.tgd/libs/libNvidia_gfx_debugger.so
#12 0x00007ffff5e8f1cc in ?? () from /home/stephen/.tgd/libs/libNvidia_gfx_debugger.so
#13 0x00007ffff5e8bf4b in ?? () from /home/stephen/.tgd/libs/libNvidia_gfx_debugger.so
#14 0x00007ffff5e8c2d7 in ?? () from /home/stephen/.tgd/libs/libNvidia_gfx_debugger.so
#15 0x00007ffff6ba92b9 in ?? () from /home/stephen/.tgd/libs/libNvidia_gfx_debugger.so
#16 0x00007ffff6b9cd81 in glXQueryExtension () from /home/stephen/.tgd/libs/libNvidia_gfx_debugger.so
#17 0x00007ffff57393e7 in _glfwInitGLX () from /lib64/libglfw.so.3
#18 0x00007ffff5736385 in _glfwPlatformCreateWindow () from /lib64/libglfw.so.3
#19 0x00007ffff572ffe2 in glfwCreateWindow () from /lib64/libglfw.so.3
#20 0x000000000041c2bd in main () at test.c:294
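The debugger library's source is not available, so the following does not diagnose the crash. It is only a sketch of the general defensive pattern for the kind of TMP/TEMP lookup described above: getenv() returns NULL for an unset variable, so the result has to be checked before it reaches strlen(). The function names, the TMP-then-TEMP order and the "/tmp" fallback are all assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Return `value` if it is a non-NULL, non-empty string, otherwise NULL. */
const char *nonempty_or_null(const char *value)
{
    return (value != NULL && value[0] != '\0') ? value : NULL;
}

/* Choose a temp directory from two candidate values, falling back to
 * "/tmp" when neither is usable. Never returns NULL. */
const char *choose_tmp_dir(const char *tmp, const char *temp)
{
    const char *dir = nonempty_or_null(tmp);

    if (dir == NULL)
        dir = nonempty_or_null(temp);
    return dir != NULL ? dir : "/tmp";
}

/* Look the candidates up in the environment. getenv() yields NULL for
 * unset variables, which choose_tmp_dir() tolerates. */
const char *tmp_dir_from_env(void)
{
    return choose_tmp_dir(getenv("TMP"), getenv("TEMP"));
}

/* Safe to call strlen() here because the pointer is never NULL. */
size_t tmp_dir_length(void)
{
    return strlen(tmp_dir_from_env());
}
```

With neither variable set, choose_tmp_dir(NULL, NULL) falls back to "/tmp", so the strlen() call always receives a valid string.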

My code up to the point it crashes is:

#include <stdio.h>
#include <GLFW/glfw3.h>

int windowWidth = 800, windowHeight = 600;

int main(void)
{
    if (!glfwInit())
    {
        // Initialization failed
        printf("GLFW init failed\n");
        return -1;
    }

    // Request an OpenGL 4.5 core-profile debug context
    glfwWindowHint(GLFW_CONTEXT_VERSION_MAJOR, 4);
    glfwWindowHint(GLFW_CONTEXT_VERSION_MINOR, 5);
    glfwWindowHint(GLFW_OPENGL_DEBUG_CONTEXT, GLFW_TRUE);
    glfwWindowHint(GLFW_OPENGL_PROFILE, GLFW_OPENGL_CORE_PROFILE);

    // Create the window (the crash happens inside this call)
    GLFWwindow* window = glfwCreateWindow(windowWidth, windowHeight, "OpenGL", NULL, NULL);

Regards
elFarto

Hi elFarto,

I can confirm this is a known issue. It has been fixed in the next release of LGD, which is due in the coming weeks.

Same here.

I use Kubuntu 16.04 and a GTX 970 (with driver 390.25). My Java VM is OpenJDK 64-Bit Server VM (25.151-b12 mixed mode linux-amd64 compressed oops).

When run with libNvidia_gfx_debugger.so loaded, the JVM crashes.

The crash core dump and log are here: https://yadi.sk/d/hEyDYQB93TEfNv