Memory leak when creating/deleting WGL context on worker thread on Quadro K2000M

Please bear with me if this sounds like an overly obscure problem. I’m hoping that understanding what’s happening in this case will help us solve a more serious problem we’re seeing where our OpenGL application will die on pretty much any Nvidia GPU after creating/closing 60-100 MFC windows. At this point wglMakeCurrent will start failing and we start getting sporadic GL_OUT_OF_MEMORY errors. We haven’t been able to reproduce this behavior in a stand-alone test case, but we have been able to reproduce the following strange behavior.

We have a test program that does the following in a loop:

  1. Create a Win32 window
  2. On a worker thread create a WGL context, make it current, do a clear and swap buffers, then delete the context.
  3. When the worker thread returns, the main thread deletes the window and we loop back to 1.

When running this test on a laptop with a Quadro K2000M (driver version 332.76), the application’s memory usage rises to about 1 GB after 30-50 iterations. At that point it seems to stabilize and stop increasing. If the window creation/destruction is also moved to the thread function, we don’t see this huge memory footprint. AMD and Intel GPUs don’t show it either, regardless of where the window creation happens.

The test program (VS 2012 project) can be downloaded here: https://dl.dropboxusercontent.com/u/69927239/nvidia_leak.zip

In our real application we similarly create windows on the main thread and contexts on worker threads (each window has an associated worker thread on which all OpenGL calls for that window’s context are made). When we force everything to happen on the main thread we seem to be able to create more windows before things start failing, but eventually we get to the same state where nothing works anymore. This has been verified on a GTX 760 with driver 335.23. Both machines are running Windows 8 64-bit.

If I’m not mistaken, an OpenGL context requires a device context (an HDC on Windows; I forget the comparable Linux definition), regardless of whether you are actually using the context for rendering. Since the HDC is tied to the window, I’m surprised the application even survives a single iteration of your loop. If the context is active, I’m assuming that the window also has to be active (not deleted).

The loop goes like this:

create window
get device context
create wgl context
clear / swapbuffers
delete wgl context
release device context
destroy window

It’s just that the window creation/destruction happens on the main thread, and everything else on the worker thread.
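
To make the thread split concrete, here is a minimal, platform-neutral sketch of the ordering only. The names LoopSteps and run_iterations are hypothetical and are not taken from the test program; on Windows each hook would wrap the Win32/WGL calls named in its comment. The point it illustrates is that the main thread joins the worker before destroying the window, so the window and its device context outlive every GL call.

```cpp
#include <cassert>
#include <functional>
#include <thread>

// Hypothetical hooks (assumption: not the test program's real code).
struct LoopSteps {
    // Main thread: e.g. CreateWindowEx.
    std::function<void()> create_window;
    // Worker thread: e.g. GetDC, wglCreateContext, wglMakeCurrent,
    // glClear, SwapBuffers, wglDeleteContext, ReleaseDC.
    std::function<void()> render_on_worker;
    // Main thread: e.g. DestroyWindow.
    std::function<void()> destroy_window;
};

// Runs the create / render / destroy cycle `count` times.
void run_iterations(const LoopSteps& steps, int count) {
    assert(steps.create_window && steps.render_on_worker && steps.destroy_window);
    for (int i = 0; i < count; ++i) {
        steps.create_window();                       // main thread
        std::thread worker(steps.render_on_worker);  // all GL work happens here
        worker.join();                               // wait until the context is gone
        steps.destroy_window();                      // main thread, after the join
    }
}
```

Moving create_window and destroy_window inside the worker hook gives the variant in which the window is also created and destroyed on the worker thread.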

Some more information on the other problem I mentioned in the real application. Normally we make the context for each window current at the beginning of each frame for that window, and clear the current context at the end.
This appears to be what causes the problem on Nvidia. If we just make the context current once and leave it current then we don’t see everything fall apart after 60-100 windows are opened and closed. There are some practical reasons why it’s difficult for us to do things this way in general, and it seems to pretty clearly be a bug in the Nvidia driver. AMD and Intel cards can survive thousands of iterations of this process with no problems and no increase in the process memory footprint.
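
As an illustration of the "make current once and leave it current" pattern, the sketch below binds only when the (device context, rendering context) pair changes and unbinds once at shutdown instead of at the end of every frame. ContextBinder and its methods are hypothetical names, not part of WGL or of the application described here. The callback stands in for wglMakeCurrent, and void* stands in for the HDC and HGLRC handles.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical helper (assumption: not a WGL or driver API). Intended use is
// one instance per worker thread, since the current context is per-thread.
class ContextBinder {
public:
    using Handle = void*;  // stand-in for HDC / HGLRC
    using MakeCurrentFn = std::function<bool(Handle dc, Handle rc)>;

    explicit ContextBinder(MakeCurrentFn fn) : make_current_(std::move(fn)) {
        assert(static_cast<bool>(make_current_));
    }

    // Call at the start of each frame; binds only if the pair changed.
    bool ensure_current(Handle dc, Handle rc) {
        if (bound_ && dc == dc_ && rc == rc_) return true;  // already current
        if (!make_current_(dc, rc)) {
            bound_ = false;  // unknown state: force a real bind next time
            return false;
        }
        dc_ = dc;
        rc_ = rc;
        bound_ = true;
        return true;
    }

    // Call once before deleting the context or exiting the thread.
    bool release() {
        if (!bound_) return true;  // nothing bound through this helper
        bound_ = false;
        return make_current_(nullptr, nullptr);
    }

private:
    MakeCurrentFn make_current_;
    Handle dc_ = nullptr;
    Handle rc_ = nullptr;
    bool bound_ = false;
};
```

On Windows the callback would typically forward to wglMakeCurrent with the handles cast back to HDC and HGLRC. The sketch only reduces how often the bind call is made; it does not show why the driver misbehaves.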

EDIT:

This problem sounds pretty similar: https://devtalk.nvidia.com/default/topic/527193/general-development/does-wglmakecurrent-leak-memory-in-a-worker-thread-/

I’ve forwarded the reported issue to the OpenGL driver team and it’s going to be investigated.

Thanks, that’s great news. This is a pretty serious issue for one of our customers (as the main problem is happening across all Nvidia devices we’ve tested, not just Quadro).

We’re going to put in a temporary workaround to avoid changing contexts so frequently for now, but a real fix at some point would be great.

Please feel free to contact me directly (I assume you can get my email from my forum account?) if I can provide any more information about this.

Thanks,

Evan

Right, I’ve forwarded your contact information as well.
They’ll ask if anything more is required. We’ll see. I’m copied on the bug report and will get status updates.

I haven’t been able to reproduce the MakeCurrent problem outside of our application but I did wire up a version that just starts creating windows immediately on start-up and reproduces the problem easily.

There’s a VS 2012 binary here: https://dl.dropboxusercontent.com/u/69927239/hoops3dpartviewerstat.zip

Just run the executable and it should either show a message box about wglMakeCurrent failing or just crash after 60-100 iterations. On the mobile Quadro it seems to happen exactly at 60, whereas the GTX 760 goes a bit past 100. The problem seems to happen faster when the windows are larger, which is why everything is maximized in the test. I’ve been testing all this at 1920x1080.