Bug report + reproducer: Driver crash, Windows only

Hello,
The Windows app we are developing crashes randomly with this message box:
“NV OpenGL driver detected a problem with display driver and is unable to continue. … Error code: 8”

Our app relies heavily on bindless graphics, both bindless textures and bindless buffers. The crash appears only when a lot of textures (~600) and buffers are created and made resident (~200), SwapBuffers is called, and then any shader program is bound to draw anything. We double-checked video memory consumption and it is fine: 1200 MB of VRAM is still free at the time of the crash (total VRAM is 4 GB).

We managed to create a minimal reproducer (we used GLintercept to capture the calls, then hand-edited the code to make it compile and embedded the shaders into the source file). The reproducer simply creates a window and an OpenGL context (4.4), then starts loading shaders, creating buffers and textures, and making them resident. All resource creation sequences were captured from the real app (reduced to a minimal reproducible case before capture).

The reproducer can be compiled in two modes (set in config.hpp):
with #define CREATE_CTX_AND_WINDOW_IN_MAIN_THREAD 0 - the crash is reproduced; the window and OpenGL context are created and made current in the second thread (the render thread, as in our app)

with #define CREATE_CTX_AND_WINDOW_IN_MAIN_THREAD 1 - the crash does not happen; the window and OpenGL context are created in the main thread and made current in the second thread, and resources are allocated in the second thread.
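
For readers without the archive, the effect of the switch can be pictured roughly as below. This is only a sketch under assumptions: the macro name is the one from config.hpp, but `Context`, `createWindowAndContext` and `runReproducer` are hypothetical placeholders that record a thread id instead of calling any window or OpenGL API, and the real reproducer source may be organized differently.

```cpp
#include <thread>

// Stand-in for the switch in config.hpp; 0 is the mode reported to crash.
#ifndef CREATE_CTX_AND_WINDOW_IN_MAIN_THREAD
#define CREATE_CTX_AND_WINDOW_IN_MAIN_THREAD 0
#endif

// Hypothetical placeholder: remembers which thread "created" the context.
struct Context {
    std::thread::id createdOn;
};

// Hypothetical placeholder for window + OpenGL 4.4 context creation.
Context createWindowAndContext() {
    return Context{std::this_thread::get_id()};
}

// Returns true if the context was created on the calling (main) thread.
bool runReproducer() {
    const std::thread::id mainId = std::this_thread::get_id();
    Context ctx{};
#if CREATE_CTX_AND_WINDOW_IN_MAIN_THREAD
    // Mode 1: create on the main thread, use on the render thread.
    ctx = createWindowAndContext();
    std::thread render([] {
        // Real code would make the context current and allocate resources here.
    });
#else
    // Mode 0: create, make current and allocate, all on the render thread.
    std::thread render([&ctx] {
        ctx = createWindowAndContext();
    });
#endif
    render.join();
    return ctx.createdOn == mainId;
}
```

In both modes the resources are allocated on the render thread; the only thing the switch changes is which thread creates the window and the context.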

But creating a window and an OpenGL context on a thread other than the main one is, AFAIK, normal. Our app has been doing it for a long time without any problems (on Linux, btw, everything works perfectly).

Even if context creation and resource allocation are both done in the main thread (no second thread created), the crash still happens.

Bug report checklist: (as recommended in https://devtalk.nvidia.com/default/topic/790452/general-graphics-programming/reporting-graphics-driver-bugs-/)

  1. Windows 7 PRO SP1 x64 ENG
  2. Quadro K5000
  3. The crash reproduces reliably on 353.06, 353.30, 353.62. We have also experienced similar issues while using bindless graphics with ~3GB VRAM usage (during texture streaming) on other driver versions (and other hardware - TITAN X, K6000, GF980), but we are unable to make a solid reproducer for those cases :(
  4. Appears with any app profile, any number of monitors connected (tested with 1, 2, and 3 monitors attached).
  5. Reproducer can be downloaded from https://drive.google.com/file/d/0B_i3TlILtfc7TThKM1RoT20zelk/view?usp=sharing (7-Zip archive)
  6. Launch repro-crash-textures\x64\Release\reproducer.exe from attached archive.
    In case msvcp110 and msvcr110 C++ redists are required, they can be copied from repro-crash-textures\redists to repro-crash-textures\x64\Release
    The repro can be compiled with Visual Studio 2012 (Release x64 configuration only), in case one wants to switch between the two modes mentioned above.
  7. Wait until a message box with the "NVIDIA OpenGL … Error code: 8 … " message appears.
    Screenshot - https://drive.google.com/file/d/0B_i3TlILtfc7Ul8yVTVITHpiRUk/view?usp=sharing
  8. Not applicable

Are there any caveats when using bindless textures and buffers, like maximum number of buffers made resident per frame, or maximum number of textures made resident per frame?

Thanks for your attention.

UPDATE:
Our app actually consists of two parts: a launcher executable and a rendering engine dynamic library (where the crash happens). That's because the engine is intended to be used as third-party software, so a user can load and control it without the launcher. The engine creates a separate thread with an OpenGL context (actually there were two, an uploading and a rendering context) where all rendering happens. So even if the user loads the engine from the main thread, the OpenGL context is created in another thread. But while preparing the reproducer, the rendering thread was eliminated.
So the scenario for the simplified app was like this:

  1. launcher.exe starts
  2. engine.dll is loaded (implicit linking, so no explicit LoadLibrary in launcher.exe)
  3. launcher.exe calls one function from engine.dll (containing code equivalent to the reproducer's) to create the context and allocate resources in the main thread
  4. crash happens.

BUT if engine.dll is modified to:

  1. create OpenGL context in Main Thread (inside call from launcher.exe)
  2. then start a second thread and make that OpenGL context current in the second thread
  3. allocate all resources in second thread
  4. crash no longer happens.

AND IF engine.dll is modified to:

  1. start second thread
  2. create OpenGL context in second thread and make it current
  3. allocate all resources in second thread
  4. crash happens again.
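
The three scenarios above can be summarized in a small thread-only model. It is a sketch, not the engine code: `Scenario`, `Trace`, `run` and `reportedToCrash` are hypothetical names, the lambdas only record thread ids in place of the real context creation and resource allocation, and `reportedToCrash` merely restates the outcomes observed above (it is not a diagnosis of the driver).

```cpp
#include <thread>

enum class Scenario { AllOnMain, CreateOnMainUseOnSecond, AllOnSecond };

// Records which thread performed each step.
struct Trace {
    std::thread::id created;    // thread that created the context
    std::thread::id allocated;  // thread that made it current and allocated resources
};

Trace run(Scenario s) {
    Trace t;
    // Placeholders for context creation and for makeCurrent + allocation.
    auto create = [&t] { t.created = std::this_thread::get_id(); };
    auto allocate = [&t] { t.allocated = std::this_thread::get_id(); };

    switch (s) {
    case Scenario::AllOnMain:
        create();
        allocate();
        break;
    case Scenario::CreateOnMainUseOnSecond: {
        create();
        std::thread worker(allocate);
        worker.join();
        break;
    }
    case Scenario::AllOnSecond: {
        std::thread worker([&] { create(); allocate(); });
        worker.join();
        break;
    }
    }
    return t;
}

// Pattern seen in the three reported cases: the crash occurred whenever the
// context was created and used on the same thread.
bool reportedToCrash(const Trace& t) {
    return t.created == t.allocated;
}
```

Seen this way, the only non-crashing arrangement is the one where the context is created on one thread and then made current on another.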

Hope these additional facts are useful.

Thanks for bringing this to our attention. I filed a bug in the NVIDIA bug database.

Driver release 353.62 still has the issue :(

With driver release 355.58, the reproducer app no longer crashes the OpenGL driver, nor does our engine.
Seems like the issue is resolved in 355.58 :)