Thank you for your time and attention!
I am a software developer working on an application that includes multiple built-in, real-time OpenSceneGraph / OpenGL previews. In recent months, several customers have reported that their CPUs stall heavily when using our software. The affected computers always have specifications similar to the following:
- Intel multi-core CPU (e.g. Xeon, Core i7)
- NVIDIA Quadro K600 / K620 / FX 770M / 1000M / 2000M
- Windows 7 Pro x64 / Windows 7 Ultimate x64 / Windows 10 Pro
- Various NVIDIA drivers in the range of 353.xx to 354.xx
Thus, we have set up the following test system:
- Intel Core i7-3930K
- NVIDIA Quadro K600 by PNY
- Windows 7 Ultimate x64
- NVIDIA driver 354.56
On said system, the symptoms are the following:
- When running our software with previews enabled, the CPU usage sporadically spikes to 100 % on all cores for about 1 to 10 seconds.
- During these spikes, our software as well as most other processes become unresponsive.
- Apart from these peaks, the CPU usage is < 10 %.
- The interval between the peaks varies from about 15 minutes to 20 hours, depending on the test case / software configuration / system configuration (please see details below).
- The following is a screenshot of the CPU usage history right after such a peak: https://www.dropbox.com/s/h6pnbq3fgkc5uoq/Performance_Graph.png?dl=0
- Furthermore, Process Explorer shows that during such a peak the entire CPU load is distributed across several threads of our application.
I searched for references to similar problems in the forums of NVIDIA, OSG, and OpenMP, and followed some sparse hints without success (keywords: “CPU 100%”, “Quadro” in conjunction with “OpenMP”).
In order to track down the origin of the issue, I did a number of tests. Here are the main test results:
Mandatory circumstances to reproduce the problem:
- Rendering with the Quadro card
- When the graphics card in the above-mentioned test system is replaced with a GeForce GTX 670 (driver 361.43), the problem disappears.
- There are neither reports from our team nor from our customers about similar problems with other graphics cards.
- Starting debug builds or release builds with the debugger attached does not reproduce the problem (I am using Visual Studio Professional 2013 Update 5).
- Our software consists of multiple tiers, finally linking against OpenSceneGraph 3.0.1 and Qt 4.8.7.
- The previews are basically osgQt::GLWidgets driven by osgQt::GraphicsWindowQt. After creation, each GLWidget is embedded in a common QWidget.
- Furthermore, we use osgViewer::CompositeViewer in single-threaded mode.
- If I comment out the creation of the GraphicsWindowQt (hence skipping OpenGL context creation and deactivating the previews at compile time), the problem does not show up (60+ hours test case).
- In contrast, the previews can be disabled at run time, whereby already-initialized OSG resources (and presumably the OpenGL contexts) are cleaned up again. In this case, the peaks still occur, although less frequently.
- In other test cases, the previews / contexts were only created but not updated continuously, and the problem still showed up frequently.
- In order to exclude OSG and Qt, I programmatically replaced the preview with a dummy window created via wglCreateContext() that just displays one gluSphere(), but the peaks still appeared (once after 8.5 hours).
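For reference, the dummy-window test was essentially equivalent to the following minimal sketch (reconstructed from memory; the window title, pixel-format details, sphere parameters, and frame pacing are illustrative, not the exact values used):

```cpp
// Minimal Win32 / wglCreateContext() repro sketch: one window, one plain
// OpenGL 1.x context, one gluSphere() redrawn continuously. No OSG, no Qt.
#include <windows.h>
#include <GL/gl.h>
#include <GL/glu.h>

static LRESULT CALLBACK WndProc(HWND h, UINT m, WPARAM w, LPARAM l)
{
    if (m == WM_DESTROY) { PostQuitMessage(0); return 0; }
    return DefWindowProc(h, m, w, l);
}

int WINAPI WinMain(HINSTANCE inst, HINSTANCE, LPSTR, int)
{
    WNDCLASS wc = {};
    wc.style         = CS_OWNDC;
    wc.lpfnWndProc   = WndProc;
    wc.hInstance     = inst;
    wc.lpszClassName = TEXT("GLDummy");
    RegisterClass(&wc);

    HWND hwnd = CreateWindow(TEXT("GLDummy"), TEXT("Dummy preview"),
                             WS_OVERLAPPEDWINDOW | WS_VISIBLE,
                             CW_USEDEFAULT, CW_USEDEFAULT, 320, 240,
                             NULL, NULL, inst, NULL);

    // Plain double-buffered RGBA pixel format.
    PIXELFORMATDESCRIPTOR pfd = {};
    pfd.nSize      = sizeof(pfd);
    pfd.nVersion   = 1;
    pfd.dwFlags    = PFD_DRAW_TO_WINDOW | PFD_SUPPORT_OPENGL | PFD_DOUBLEBUFFER;
    pfd.iPixelType = PFD_TYPE_RGBA;
    pfd.cColorBits = 24;
    pfd.cDepthBits = 24;

    HDC dc = GetDC(hwnd);
    SetPixelFormat(dc, ChoosePixelFormat(dc, &pfd), &pfd);
    HGLRC rc = wglCreateContext(dc);   // the call under suspicion
    wglMakeCurrent(dc, rc);

    GLUquadric* quad = gluNewQuadric();
    MSG msg;
    for (;;)
    {
        while (PeekMessage(&msg, NULL, 0, 0, PM_REMOVE))
        {
            if (msg.message == WM_QUIT) goto done;
            TranslateMessage(&msg);
            DispatchMessage(&msg);
        }
        glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
        gluSphere(quad, 0.5, 16, 16);  // the only geometry drawn
        SwapBuffers(dc);
        Sleep(16);                     // roughly 60 fps pacing
    }
done:
    gluDeleteQuadric(quad);
    wglMakeCurrent(NULL, NULL);
    wglDeleteContext(rc);
    ReleaseDC(hwnd, dc);
    return 0;
}
```

(Linked against opengl32.lib, glu32.lib, user32.lib, and gdi32.lib.) Even this stripped-down setup eventually triggered a peak on the Quadro test system.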
Question 1: Does this behaviour indicate issues with the OpenGL context creation itself, or does it point beyond that?
Circumstances to increase the frequency of the problem:
- Increasing overall multi-threading
- Apart from the previews, our software makes heavy use of CPU-side task parallelism as well as data parallelism (for color calculations, networking, etc.). Therefore, we create our own worker threads and use OpenMP (#pragma omp parallel for).
- The tests showed that increasing the thread count, the workload, and especially the update rate of these threads makes the problem more reproducible (peaks within < 15 minutes).
- In contrast, deactivating OpenMP via a compiler option even made the problem disappear (26-hour test case).
Question 2: Apart from ordinary run-time interactions, I cannot see any dependency between our use of multi-threading and the presence of a Quadro card. Am I missing any known issues concerning Quadro cards and e.g. OpenMP?
Circumstances not influencing the problem:
- Using osgViewer::CompositeViewer in the various multi-threaded modes does not change the situation.
- I installed the NVIDIA drivers from scratch and usually tested with the default base-profile settings, but I also experimented with specific options in the NVIDIA Control Panel. In particular, I tried the Threaded Optimization option both On and Off, but the peaks always occurred after a short time.
Question 3: Do you have any thoughts on the above-mentioned observations, suggestions for further test cases to single out specific factors, or ideally a fix for the problem?
Any help is highly appreciated. Thank you very much!