Mysterious CPU Usage Issue

I have a rather simplistic rendering pipeline around a complex engine (I’m afraid there is so much source code that I can’t share it easily). On Windows 10 64-bit, on my multi-monitor system, I am seeing a very odd CPU usage issue that I can’t figure out.

The situation is as follows.

I noticed my game was taking 12% of my CPU when it should be around 1 or 2%. Profiling it in Visual Studio 2017, I found that most of the CPU time is spent in NVidia calls. I then profiled it with Nsight, which came to the same conclusion: nothing in the OpenGL API or in my code accounts for this, and something is going crazy on a thread that NVidia creates. The baffling part is that if I simply move the window from the left screen to the right screen, the issue goes away! The CPU usage of that thread drops to near 0%.

What’s even stranger is that if I alt-tab away from the window, it flickers black for a moment on the left screen, but this does not happen on the right screen.

My question is whether anyone can think of a reason this might be happening, or has encountered similar behavior.

I have an NVidia GTX 1060 with the latest drivers. This is a 64-bit OpenGL program.

Of note, I’m being told that the start address of the thread taking all the CPU is “Nvoglv64.DLL!DrvValidateVersion”.

More information: it seems this mysterious CPU usage happens when the window is on the screen that is not set as my primary display. When I swap which monitor is primary, the high CPU usage moves to whichever monitor is now not primary.
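
For anyone trying to reproduce the primary versus non-primary observation, here is a minimal Python sketch of how one might log which display a window is on. It relies on the Windows convention that the primary monitor's top-left corner is the virtual-desktop origin (0, 0). The `Rect` type, the helper names, and the two-monitor layout are hypothetical; in a real program the rectangles would come from Win32 calls such as `EnumDisplayMonitors` and `GetWindowRect`, which this sketch does not call.

```python
from typing import NamedTuple, Optional, Sequence


class Rect(NamedTuple):
    """Rectangle in virtual-desktop coordinates (right/bottom exclusive)."""
    left: int
    top: int
    right: int
    bottom: int


def overlap_area(a: Rect, b: Rect) -> int:
    """Area of the intersection of two rectangles, 0 if they do not overlap."""
    width = min(a.right, b.right) - max(a.left, b.left)
    height = min(a.bottom, b.bottom) - max(a.top, b.top)
    return max(width, 0) * max(height, 0)


def monitor_for_window(window: Rect, monitors: Sequence[Rect]) -> Optional[Rect]:
    """Return the monitor that shows the largest part of the window, if any."""
    best = max(monitors, key=lambda m: overlap_area(window, m), default=None)
    if best is None or overlap_area(window, best) == 0:
        return None
    return best


def is_primary(monitor: Rect) -> bool:
    """Assumption: the primary monitor's top-left corner is the origin (0, 0)."""
    return monitor.left == 0 and monitor.top == 0


# Hypothetical layout: a secondary 1920x1080 display to the left of the primary.
LEFT = Rect(-1920, 0, 0, 1080)
RIGHT = Rect(0, 0, 1920, 1080)

if __name__ == "__main__":
    window = Rect(-1800, 100, -520, 820)  # a 1280x720 window on the left display
    shown_on = monitor_for_window(window, [LEFT, RIGHT])
    print("on primary display:", shown_on is not None and is_primary(shown_on))
```

Logging this flag next to each CPU reading would make it easier to confirm that the high usage follows the non-primary display rather than a particular physical monitor.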

It seems that even outside samples show the same behavior, so it is not anything in my code or initialization.

If anyone needs a sample for aiding in explanation, here is one I tried that exhibits the same behavior: Tutorial 9: Ambient Lighting

Hi RevenantBob,

I briefly tried the example you linked but I don’t see obvious CPU overhead.
Since “DrvValidateVersion” is the very first entry point of the graphics driver, the problem could be caused by any number of system-specific issues.

Can you provide more details to isolate the problem?

  1. Does it reproduce on all other OpenGL applications?
  2. Does it reproduce on a single-monitor setup, or on dual monitors in duplicate mode?
  3. Do you have any other Nvidia/Intel/AMD GPUs enabled on your motherboard? You can check that in Device Manager → Display Adapters.
  4. Providing the exact driver version and a Visual Studio Diagnostics report would also help.

Yes. I even tried running Doom (2016), and it reproduces the issue. I noticed it even produces the momentary flicker when the window is activated with a click on the offending monitor. It doesn’t have DrvValidateVersion as the offending thread, however. But the CPU usage is very different depending on which monitor it is running on.

If I set the display to “Show only on Monitor 1 or 2” (single monitor), then this issue DOES appear on either monitor.
If I set the display to “Duplicate these Displays”, then the issue does NOT happen and the software behaves as expected.

No other display adapters. My system configuration is:

Driver: 385.69 released 9/20/2017.
Display adapter: EVGA GeForce GTX 1060 6GB SSC GAMING ACX 3.0, 6GB GDDR5
Motherboard: MSI Z170A GAMING M5 LGA 1151
Processor: Intel Core i7-7700K Kaby Lake Quad-Core 4.2 GHz LGA 1151 91W

One monitor is plugged into HDMI and the other into DVI, if that matters. But as I said, the issue can appear on either one, depending on the monitor configuration.

I have created two diagnostic reports for my engine (client.exe is the executable). These were generated with Visual Studio 2017 so will likely require Visual Studio 2017 to examine.

http://www.retrosmack.com/opengl/faulty_screen.diagsession
http://www.retrosmack.com/opengl/working_screen.diagsession

To produce these reports I let it run for 1 minute. It starts on the faulty screen in both cases (where CPU usage is higher), but I use WINKEY+SHIFT+ARROW to move the window to the working monitor for the “working_screen” report. Notice that the CPU usage in the faulty report is CONSIDERABLY higher.
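
As a rough sanity check on the numbers in this thread, the Python sketch below converts between CPU time and a machine-wide percentage. It assumes the 12% figure is normalized across all logical processors (as Task Manager does by default) and that the i7-7700K exposes 8 logical processors (4 cores with Hyper-Threading); the function names are made up for illustration.

```python
def cpu_percent(cpu_seconds: float, wall_seconds: float, logical_processors: int) -> float:
    """Process CPU usage as a share of the whole machine, in percent."""
    if wall_seconds <= 0 or logical_processors <= 0:
        raise ValueError("wall_seconds and logical_processors must be positive")
    return 100.0 * cpu_seconds / (wall_seconds * logical_processors)


def busy_threads(percent: float, logical_processors: int) -> float:
    """How many fully busy threads a machine-wide percentage corresponds to."""
    return percent / 100.0 * logical_processors


if __name__ == "__main__":
    LOGICAL_PROCESSORS = 8  # assumption: i7-7700K, 4 cores with Hyper-Threading

    # One thread spinning for an entire 60-second capture:
    print(cpu_percent(60.0, 60.0, LOGICAL_PROCESSORS))  # 12.5

    # The 12% reported earlier, expressed as fully busy threads:
    print(busy_threads(12.0, LOGICAL_PROCESSORS))
```

Under those assumptions, 12% is close to one thread running flat out for the whole capture, which would be consistent with a single driver thread spinning rather than with the render loop itself doing more work.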

The detailed view of the faulty report says that “NtGdiDdDDIEscape” is causing the biggest overhead. It still has the largest overhead in the working report as well.

I’m not certain what that kernel call does, however.

If you need more information, I’m happy to provide it!

Hi RevenantBob,

Thank you for the latest input.
It appears the extra CPU usage shows up whenever the rendering display enters full-screen exclusive flipping mode. (Duplicate mode does not trigger this mode on either display.)

We will file an internal bug to track this issue.

Is there a way to disable flipping mode programmatically?