Memory mapping is very slow on Windows when using a Quadro card but not when using a GeForce card

Hi,

If this is not the correct place please accept my apologies and point me in the right direction.

I have encountered some strange behaviour when memory mapping a file on Windows 10: on a system with a GeForce card mapping a large file (hundreds of GB) takes a few milliseconds, but on the same system with a Quadro card mapping the same file takes thousands of milliseconds. I have repeated the test with various driver versions with the same result.

This problem was first identified in a large OpenGL application we develop but I’ve since narrowed the problem down and created a small test application (source attached) to reproduce the behaviour. The application does the following:

  • Memory maps a file and measures how much time was spent in MapViewOfFile()
  • Calls LoadLibrary() for nvoglv64.dll
  • Memory maps the same file and again measures how much time was spent in MapViewOfFile()
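The three steps above can be sketched roughly as follows. This is not the attached map_file_test.cpp, only a minimal illustration of the same idea; the helper names `time_us`, `time_map_view` and `compare_map_times` are made up for this sketch, and depending on the driver installation `LoadLibraryW` may need a full path to nvoglv64.dll rather than the bare file name.

```cpp
#include <chrono>
#include <cstdio>
#include <utility>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: run f() once and return the elapsed time in microseconds.
template <typename F>
double time_us(F&& f) {
    auto t0 = std::chrono::steady_clock::now();
    std::forward<F>(f)();
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::micro>(t1 - t0).count();
}

#ifdef _WIN32
// Maps the whole file read-only and times only the MapViewOfFile() call.
// Returns -1.0 if any step fails.
double time_map_view(const wchar_t* path) {
    HANDLE file = CreateFileW(path, GENERIC_READ, FILE_SHARE_READ, nullptr,
                              OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (file == INVALID_HANDLE_VALUE) return -1.0;

    HANDLE mapping = CreateFileMappingW(file, nullptr, PAGE_READONLY, 0, 0, nullptr);
    if (!mapping) { CloseHandle(file); return -1.0; }

    void* view = nullptr;
    double us = time_us([&] {
        view = MapViewOfFile(mapping, FILE_MAP_READ, 0, 0, 0);  // 0 bytes = whole file
    });

    if (view) UnmapViewOfFile(view); else us = -1.0;
    CloseHandle(mapping);
    CloseHandle(file);
    return us;
}

// Times the mapping before and after the OpenGL driver DLL is loaded.
void compare_map_times(const wchar_t* path) {
    std::printf("before nvoglv64.dll: %.1f us\n", time_map_view(path));

    // Assumption: the DLL is found by name; a full path may be required.
    if (!LoadLibraryW(L"nvoglv64.dll")) std::printf("LoadLibrary failed\n");

    std::printf("after nvoglv64.dll:  %.1f us\n", time_map_view(path));
}
#endif
```

Calling `compare_map_times()` from `main()` with the path of a large file should print two timings that can be compared directly between a GeForce and a Quadro system.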

When run on a system with a GeForce card MapViewOfFile() takes the same amount of time before and after nvoglv64.dll is loaded.

When run on the same system, but with a Quadro card, MapViewOfFile() takes 3 orders of magnitude longer after nvoglv64.dll is loaded.

Using xperf and the Windows Performance Analyser (trace attached) I’ve traced the problem to nvoglv64.dll, which appears to hook MapViewOfFile(). It appears that immediately after calling MapViewOfFile() a long time is spent doing ‘something’. As I don’t have access to symbols for nvoglv64.dll I unfortunately can’t see what ‘something’ is.

Can anyone please confirm if this is a bug or intended behaviour? If it is intended behaviour can you also please explain:

  • Why Quadro cards are affected but GeForce ones are not
  • Why this behaviour exists
  • If (and how) it is possible to mitigate it

Thanks

Ian

Attachments:
Source code for test program:
map_file_test.cpp (3.1 KB)

xperf trace showing delay after calling MapViewOfFile():
2020-12-18_18-07-10_Administrator.etl (7.2 MB)

Any interest in this:

  • Bug?
  • Working as designed?
  • Can’t reproduce?
  • More information required?

I’ve encountered the exact same issue that you described, where memory mapping files takes significantly longer after nvoglv64.dll is loaded, but only when using a workstation GPU (in my case, an NVIDIA A4000/A5000). The problem does not occur with GeForce cards like the RTX 3060.

Interestingly, the issue disappears when running the test with a debugger attached, which makes the timing similar to that of the GeForce cards.

I’m using Windows 11 Pro,
NVIDIA Software Version 560.76

Observed behaviour for mapping a 10GB file:

Without nvoglv64.dll Loaded:
Time taken to memory map the file: ~20 microseconds
Time taken to unmap the file: ~10 microseconds

With nvoglv64.dll Loaded:
Time taken to memory map the file: ~8628 microseconds
Time taken to unmap the file: ~3015 microseconds

Are there any updates regarding this issue?

Thanks
Sebastian

Hi Sebastian,

Reassuring to hear it’s not just me, disappointing that no one seems to care. Unfortunately I never found a solution to this problem; only been three and a half years so there’s still hope :|

Ian

Obligatory xkcd: Wisdom of the Ancients

Hi Ian, just got a workaround from NVIDIA: Global Preset in the 3D settings needs to be set to “Workstation App - Dynamic Streaming”, then the delay when memory mapping is gone.

Hi Sebastian, you’re a superstar, thank you!
