Memory mapping is very slow on Windows when using a Quadro card but not when using a GeForce card

Hi,

If this is not the correct place please accept my apologies and point me in the right direction.

I have encountered some strange behaviour when memory mapping a file on Windows 10: on a system with a GeForce card mapping a large file (hundreds of GB) takes a few milliseconds, but on the same system with a Quadro card mapping the same file takes thousands of milliseconds. I have repeated the test with various driver versions with the same result.

This problem was first identified in a large OpenGL application we develop, but I’ve since narrowed the problem down and created a small test application (source attached) to reproduce the behaviour. The application does the following:

  • Memory maps a file and measures how much time was spent in MapViewOfFile()
  • Calls LoadLibrary() for nvoglv64.dll
  • Memory maps the same file and again measures how much time was spent in MapViewOfFile()
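
For readers without the attachment, the three steps above might be sketched roughly as follows. This is not the attached map_file_test.cpp: the helper names `time_us`, `time_map_view` and `compare_mapping_times` are made up for illustration, and the caller is assumed to pass the full path to nvoglv64.dll, since its install location is not given here. The Windows-specific part is guarded so the timing helper still compiles elsewhere; a 64-bit build is assumed so that a file of hundreds of GB can be mapped in one view.

```cpp
#include <chrono>
#include <cstdio>
#include <utility>

#ifdef _WIN32
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#endif

// Runs fn once and returns the elapsed wall-clock time in microseconds.
template <typename Fn>
double time_us(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Fn>(fn)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::micro>(stop - start).count();
}

#ifdef _WIN32
// Maps the whole file read-only and returns the time spent inside
// MapViewOfFile() in microseconds, or -1.0 if any step fails.
double time_map_view(const char* path) {
    HANDLE file = CreateFileA(path, GENERIC_READ, FILE_SHARE_READ, nullptr,
                              OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (file == INVALID_HANDLE_VALUE) return -1.0;

    // Size 0/0 means "use the current size of the file".
    HANDLE mapping = CreateFileMappingA(file, nullptr, PAGE_READONLY, 0, 0, nullptr);
    if (mapping == nullptr) {
        CloseHandle(file);
        return -1.0;
    }

    void* view = nullptr;
    double us = time_us([&] {
        view = MapViewOfFile(mapping, FILE_MAP_READ, 0, 0, 0);
    });

    if (view != nullptr) {
        UnmapViewOfFile(view);
    } else {
        us = -1.0;
    }
    CloseHandle(mapping);
    CloseHandle(file);
    return us;
}

// Times the mapping before and after loading the driver DLL.
// driver_dll is an assumption: the full path to nvoglv64.dll on this machine.
void compare_mapping_times(const char* data_file, const char* driver_dll) {
    std::printf("before load: %.1f us\n", time_map_view(data_file));
    if (LoadLibraryA(driver_dll) == nullptr) {
        std::printf("could not load %s\n", driver_dll);
        return;
    }
    std::printf("after load:  %.1f us\n", time_map_view(data_file));
}
#endif
```

Calling `compare_mapping_times(argv[1], argv[2])` from `main` prints the two timings side by side. Timing only the `MapViewOfFile()` call, rather than the whole open/map/close sequence, keeps the measurement focused on the function in question.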

When run on a system with a GeForce card MapViewOfFile() takes the same amount of time before and after nvoglv64.dll is loaded.

When run on the same system, but with a Quadro card, MapViewOfFile() takes 3 orders of magnitude longer after nvoglv64.dll is loaded.

Using xperf and the Windows Performance Analyser (trace attached) I’ve traced the problem to nvoglv64.dll, which appears to hook MapViewOfFile(). It appears that immediately after calling MapViewOfFile() a long time is spent doing ‘something’. As I don’t have access to symbols for nvoglv64.dll I unfortunately can’t see what ‘something’ is.

Can anyone please confirm if this is a bug or intended behaviour? If it is intended behaviour can you also please explain:

  • Why Quadro cards are affected but GeForce ones are not
  • Why this behaviour exists
  • If (and how) it is possible to mitigate it

Thanks

Ian

Attachments:
Source code for test program:
map_file_test.cpp (3.1 KB)

xperf trace showing delay after calling MapViewOfFile():
2020-12-18_18-07-10_Administrator.etl (7.2 MB)

Any interest in this:

  • Bug?
  • Working as designed?
  • Can’t reproduce?
  • More information required?

I’ve encountered the exact same issue that you described, where memory mapping files takes significantly longer after nvoglv64.dll is loaded, but only when using a workstation GPU (in my case, an NVIDIA A4000/A5000). The problem does not occur with GeForce cards like the RTX 3060.

Interestingly, the issue disappears when running the test with a debugger attached, which makes the timing similar to that of the GeForce cards.

I’m using Windows 11 Pro,
NVIDIA Software Version 560.76

Observed behaviour for mapping a 10GB file:

Without nvoglv64.dll Loaded:
Time taken to memory map the file: ~20 microseconds
Time taken to unmap the file: ~10 microseconds

With nvoglv64.dll Loaded:
Time taken to memory map the file: ~8628 microseconds
Time taken to unmap the file: ~3015 microseconds

Are there any updates regarding this issue?

Thanks
Sebastian

Hi Sebastian,

Reassuring to hear it’s not just me, disappointing that no one seems to care. Unfortunately I never found a solution to this problem; only been three and a half years so there’s still hope :|

Ian

Obligatory xkcd: Wisdom of the Ancients


Hi Ian, just got a workaround from NVIDIA: the Global Preset in the 3D settings needs to be set to “Workstation App - Dynamic Streaming”, then the delay when memory mapping is gone.

Hi Sebastian, you’re a superstar, thank you!
