Memory mapping is very slow on Windows when using a Quadro card but not when using a GeForce card

Hi,

If this is not the correct place to post this, please accept my apologies and point me in the right direction.

I have encountered some strange behaviour when memory mapping a file on Windows 10. On a system with a GeForce card, mapping a large file (hundreds of GB) takes a few milliseconds, but on the same system with a Quadro card, mapping the same file takes thousands of milliseconds. I have repeated the test with various driver versions and get the same result.

This problem was first identified in a large OpenGL application we develop, but I’ve since narrowed it down and created a small test application (source attached) that reproduces the behaviour. The application does the following:

  • Memory maps a file and measures how much time was spent in MapViewOfFile()
  • Calls LoadLibrary() for nvoglv64.dll
  • Memory maps the same file and again measures how much time was spent in MapViewOfFile()
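
For anyone who does not want to open the attachment, the three steps above boil down to roughly the following. This is a minimal sketch and not the attached map_file_test.cpp itself: the helper names elapsed_ms, time_map_view and run_comparison are made up for illustration, the file is assumed to be mapped read-only in a 64-bit process, and depending on how the driver is installed LoadLibrary may need a full path to nvoglv64.dll.

```cpp
#include <chrono>
#include <cstdio>
#ifdef _WIN32
#include <windows.h>
#endif

// Runs fn once and returns the elapsed wall-clock time in milliseconds.
template <typename F>
double elapsed_ms(F&& fn) {
    auto start = std::chrono::steady_clock::now();
    fn();
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - start).count();
}

#ifdef _WIN32
// Maps the whole file read-only and returns the time spent inside
// MapViewOfFile() in milliseconds, or -1.0 if any step fails.
double time_map_view(const wchar_t* path) {
    HANDLE file = CreateFileW(path, GENERIC_READ, FILE_SHARE_READ, nullptr,
                              OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (file == INVALID_HANDLE_VALUE) return -1.0;

    HANDLE mapping = CreateFileMappingW(file, nullptr, PAGE_READONLY, 0, 0, nullptr);
    if (mapping == nullptr) {
        CloseHandle(file);
        return -1.0;
    }

    void* view = nullptr;
    // Only the MapViewOfFile() call itself is timed.
    double ms = elapsed_ms([&] {
        view = MapViewOfFile(mapping, FILE_MAP_READ, 0, 0, 0);
    });

    if (view != nullptr) {
        UnmapViewOfFile(view);
    } else {
        ms = -1.0;
    }
    CloseHandle(mapping);
    CloseHandle(file);
    return ms;
}

// Times the mapping before and after the OpenGL driver DLL is loaded.
void run_comparison(const wchar_t* path) {
    double before = time_map_view(path);

    // Assumption: the DLL can be found by name; a full path may be required.
    HMODULE driver = LoadLibraryW(L"nvoglv64.dll");
    if (driver == nullptr) {
        std::printf("LoadLibrary failed, error %lu\n", GetLastError());
    }

    double after = time_map_view(path);
    std::printf("MapViewOfFile before load: %.3f ms\n", before);
    std::printf("MapViewOfFile after load:  %.3f ms\n", after);

    if (driver != nullptr) FreeLibrary(driver);
}
#endif
```

Calling run_comparison() with the path of a large file from main() should be enough to see the difference between the two measurements on an affected system.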

When run on a system with a GeForce card, MapViewOfFile() takes the same amount of time before and after nvoglv64.dll is loaded.

When run on the same system but with a Quadro card, MapViewOfFile() takes three orders of magnitude longer after nvoglv64.dll is loaded.

Using xperf and the Windows Performance Analyzer (trace attached), I’ve traced the problem to nvoglv64.dll, which appears to hook MapViewOfFile(). Immediately after MapViewOfFile() is called, a long time appears to be spent doing ‘something’. As I don’t have access to symbols for nvoglv64.dll, I unfortunately can’t see what that ‘something’ is.
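
One way to cross-check the hooking theory without symbols might be to compare the first bytes of MapViewOfFile() before and after the DLL is loaded. The following is only a sketch under several assumptions: it assumes the hook (if there is one) is an inline patch of the function prologue in kernel32.dll or KernelBase.dll, so an import-table hook would not show up; the 16-byte window is an arbitrary choice; and the names snapshot and check_for_inline_hook are hypothetical helpers, not part of the attached test program.

```cpp
#include <array>
#include <cstddef>
#include <cstdio>
#include <cstring>
#ifdef _WIN32
#include <windows.h>
#endif

constexpr std::size_t kPrologueBytes = 16;  // arbitrary window size
using Prologue = std::array<unsigned char, kPrologueBytes>;

// Copies the first kPrologueBytes bytes found at addr.
Prologue snapshot(const void* addr) {
    Prologue bytes{};
    std::memcpy(bytes.data(), addr, bytes.size());
    return bytes;
}

#ifdef _WIN32
// Reports whether the start of MapViewOfFile() changes when the driver DLL loads.
void check_for_inline_hook() {
    const wchar_t* modules[] = {L"kernel32.dll", L"KernelBase.dll"};
    const void* addrs[2] = {};
    Prologue before[2] = {};

    // Record the function prologue in each module before the driver is loaded.
    for (int i = 0; i < 2; ++i) {
        HMODULE module = GetModuleHandleW(modules[i]);
        FARPROC fn = module ? GetProcAddress(module, "MapViewOfFile") : nullptr;
        addrs[i] = reinterpret_cast<const void*>(fn);
        if (addrs[i] != nullptr) before[i] = snapshot(addrs[i]);
    }

    // Assumption: the DLL can be found by name; a full path may be required.
    HMODULE driver = LoadLibraryW(L"nvoglv64.dll");
    if (driver == nullptr) {
        std::printf("LoadLibrary failed, error %lu\n", GetLastError());
        return;
    }

    // Compare the same bytes again now that the driver is in the process.
    for (int i = 0; i < 2; ++i) {
        if (addrs[i] == nullptr) continue;
        bool changed = snapshot(addrs[i]) != before[i];
        std::printf("%ls!MapViewOfFile: prologue %s\n", modules[i],
                    changed ? "modified" : "unchanged");
    }
    FreeLibrary(driver);
}
#endif
```

An unchanged prologue would not rule out a hook; it would only suggest that any interception happens some other way.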

Can anyone please confirm whether this is a bug or intended behaviour? If it is intended behaviour, can you also please explain:

  • Why Quadro cards are affected but GeForce ones are not
  • Why this behaviour exists
  • If (and how) it is possible to mitigate it

Thanks

Ian

Attachments:
Source code for test program:
map_file_test.cpp (3.1 KB)

xperf trace showing delay after calling MapViewOfFile():
2020-12-18_18-07-10_Administrator.etl (7.2 MB)