I am here to report a potential bug:
It seems that NVIDIA's driver can commit up to several megabytes of RAM (CPU RAM, not VRAM) for each linked GLSL shader program. As one would expect, the amount of RAM consumed is roughly proportional to the complexity of the shader program. However, it still seems much higher than it needs to be. For example, some of our more complex shader programs easily exceed 2 MB each. When dealing with a large number of shaders, this becomes a huge problem.
In our application we generate shaders dynamically, and they often end up being quite complex (Example Vertex and Fragment shader). Furthermore, we deal with large numbers of shaders, in the range of 5k to 20k. The problem we are facing is that the graphics driver allocates up to 15 GB of RAM just for compiled shaders. The question is: is this intended behavior or a bug? We have already double- and triple-checked to make sure this is not a mistake on our end.
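As a rough sanity check on the figures above, the arithmetic can be sketched as follows. The function names are made up for illustration, the inputs are only the approximate numbers quoted in this post (5k to 20k programs, up to 15 GB total, about 2 MB for a complex program), and binary units (1 GiB = 1024 MiB) are assumed.

```cpp
#include <cassert>
#include <cstdint>

// Average RAM per linked program in MiB, implied by a total footprint
// and a program count. Inputs are rough figures, not measurements.
inline double averageMiBPerProgram(double totalRamGiB, std::uint32_t programCount) {
    assert(programCount > 0);
    return totalRamGiB * 1024.0 / static_cast<double>(programCount);
}

// Total RAM in GiB if every one of programCount programs cost mibPerProgram.
inline double totalRamGiB(std::uint32_t programCount, double mibPerProgram) {
    return static_cast<double>(programCount) * mibPerProgram / 1024.0;
}
```

Under these assumptions, averageMiBPerProgram(15.0, 20000) comes out at about 0.77 MiB and averageMiBPerProgram(15.0, 5000) at about 3.07 MiB, so the 15 GB total is consistent with per-program costs in the range of roughly 1 to 3 MB. For comparison, totalRamGiB(20000, 2.0) is about 39 GiB, which suggests that not every program in a 20k set reaches the 2 MB mark.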
I wrote a test application to demonstrate the issue. Source is available here (VS2015). It links one vertex + fragment shader pair 1000 times and then prints the amount of RAM committed by the application. The application itself does not allocate any extra memory. Additionally, the .zip comes with multiple sets of example shaders taken from our application, so you can see the difference in RAM usage. For more details, see main.cpp.
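For readers who do not want to open the project, the measurement idea can be sketched roughly like this. This is not the code from main.cpp; it is a minimal outline under assumptions. `measureLinkCost`, `LinkCost`, `linkOnce` and `committedBytes` are hypothetical names. In a real run, `linkOnce` would create and link one program (glCreateProgram, glAttachShader, glLinkProgram) and keep it alive, and `committedBytes` would return the process's committed memory, for example the PrivateUsage value reported by GetProcessMemoryInfo on Windows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>

// Result of one measurement run.
struct LinkCost {
    std::uint64_t totalBytes;  // growth in committed memory over the whole run
    double bytesPerProgram;    // totalBytes divided by the number of links
};

// Calls linkOnce `iterations` times and reports how much the value returned
// by committedBytes grew. Both callbacks are supplied by the caller
// (hypothetical hooks, see the text above).
inline LinkCost measureLinkCost(std::size_t iterations,
                                const std::function<void()>& linkOnce,
                                const std::function<std::uint64_t()>& committedBytes) {
    assert(iterations > 0);
    const std::uint64_t before = committedBytes();
    for (std::size_t i = 0; i < iterations; ++i) {
        linkOnce();  // linked programs must stay alive until the final reading
    }
    const std::uint64_t after = committedBytes();
    // Clamp at zero in case the counter shrank for unrelated reasons.
    const std::uint64_t delta = after > before ? after - before : 0;
    return {delta, static_cast<double>(delta) / static_cast<double>(iterations)};
}
```

Passing the two operations in as callbacks keeps the loop independent of the windowing and GL setup, so the same harness can be pointed at different shader sets or at the glProgramBinary path.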
Some other observations I made:
Occurs on all driver versions and all Windows versions
RAM usage is proportional to complexity of shader (no surprise here)
Conditionals (if clauses and '?' operator) seem to massively increase RAM usage and compile times
The size of uniform buffer arrays only slightly affects RAM usage
Detaching and deleting shaders (glDetachShader+glDeleteShader) after glLinkProgram helps only a bit
Calling glDeleteProgram() correctly releases all memory, indicating there is no leak
Same problem occurs when the shader programs are loaded via glProgramBinary
Thanks in advance!
Edit: We ran the test application on a different set of GPUs from all three vendors (Nvidia, AMD and Intel). Results are here.
It's likely that what is needed here is much better optimization or compression, rather than only dealing with this if it technically counts as a bug. According to the OP's tests, Intel's driver uses more than double the memory that NVIDIA's does. It stands to reason that NVIDIA may need to find ways to compress the data or omit redundant data, rather than only address this if it deems it a critical bug. Doing nothing can be very detrimental to users, since very few people nowadays can justify more than 8 to 16 GB of RAM.
Something that should also be mentioned is that there are reports saying the problem isn't present in earlier drivers. It seems that the GeForce Game Ready Driver 378.49 (Windows 10 64-bit) of January 23rd, 2017 has RAM usage close to that of OpenGL drivers from other vendors. Hope it gets fixed! I'm trying to make a benchmark, but I'm basically capped by my 16 GB of RAM!
Here is a message I got from the moderator who replied on this thread. For those who want to know:
"I expect this to simply appear in a driver release when it’s done, so all I can recommend is to check newer driver releases regularly. Just read through their release notes for this issue and test if the RAM usage has been improved.
Note that it is not unusual to take multiple months from an initial bug report to a released driver with a fix, depending on the priority and complexity of the required work plus quality assurance and certification processes which need to be run for each release."