When I run CreatePixelShader on about 8 threads simultaneously, it may take 10 ms or longer per thread.
The shader code is around 100 KB in size.
The more threads that run CreatePixelShader, the longer it takes.
Is this phenomenon common on PCs?
I have read in NVIDIA literature about “shader cache warming”, so I understand that this process can take a long time. However, I am not sure how long is normal for it.
Development environment:
OS – Windows 10 Pro
GPU – NVIDIA GeForce RTX 2070 SUPER
CPU – AMD Ryzen 7 3700X 8-Core Processor 3.59 GHz
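To make the "10 ms or longer per thread" figure easier to compare between thread counts, a small timing harness along these lines may help. This is only a sketch: `TimePerThread` and the `work` callback are hypothetical names, and the D3D11 call shown in the comment is where the real `CreatePixelShader` call would go.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Runs `work` once on each of `threadCount` threads and returns the
// wall-clock duration of each call in milliseconds.
// `work` is a placeholder (an assumption for this sketch). In the real
// program it would wrap something like:
//   device->CreatePixelShader(bytecode, bytecodeSize, nullptr, &shader);
std::vector<double> TimePerThread(std::size_t threadCount,
                                  const std::function<void(std::size_t)>& work)
{
    std::vector<double> elapsedMs(threadCount, 0.0);
    std::vector<std::thread> threads;
    threads.reserve(threadCount);

    for (std::size_t i = 0; i < threadCount; ++i) {
        threads.emplace_back([i, &work, &elapsedMs] {
            const auto start = std::chrono::steady_clock::now();
            work(i);
            const auto stop = std::chrono::steady_clock::now();
            // Each thread writes only its own slot, so no locking is needed.
            elapsedMs[i] =
                std::chrono::duration<double, std::milli>(stop - start).count();
        });
    }
    for (auto& t : threads) t.join();
    return elapsedMs;
}
```

Running it with 1, 2, 4 and 8 threads and comparing the per-thread values should show whether the calls scale or serialize somewhere inside the runtime or driver.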
Hello @tsumugi.you and welcome to the NVIDIA developer forums!
If you have an uncompiled shader, CreatePixelShader will first try to compile this shader, which is the part that takes so long. But any shader is created as a rendering resource before the actual render loop for a DirectX program, so the time this takes should not matter much.
If you instantiate different worker threads for shader compilation and do not see a speedup, your shader code might be too complex for the shader compiler to manage in different threads.
Quite honestly, a 100 KB pixel shader seems rather big to me. Maybe there are possibilities for optimization?
Lastly, if you use tools like Visual Studio, you have the option to compile shaders offline and load them at runtime as compiled shader objects (.cso files). Maybe this is a possibility for you.
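As a rough illustration of the offline route: the shader could be compiled ahead of time with something like `fxc /T ps_5_0 /E main /Fo PixelShader.cso PixelShader.hlsl` (the file names, entry point and target profile are assumptions here), and the resulting file read back at startup. The helper below is a minimal sketch with a hypothetical name, `LoadBinaryFile`; it uses only the standard library, and the D3D11 call is shown in a comment.

```cpp
#include <cstdint>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Reads a whole binary file (for example a precompiled .cso) into memory.
std::vector<std::uint8_t> LoadBinaryFile(const std::string& path)
{
    std::ifstream file(path, std::ios::binary);
    if (!file) throw std::runtime_error("cannot open " + path);
    return std::vector<std::uint8_t>(std::istreambuf_iterator<char>(file),
                                     std::istreambuf_iterator<char>());
}

// Possible use with D3D11 (assumes a valid ID3D11Device* named `device`):
//   auto bytecode = LoadBinaryFile("PixelShader.cso");  // hypothetical file
//   ID3D11PixelShader* ps = nullptr;
//   HRESULT hr = device->CreatePixelShader(bytecode.data(), bytecode.size(),
//                                          nullptr, &ps);
```

Note that this only removes the HLSL-to-bytecode step from runtime. Whatever work happens inside `CreatePixelShader` itself still has to be done when the shader object is created.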
“Cache warming” happens in the GPU and has nothing to do with shader compilation. The GPU only has limited memory for shader code. That means if you use a lot of different shaders or very complex ones, the GPU will not keep all the code in GPU registers all the time, but the most often used code will reside in the shader cache. So in this case “cache warming” means the GPU takes a while to figure out which code to keep in the cache and to load it there.
We are passing the code compiled by D3DCompile to CreatePixelShader.
So the length of the compiled code is about 100 KB.
Should I consider this data to be a little too large?
It depends, of course, on what the shader needs to compute. But in general, for typical per-pixel operations, especially in games, 100 kilobytes seems big for a compiled shader, yes. How big is the source code? And did you check whether the shader was maybe compiled with debug information enabled?
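One way to check for debug information is to look at which chunks the compiled blob contains. The sketch below assumes the commonly described DXBC container layout (a "DXBC" magic, the chunk count at byte 28, chunk offsets from byte 32, and each chunk starting with a four-character tag). That layout is not officially documented, and the debug tags "SDBG" and "SPDB" are likewise an assumption, so treat the result as a hint only. `ListDxbcChunks` and `HasDebugChunk` are hypothetical helper names.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Reads a little-endian 32-bit value; returns false if it would run past the end.
static bool ReadU32(const std::vector<std::uint8_t>& b, std::size_t off,
                    std::uint32_t& out)
{
    if (off > b.size() || b.size() - off < 4) return false;
    out = std::uint32_t(b[off]) | (std::uint32_t(b[off + 1]) << 8) |
          (std::uint32_t(b[off + 2]) << 16) | (std::uint32_t(b[off + 3]) << 24);
    return true;
}

// Returns the four-character tags of all chunks in a DXBC container,
// or an empty list if the blob does not look like one.
// Assumed layout: magic at 0, chunk count at 28, offsets from 32.
std::vector<std::string> ListDxbcChunks(const std::vector<std::uint8_t>& blob)
{
    std::vector<std::string> tags;
    if (blob.size() < 32 || std::memcmp(blob.data(), "DXBC", 4) != 0) return tags;

    std::uint32_t chunkCount = 0;
    if (!ReadU32(blob, 28, chunkCount)) return tags;

    for (std::uint32_t i = 0; i < chunkCount; ++i) {
        std::uint32_t chunkOffset = 0;
        if (!ReadU32(blob, 32 + std::size_t(i) * 4, chunkOffset)) break;
        // A chunk header is a 4-byte tag followed by a 4-byte size.
        if (chunkOffset > blob.size() || blob.size() - chunkOffset < 8) break;
        tags.emplace_back(
            reinterpret_cast<const char*>(blob.data() + chunkOffset), 4);
    }
    return tags;
}

// True if any chunk carries one of the (assumed) debug-info tags.
bool HasDebugChunk(const std::vector<std::uint8_t>& blob)
{
    for (const auto& tag : ListDxbcChunks(blob))
        if (tag == "SDBG" || tag == "SPDB") return true;
    return false;
}
```

The supported route is to make sure `D3DCOMPILE_DEBUG` is not among the flags passed to `D3DCompile`, or to run the blob through `D3DStripShader` with `D3DCOMPILER_STRIP_DEBUG_INFO` and compare the sizes before and after.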
There should be no extra information in there, such as debugging information, etc. I didn’t give D3DCompile any flags to do that.
However, I could see that the compiled code is significantly larger than the source code, which is a little strange.
It is about 60 KB before compilation and 110 KB after compilation.
Then I suggest trying to optimize the code. For example, a large number of conditionals in your source can cause the shader bytecode to bloat. I am a bit rusty in the shader writing department, but you might want to search online for more of the common pitfalls in shader code.