Original bug report: https://bugs.chromium.org/p/chromium/issues/detail?id=922029#c6
Original author: email@example.com
Steps to reproduce the problem:
- Unrecognized dead code:
  - Call a costly function many times without using the result.
  - Check the performance and compare it with calling the function in a loop.
- Unrolling vs. loop performance:
  - Call a costly function many times. Compare the performance with calling the function in a loop.
NB: if you have a good GPU, you should duplicate the unrolled lines and multiply the loop count accordingly to aim for 30 fps.
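The two variants in the steps above can be sketched with a small Python generator that emits the Shadertoy GLSL source for each case. This is only an illustration: the body of func, the accumulator acc, and the helper names make_shader and COSTLY_FUNC are hypothetical, since the report does not include the original shader. Passing the loop index to func is also an assumption about how the original call was written.

```python
# Hypothetical generator for the Shadertoy variants described in the steps.
# The real shader from the report is not available; func's body is a stand-in.

COSTLY_FUNC = """\
// Hypothetical costly function (assumption, not the reporter's code).
float func(vec2 p) {
    float s = 0.0;
    for (int k = 0; k < 64; k++) {
        s += sin(p.x * float(k)) * cos(p.y + s);
    }
    return s;
}
"""


def make_shader(calls: int, unrolled: bool, use_result: bool = True) -> str:
    """Return GLSL source calling func `calls` times, unrolled or in a loop."""
    if unrolled:
        # One explicit call per line, as in the manually unrolled version.
        body = "".join(f"    acc += func(uv + {i}.0);\n" for i in range(calls))
    else:
        # Same work expressed as a loop with a constant bound.
        body = (
            f"    for (int i = 0; i < {calls}; i++) {{\n"
            f"        acc += func(uv + float(i));\n"
            f"    }}\n"
        )
    # use_result=False gives the dead-code case: the result is never used.
    final = "vec4(acc)" if use_result else "vec4(0)"
    return (
        COSTLY_FUNC
        + "void mainImage(out vec4 fragColor, in vec2 fragCoord) {\n"
        + "    vec2 uv = fragCoord / iResolution.xy;\n"
        + "    float acc = 0.0;\n"
        + body
        + f"    fragColor = {final};\n"
        + "}\n"
    )


if __name__ == "__main__":
    # Unrolled version with its result discarded (dead-code case).
    print(make_shader(30, unrolled=True, use_result=False))
```

Pasting the output of make_shader(30, unrolled=True) and make_shader(800, unrolled=False) into Shadertoy gives the two cases to compare. Raising the calls argument is one way to scale the load toward 30 fps on a faster GPU.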
What is the expected behavior?
- Maximum fps when no result is used (final fragColor = vec4(0)).
- About the same performance for the loop and the manually unrolled version (at least not a performance ratio of 25).
What went wrong?
- Dead code not recognized:
  - When the end of the shader is replaced by fragColor = vec4(0), the cost stays mostly the same (whereas the loop version reaches 60 fps).
- Performance difference with manual unrolling:
  - On my machine, 30 explicit calls to func cost as much as a loop of 800 iterations (func does depend on the loop invariant).
Did this work before? Yes, with NVIDIA driver 384.130
Note that it is probably an NVIDIA driver bug, possibly due to the new NVIDIA GLSL/SPIR-V compiler:
I don’t have the bug on linux/nvidia with driver 384.130
I have the bug on linux/nvidia with driver 396.54
I have the bug on linux/nvidia with driver 390.77
I have the bug on linux/nvidia with driver 410.78
I don’t have the bug on windows/nvidia in both true OpenGL and ANGLE modes with driver 375.86
I have the bug on windows/nvidia in true OpenGL mode (ANGLE off) with the very latest driver.
- The Shadertoy version reproduces on Windows 10, 416.34, GTX 1080, Chrome/73.0.3672.1, --use-angle=gl
- Repro observations from the configuration above: in Task Manager, the loop version shows a GPU utilization of 85%, while the non-loop version shows a GPU utilization of 100%.