Performance advantages of direct_state_access

Hello,

I was wondering: does DSA offer any additional optimizations that the driver can exploit, beyond the “fewer GL calls = less overhead” gain?
Basically, are there potential stalls or implicit synchronizations (e.g. of buffer objects) that could be avoided by using DSA instead of the classic bind-to-modify convention?

You would expect there to be performance advantages, but everything I’ve read says DSA performs the same or slightly slower. So at this point it is mostly a clearer, more convenient path for the programmer.
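
To make the "clearer, more convenient" point concrete, here is a minimal sketch of the two calling styles. It is a toy model in plain C, not real OpenGL: MockContext and every mock_* / update_* function are hypothetical names made up for illustration. The comments name the real entry points they stand in for (glBindBuffer, glBufferSubData, and glNamedBufferSubData from GL 4.5 / ARB_direct_state_access). It says nothing about driver-side cost; it only shows the difference in call count and in how the binding state is touched.

```c
#include <stddef.h>
#include <string.h>

/* Toy model of GL buffer state. NOT real OpenGL; all names are hypothetical. */
enum { MAX_BUFFERS = 4, BUFFER_BYTES = 16 };

typedef struct {
    unsigned bound_array_buffer;                      /* stands in for the GL_ARRAY_BUFFER binding */
    unsigned char storage[MAX_BUFFERS][BUFFER_BYTES]; /* one small data store per buffer name */
    int calls;                                        /* number of mock API calls issued */
} MockContext;

/* Stands in for glBindBuffer(GL_ARRAY_BUFFER, name). */
static void mock_bind_buffer(MockContext *ctx, unsigned name) {
    ctx->bound_array_buffer = name;
    ctx->calls++;
}

/* Stands in for glBufferSubData: writes to whichever buffer is currently bound. */
static void mock_buffer_sub_data(MockContext *ctx, size_t offset,
                                 size_t size, const void *data) {
    memcpy(&ctx->storage[ctx->bound_array_buffer][offset], data, size);
    ctx->calls++;
}

/* Stands in for glNamedBufferSubData: the target buffer is named explicitly. */
static void mock_named_buffer_sub_data(MockContext *ctx, unsigned name,
                                       size_t offset, size_t size,
                                       const void *data) {
    memcpy(&ctx->storage[name][offset], data, size);
    ctx->calls++;
}

/* Classic bind-to-modify: save the binding, bind, modify, restore. */
static void update_classic(MockContext *ctx, unsigned name,
                           const void *data, size_t size) {
    unsigned previous = ctx->bound_array_buffer; /* real code would query or track this */
    mock_bind_buffer(ctx, name);
    mock_buffer_sub_data(ctx, 0, size, data);
    mock_bind_buffer(ctx, previous);
}

/* DSA style: one call, and the current binding is never touched. */
static void update_dsa(MockContext *ctx, unsigned name,
                       const void *data, size_t size) {
    mock_named_buffer_sub_data(ctx, name, 0, size, data);
}
```

In this model, update_classic issues three calls and has to take care to put the old binding back, while update_dsa issues one call and cannot disturb the binding at all. That is the convenience argument; whether a real driver turns it into a speed difference is a separate question.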

It’s likely that the old method is faster simply because it has been around longer, so driver developers have had more time to optimize it. DSA might get faster in the future, or it might not.