Performance advantages of direct_state_access

Hello,

I was wondering whether DSA offers any additional optimizations that the driver can exploit beyond the “fewer GL calls = less overhead” gain.
Basically, are there potential stalls or implicit synchronizations, e.g. of buffer objects, that could be avoided by using DSA instead of the classic bind-to-modify convention?