Why is __syncwarp necessary in a warp reduction with no divergence?

I understand why the warp may be diverged at statement B. It is caused by independent thread scheduling, a feature introduced in Volta: because there is an if statement in the first line, the warp is not guaranteed to reconverge by line 2.
But my question is: if there is no divergent code before the threads exchange data through shared memory, is it still necessary to use __syncwarp?
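
To make the question concrete, here is a minimal sketch of the kind of reduction I mean. It is an illustration only, not the code the statements above refer to: the names `warp_reduce_shared` and `run_warp_reduce`, the single-warp launch, and the zero-padded 64-element buffer are all assumptions. No branch precedes the shared-memory exchange, and the `__syncwarp()` calls are the ones whose necessity I am asking about.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: a single warp (32 threads) sums 32 ints through
// shared memory. The loop is uniform across lanes, so there is no
// source-level divergence before the data exchange.
__global__ void warp_reduce_shared(const int *in, int *out)
{
    // Upper half is zero padding so that shmem[tid + offset] stays in bounds.
    __shared__ int shmem[64];
    unsigned tid = threadIdx.x;

    int v = in[tid];
    shmem[tid] = v;
    shmem[tid + 32] = 0;
    __syncwarp();   // barrier + memory ordering among the lanes of the warp

    for (unsigned offset = 16; offset > 0; offset >>= 1) {
        v += shmem[tid + offset];   // read a value written by another lane
        __syncwarp();               // every lane has read before any lane overwrites
        shmem[tid] = v;
        __syncwarp();               // every lane has written before the next read
    }

    if (tid == 0) *out = v;         // lane 0 holds the full sum
}

// Hypothetical host helper: reduces the values 0..31 and returns the result.
int run_warp_reduce()
{
    int h_in[32], h_out = -1;
    for (int i = 0; i < 32; ++i) h_in[i] = i;

    int *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, sizeof(h_in));
    cudaMalloc(&d_out, sizeof(int));
    cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice);

    warp_reduce_shared<<<1, 32>>>(d_in, d_out);

    cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return h_out;
}
```

With the inputs 0 through 31, `run_warp_reduce()` should return 496 when the synchronization is in place. What I want to know is whether removing the `__syncwarp()` calls is still safe on Volta and later when nothing in the source diverges.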