subgroupBallot() from GL_KHR_shader_subgroup returns apparently incorrect results

As the title suggests, I’m fairly certain that the GLSL compiler in the driver (version 511.65, tested on a Titan RTX) sometimes produces incorrect code for subgroupBallot(). The cause appears to be overly aggressive optimization that moves the ballot call out of its intended scope and into another. The following shader is the simplest reproducing example I found. As far as I can tell, the extension does not require the barriers here; I included them to show that the usual precautions do not prevent the issue. Also note the pack/unpack round trip on the vector, which should be a no-op yet is part of the minimal reproduction.

#version 460
#extension GL_KHR_shader_subgroup_basic : require
#extension GL_KHR_shader_subgroup_ballot : require
#extension GL_ARB_gpu_shader_int64 : require

layout(local_size_x = 32) in;
layout(std430, binding = 0) buffer results { // binding index 0 is an assumption
    uint result[2];
};

void main() {
    uvec2 bits = uvec2(subgroupBallot(gl_SubgroupInvocationID < 16).x, 0);
    subgroupBarrier();
    // Replacing this ballot with the literal 0xffffu makes result[1] come out as 0xffffu instead of 0xffffffffu.
    bits.y = subgroupBallot(gl_SubgroupInvocationID < 16).x;
    subgroupBarrier();
    bits = unpackUint2x32(packUint2x32(bits));
    if (subgroupElect()) {
        result[0] = bits.x; // writes 0x0000ffffu, as expected
        result[1] = bits.y; // writes 0xffffffffu; expected 0x0000ffffu again (same predicate, same scope)
    }
}
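
For reference, the expected mask can be cross-checked without relying on subgroupBallot() alone. The shader below is a minimal sketch, not tested on the affected driver. It assumes a subgroup size of at most 32, the additional GL_KHR_shader_subgroup_arithmetic extension, and binding index 0 for the buffer; the helper name referenceBallot is hypothetical.

```glsl
#version 460
#extension GL_KHR_shader_subgroup_basic : require
#extension GL_KHR_shader_subgroup_ballot : require
#extension GL_KHR_shader_subgroup_arithmetic : require

layout(local_size_x = 32) in;
layout(std430, binding = 0) buffer results { // binding index 0 is an assumption
    uint result[2];
};

// Hypothetical helper: rebuilds the low 32 bits of a ballot with subgroupOr().
// Assumes gl_SubgroupSize <= 32, so every invocation ID fits into one uint.
uint referenceBallot(bool value) {
    uint myBit = value ? (1u << gl_SubgroupInvocationID) : 0u;
    return subgroupOr(myBit); // OR of the bits contributed by all active invocations
}

void main() {
    bool pred = gl_SubgroupInvocationID < 16;
    uint viaBallot = subgroupBallot(pred).x;
    uint viaOr = referenceBallot(pred);
    if (subgroupElect()) {
        result[0] = viaBallot; // mask reported by the built-in ballot
        result[1] = viaOr;     // mask rebuilt from per-invocation bits
    }
}
```

With a full 32-wide subgroup, a conforming implementation should write 0x0000ffffu to both entries, so any difference between them points at the ballot path rather than at the predicate.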

I’ve found a workaround that I’m using for now: the compiler seems to respect the volatile keyword enough to leave the ballot where it is.

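// Routes the ballot through a volatile local, which seems to keep the compiler from moving it.
uvec4 safeBallot(bool value) {
    volatile uvec4 result = subgroupBallot(value);
    return uvec4(result);
}
// Must come after safeBallot() so that the wrapper itself still calls the built-in.
#define subgroupBallot safeBallot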