Bug report: GLSL compiler bug - simple tessellation shader - Nvidia Linux Driver 370.28

It seems to me that I’ve stumbled upon a GLSL compiler bug in Nvidia Linux Driver 367.35. I’m attaching the following files:

  • wine-tess.trace - an apitrace file that reproduces the issue; it can be replayed using apitrace [1];
  • tess.glsl - the source code of the Tessellation Control Shader that produces the unexpected results;
  • broken-shader-binary.bin - the binary returned by glGetProgramBinary() on the Nvidia driver.

The attached trace is quite simple and small. It’s recorded from a very basic program which uses tessellation shaders. This program is expected to render a quad. However, when run with Nvidia drivers the result is a black screen. The expected image was obtained when the program was run on a different GPU.

The program binary returned by glGetProgramBinary() on Nvidia drivers contains the source code for the shaders in NVIDIA’s 5th generation assembly instruction set (GL_NV_gpu_program5). The assembly generated for tess.glsl looks like an incorrect translation of the source GLSL. Based on this, I suspect that the issue is in the GLSL compiler.

Let’s try to analyze the following part of the generated assembly:

MOV.U R2.x, {0, 0, 0, 0}; // R2.x = 0
MOV.U R0.x, primitive.invocation; // R0.x = InvocationID
MOV.U R2.y, {1, 0, 0, 0}.x; // R2.y = 1
MOV.F lmem[0], vertex_attrib[R0.x][R2.x]; // lmem[0] = attrib[0]
MOV.F lmem[1], vertex_attrib[R0.x][R2.y]; // lmem[1] = attrib[1]
MOV.F R0, lmem[0].xyzw; // R0 = attrib[0]
MOV.F R1, lmem[1].xyzw; // R1 = attrib[1]
MOV.F result.attrib[0], R0; // result.attrib[0] = attrib[0]
MOV.F lmem[R2.x].x, {1, 0, 0, 0}; // lmem[0].x = 1
// To this point everything looks ok
MOV.F result.patch.tessouter[0].x, R0; // tessouter[0].x = R0.x = attrib[0].x - this is wrong

The shader writes attrib[0].x to tessouter[0].x while it should write 1.0 to tessouter[0].x. The value 1.0 is correctly stored to lmem[0].x, but the new value of lmem[0].x is never loaded back into R0.x, which is used as the source for tessouter[0].x. Everything would be correct if an additional “MOV.F R0, lmem[0].xyzw” were inserted before the last instruction, but R0 keeps the old value of lmem[0], and this stale value is then written to tessouter[0].x. It looks like an incorrect optimization in the GLSL compiler.
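
To make the stale-register problem concrete, here is a toy Python model of the excerpt above. It is only a sketch of how I read the listing, not driver code: the function name, the reload_after_store flag, and the input values are made up for illustration, and lmem and the registers are modeled as plain lists.

```python
def model_tcs_excerpt(attrib0, attrib1, reload_after_store):
    """Toy model of the quoted assembly excerpt (hypothetical helper)."""
    r2_x = 0                                  # MOV.U R2.x, {0, 0, 0, 0}
    r2_y = 1                                  # MOV.U R2.y, {1, 0, 0, 0}.x
    attribs = [attrib0, attrib1]
    lmem = [None, None]
    lmem[0] = list(attribs[r2_x])             # MOV.F lmem[0], vertex_attrib[R0.x][R2.x]
    lmem[1] = list(attribs[r2_y])             # MOV.F lmem[1], vertex_attrib[R0.x][R2.y]
    r0 = list(lmem[0])                        # MOV.F R0, lmem[0].xyzw (a copy, not an alias)
    r1 = list(lmem[1])                        # MOV.F R1, lmem[1].xyzw (unused in this excerpt)
    result_attrib0 = list(r0)                 # MOV.F result.attrib[0], R0
    lmem[r2_x][0] = 1.0                       # MOV.F lmem[R2.x].x, {1, 0, 0, 0}
    if reload_after_store:
        r0 = list(lmem[0])                    # the missing MOV.F R0, lmem[0].xyzw
    tessouter0_x = r0[0]                      # MOV.F result.patch.tessouter[0].x, R0
    return result_attrib0, tessouter0_x


# Hypothetical input attributes, chosen only so that attrib[0].x != 1.0.
a0 = [0.25, 0.5, 0.0, 1.0]
a1 = [0.0, 0.0, 0.0, 1.0]
print(model_tcs_excerpt(a0, a1, reload_after_store=False))  # as generated
print(model_tcs_excerpt(a0, a1, reload_after_store=True))   # with the reload
```

Without the reload the model returns attrib[0].x as the outer level, which matches what the generated assembly does; with the reload it returns 1.0, which is what the GLSL source asks for.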

tess.glsl:

#version 440
#extension GL_ARB_tessellation_shader : enable
#extension GL_ARB_shader_bit_encoding : enable
#extension GL_ARB_texture_query_levels : enable
#extension GL_ARB_uniform_buffer_object : enable
#extension GL_EXT_gpu_shader4 : enable
layout(vertices = 4) out;
in vs_hs_iface { vec4 hs_in[32]; } hs_in[];
vec4 hs_out[32];
vec4 R0;
vec4 tmp0;
vec4 tmp1;
out hs_ds_iface { vec4 ds_in[32]; } ds_in[];
void setup_hs_output(in vec4 shader_out[32])
{
    ds_in[gl_InvocationID].ds_in[0].xyzw = shader_out[0].xyzw;
    ds_in[gl_InvocationID].ds_in[1].xyzw = shader_out[1].xyzw;
}
vec4 invocation_id = vec4(intBitsToFloat(gl_InvocationID), 0, 0, 0);
void hs_fork_phase0(vec4 phase_id)
{
    R0.x = (phase_id.x);
    hs_out[floatBitsToInt(R0).x + 0].x = (1.00000000e+00);
}
void hs_fork_phase1(vec4 phase_id)
{
    R0.x = (phase_id.x);
    hs_out[floatBitsToInt(R0).x + 4].x = (1.00000000e+00);
}
void hs_control_point_phase()
{
    hs_out[0] = hs_in[gl_InvocationID].hs_in[0];
    hs_out[1] = hs_in[gl_InvocationID].hs_in[1];
}
void main()
{
    hs_control_point_phase();
    setup_hs_output(hs_out);
    barrier();
    for (int i = 0; i < 4; ++i)
    {
        vec4 phase_id = vec4(intBitsToFloat(i), 0, 0, 0);
        hs_fork_phase0(phase_id);
    }
    for (int i = 0; i < 2; ++i)
    {
        vec4 phase_id = vec4(intBitsToFloat(i), 0, 0, 0);
        hs_fork_phase1(phase_id);
    }
    gl_TessLevelOuter[0] = hs_out[0].x;
    gl_TessLevelOuter[1] = hs_out[1].x;
    gl_TessLevelOuter[2] = hs_out[2].x;
    gl_TessLevelOuter[3] = hs_out[3].x;
    gl_TessLevelInner[0] = hs_out[4].x;
    gl_TessLevelInner[1] = hs_out[5].x;
}
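
For reference, the values the shader above should produce can be worked out with a small Python port of main(). This is a sketch under assumptions: the helper names are mine, the bit casts are modeled with the struct module, hs_out is zero-initialized (the GLSL global is uninitialized, but every element that is read gets written first), and the input attributes are arbitrary.

```python
import struct


def int_bits_to_float(i):
    # Reinterpret the bits of a 32-bit int as a float, like GLSL intBitsToFloat().
    return struct.unpack("<f", struct.pack("<i", i))[0]


def float_bits_to_int(f):
    # Reinterpret the bits of a 32-bit float as an int, like GLSL floatBitsToInt().
    return struct.unpack("<i", struct.pack("<f", f))[0]


def expected_tess_levels(attrib0, attrib1):
    """Port of main() from tess.glsl for one invocation (hypothetical helper)."""
    hs_out = [[0.0] * 4 for _ in range(32)]   # assumption: zero-initialized
    r0 = [0.0] * 4

    # hs_control_point_phase()
    hs_out[0] = list(attrib0)
    hs_out[1] = list(attrib1)

    # hs_fork_phase0 for i = 0..3: hs_out[i].x = 1.0
    for i in range(4):
        r0[0] = int_bits_to_float(i)
        hs_out[float_bits_to_int(r0[0]) + 0][0] = 1.0

    # hs_fork_phase1 for i = 0..1: hs_out[i + 4].x = 1.0
    for i in range(2):
        r0[0] = int_bits_to_float(i)
        hs_out[float_bits_to_int(r0[0]) + 4][0] = 1.0

    outer = [hs_out[k][0] for k in range(4)]  # gl_TessLevelOuter[0..3]
    inner = [hs_out[k][0] for k in (4, 5)]    # gl_TessLevelInner[0..1]
    return outer, inner


print(expected_tess_levels([0.25, 0.5, 0.0, 1.0], [0.0, 0.0, 0.0, 1.0]))
```

Every level that main() reads is overwritten with 1.0 by one of the fork phases, so all four outer and both inner levels should be 1.0 regardless of the input attributes.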

wine-tess.trace: https://www.dropbox.com/s/fwxn446ncot1wzy/wine-tess.trace?dl=0
broken-shader-binary.bin: https://www.dropbox.com/s/hokkr4c0ut46t2i/broken-shader-binary.bin?dl=0

[1] - https://github.com/apitrace/apitrace

The bug is still present in 370.28

This is being tracked in NVIDIA bug 1826137