Artifacts when resolving MSAA depth buffer and reading in later render pass

OS: Arch Linux
Driver version: 525.89.02
GPU: GeForce GTX 970

I recently added MSAA support to my renderer that uses dynamic rendering. I need to render in two passes:

  1. Main render pass. MSAA toggleable on/off.
  2. Second render pass. No MSAA.
    • Writes to color buffer from previous pass.
    • Reads depth buffer from previous pass to:
      • Do depth tests.
      • Sample it in fragment shaders (e.g. to implement soft particles).

Everything works as expected on the Intel and AMD GPUs I have available for testing, with MSAA both enabled and disabled.

However, on the Nvidia GPU I am testing on, everything works as expected when MSAA is disabled but not when it is enabled. Depth testing does not seem to work reliably in the second render pass: soft particles, for example, mostly do not show up. Depending on resolution, they are visible in some parts of the screen, sometimes with a checkerboard-like pattern as in the following image:

Disabling depth testing in the soft particle pipelines makes the particles always show up. Sampling the depth buffer in the fragment shader seems to always work. As can be seen in the screenshot, I can also sample the resolved depth buffer without problems when drawing it in a debug overlay in a later render pass.

I do not get any warnings from the validation layer with VK_VALIDATION_FEATURE_ENABLE_SYNCHRONIZATION_VALIDATION_EXT enabled, on any of the GPUs, regardless of whether MSAA is enabled or not. So I am not sure what is wrong. Is the validation layer not validating barriers correctly in this case? Or is there maybe a bug in the Nvidia driver?

Below is how I set up my render passes with dynamic rendering (copied from the renderer and modified to make sense “standalone”; hopefully I made no mistakes when I extracted it):

/////////////////////////////

// Main render images and views. VK_SAMPLE_COUNT_1_BIT if MSAA disabled. VK_SAMPLE_COUNT_X_BIT, X > 1, if MSAA enabled.
VkImage     color_image = ...;
VkImageView color_image_view = ...;
VkImage     depth_image = ...;
VkImageView depth_image_view = ...;

// Resolved images and views. Always VK_SAMPLE_COUNT_1_BIT sample. VK_NULL_HANDLE if MSAA disabled.
VkImage     color_resolved_image = ...;
VkImageView color_resolved_image_view = ...;
VkImage     depth_resolved_image = ...;
VkImageView depth_resolved_image_view = ...;

////////////////////////////// Main render pass start...
{
    VkImageMemoryBarrier color_barriers[2];
    u32 color_barriers_count = 1;
    color_barriers[0] = {.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER};
    color_barriers[0].srcAccessMask = VK_ACCESS_MEMORY_READ_BIT;
    color_barriers[0].dstAccessMask = VK_ACCESS_COLOR_ATTACHMENT_READ_BIT | VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;
    color_barriers[0].oldLayout = VK_IMAGE_LAYOUT_UNDEFINED;
    color_barriers[0].newLayout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;
    color_barriers[0].srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    color_barriers[0].dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    color_barriers[0].image = color_image;
    color_barriers[0].subresourceRange = {VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1};
    if (color_resolved_image) {
        color_barriers[1] = {.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER};
        color_barriers[1].srcAccessMask = VK_ACCESS_MEMORY_READ_BIT;
        color_barriers[1].dstAccessMask = VK_ACCESS_COLOR_ATTACHMENT_READ_BIT | VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;
        color_barriers[1].oldLayout = VK_IMAGE_LAYOUT_UNDEFINED;
        color_barriers[1].newLayout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;
        color_barriers[1].srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
        color_barriers[1].dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
        color_barriers[1].image = color_resolved_image;
        color_barriers[1].subresourceRange = {VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1};
        color_barriers_count++;
    }

    vkCmdPipelineBarrier(command_buffer,
                         VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT,
                         VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
                         VK_DEPENDENCY_BY_REGION_BIT,
                         0,
                         nullptr,
                         0,
                         nullptr,
                         color_barriers_count,
                         color_barriers);

    VkImageMemoryBarrier depth_barrier = {.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER};
    depth_barrier.srcAccessMask = VK_ACCESS_SHADER_READ_BIT;
    depth_barrier.dstAccessMask = VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT;
    depth_barrier.oldLayout = VK_IMAGE_LAYOUT_UNDEFINED;
    depth_barrier.newLayout = VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_OPTIMAL;
    depth_barrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    depth_barrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    depth_barrier.image = depth_image;
    depth_barrier.subresourceRange = {VK_IMAGE_ASPECT_DEPTH_BIT, 0, 1, 0, 1};

    vkCmdPipelineBarrier(command_buffer,
                         VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,
                         VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT,
                         VK_DEPENDENCY_BY_REGION_BIT,
                         0,
                         nullptr,
                         0,
                         nullptr,
                         1,
                         &depth_barrier);

    if (depth_resolved_image) {
        VkImageMemoryBarrier depth_resolve_barrier = {.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER};
        depth_resolve_barrier.srcAccessMask = VK_ACCESS_SHADER_READ_BIT;
        depth_resolve_barrier.dstAccessMask = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;
        depth_resolve_barrier.oldLayout = VK_IMAGE_LAYOUT_UNDEFINED;
        depth_resolve_barrier.newLayout = VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_OPTIMAL;
        depth_resolve_barrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
        depth_resolve_barrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
        depth_resolve_barrier.image = depth_resolved_image;
        depth_resolve_barrier.subresourceRange = {VK_IMAGE_ASPECT_DEPTH_BIT, 0, 1, 0, 1};

        vkCmdPipelineBarrier(command_buffer,
                             VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,
                             VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
                             VK_DEPENDENCY_BY_REGION_BIT,
                             0,
                             nullptr,
                             0,
                             nullptr,
                             1,
                             &depth_resolve_barrier);
    }

    VkRenderingAttachmentInfo color_attachment = {.sType = VK_STRUCTURE_TYPE_RENDERING_ATTACHMENT_INFO};
    color_attachment.imageView = color_image_view;
    color_attachment.imageLayout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;
    if (color_resolved_image) {
        color_attachment.resolveMode = VK_RESOLVE_MODE_AVERAGE_BIT;
        color_attachment.resolveImageView = color_resolved_image_view;
        color_attachment.resolveImageLayout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;
    }
    color_attachment.loadOp = VK_ATTACHMENT_LOAD_OP_CLEAR;
    color_attachment.storeOp = VK_ATTACHMENT_STORE_OP_STORE;
    color_attachment.clearValue = {0.0f, 0.0f, 0.0f, 1.0f};

    VkRenderingAttachmentInfo depth_attachment = {.sType = VK_STRUCTURE_TYPE_RENDERING_ATTACHMENT_INFO};
    depth_attachment.imageView = depth_image_view;
    depth_attachment.imageLayout = VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_OPTIMAL;
    if (depth_resolved_image) {
        depth_attachment.resolveMode = VK_RESOLVE_MODE_MIN_BIT;
        depth_attachment.resolveImageView = depth_resolved_image_view;
        depth_attachment.resolveImageLayout = VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_OPTIMAL;
    }
    depth_attachment.loadOp = VK_ATTACHMENT_LOAD_OP_CLEAR;
    depth_attachment.storeOp = VK_ATTACHMENT_STORE_OP_STORE;
    depth_attachment.clearValue.depthStencil = {1.0f, 0};

    VkRenderingInfo rendering_info = {.sType = VK_STRUCTURE_TYPE_RENDERING_INFO};
    rendering_info.renderArea.extent = {width, height};
    rendering_info.layerCount = 1;
    rendering_info.colorAttachmentCount = 1;
    rendering_info.pColorAttachments = &color_attachment;
    rendering_info.pDepthAttachment = &depth_attachment;

    vkCmdBeginRendering(command_buffer, &rendering_info);
}
////////////////////////////// Main render pass rendering...
{
    vkCmdEndRendering(command_buffer);

    VkImageMemoryBarrier color_barriers[2];
    u32 color_barriers_count = 1;
    color_barriers[0] = {.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER};
    color_barriers[0].srcAccessMask = VK_ACCESS_COLOR_ATTACHMENT_READ_BIT | VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;
    color_barriers[0].dstAccessMask = VK_ACCESS_MEMORY_READ_BIT;
    color_barriers[0].oldLayout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;
    color_barriers[0].newLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;
    color_barriers[0].srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    color_barriers[0].dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    color_barriers[0].image = color_image;
    color_barriers[0].subresourceRange = {VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1};
    if (color_resolved_image) {
        color_barriers[1] = {.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER};
        color_barriers[1].srcAccessMask = VK_ACCESS_COLOR_ATTACHMENT_READ_BIT | VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;
        color_barriers[1].dstAccessMask = VK_ACCESS_MEMORY_READ_BIT;
        color_barriers[1].oldLayout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;
        color_barriers[1].newLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;
        color_barriers[1].srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
        color_barriers[1].dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
        color_barriers[1].image = color_resolved_image;
        color_barriers[1].subresourceRange = {VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1};
        color_barriers_count++;
    }

    vkCmdPipelineBarrier(command_buffer,
                         VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
                         VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT | VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
                         VK_DEPENDENCY_BY_REGION_BIT,
                         0,
                         nullptr,
                         0,
                         nullptr,
                         color_barriers_count,
                         color_barriers);

    VkImageMemoryBarrier depth_barrier = {.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER};
    depth_barrier.srcAccessMask = VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT;
    depth_barrier.dstAccessMask = VK_ACCESS_SHADER_READ_BIT;
    depth_barrier.oldLayout = VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_OPTIMAL;
    depth_barrier.newLayout = VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_OPTIMAL;
    depth_barrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    depth_barrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    depth_barrier.image = depth_image;
    depth_barrier.subresourceRange = {VK_IMAGE_ASPECT_DEPTH_BIT, 0, 1, 0, 1};

    vkCmdPipelineBarrier(command_buffer,
                         VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT,
                         VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,
                         VK_DEPENDENCY_BY_REGION_BIT,
                         0,
                         nullptr,
                         0,
                         nullptr,
                         1,
                         &depth_barrier);

    if (depth_resolved_image) {
        VkImageMemoryBarrier depth_resolve_barrier = {.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER};
        depth_resolve_barrier.srcAccessMask = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;
        depth_resolve_barrier.dstAccessMask = VK_ACCESS_SHADER_READ_BIT;
        depth_resolve_barrier.oldLayout = VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_OPTIMAL;
        depth_resolve_barrier.newLayout = VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_OPTIMAL;
        depth_resolve_barrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
        depth_resolve_barrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
        depth_resolve_barrier.image = depth_resolved_image;
        depth_resolve_barrier.subresourceRange = {VK_IMAGE_ASPECT_DEPTH_BIT, 0, 1, 0, 1};

        vkCmdPipelineBarrier(command_buffer,
                             VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
                             VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,
                             VK_DEPENDENCY_BY_REGION_BIT,
                             0,
                             nullptr,
                             0,
                             nullptr,
                             1,
                             &depth_resolve_barrier);
    }
}
////////////////////////////// Main render pass done. Start second render pass that only reads depth buffer from main render pass.
{
    VkImageMemoryBarrier color_barrier = {.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER};
    color_barrier.srcAccessMask = VK_ACCESS_MEMORY_READ_BIT;
    color_barrier.dstAccessMask = VK_ACCESS_COLOR_ATTACHMENT_READ_BIT | VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;
    color_barrier.oldLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;
    color_barrier.newLayout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;
    color_barrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    color_barrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    color_barrier.image = color_resolved_image ? color_resolved_image : color_image;
    color_barrier.subresourceRange = {VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1};

    vkCmdPipelineBarrier(command_buffer,
                         VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT,
                         VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
                         VK_DEPENDENCY_BY_REGION_BIT,
                         0,
                         nullptr,
                         0,
                         nullptr,
                         1,
                         &color_barrier);

    VkRenderingAttachmentInfo color_attachment = {.sType = VK_STRUCTURE_TYPE_RENDERING_ATTACHMENT_INFO};
    color_attachment.imageView = color_resolved_image ? color_resolved_image_view : color_image_view;
    color_attachment.imageLayout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;
    color_attachment.loadOp = VK_ATTACHMENT_LOAD_OP_LOAD;
    color_attachment.storeOp = VK_ATTACHMENT_STORE_OP_STORE;

    VkRenderingAttachmentInfo depth_attachment = {.sType = VK_STRUCTURE_TYPE_RENDERING_ATTACHMENT_INFO};
    depth_attachment.imageView = depth_resolved_image ? depth_resolved_image_view : depth_image_view;
    depth_attachment.imageLayout = VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_OPTIMAL;
    depth_attachment.loadOp = VK_ATTACHMENT_LOAD_OP_LOAD;
    depth_attachment.storeOp = VK_ATTACHMENT_STORE_OP_NONE;

    VkRenderingInfo rendering_info = {.sType = VK_STRUCTURE_TYPE_RENDERING_INFO};
    rendering_info.renderArea.extent = {width, height};
    rendering_info.layerCount = 1;
    rendering_info.colorAttachmentCount = 1;
    rendering_info.pColorAttachments = &color_attachment;
    rendering_info.pDepthAttachment = &depth_attachment;

    vkCmdBeginRendering(command_buffer, &rendering_info);
}
////////////////////////////// Second render pass rendering...
{
    vkCmdEndRendering(command_buffer);

    VkImageMemoryBarrier color_barrier = {.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER};
    color_barrier.srcAccessMask = VK_ACCESS_COLOR_ATTACHMENT_READ_BIT | VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;
    color_barrier.dstAccessMask = VK_ACCESS_MEMORY_READ_BIT;
    color_barrier.oldLayout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;
    color_barrier.newLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;
    color_barrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    color_barrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    color_barrier.image = color_resolved_image ? color_resolved_image : color_image;
    color_barrier.subresourceRange = {VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1};

    vkCmdPipelineBarrier(command_buffer,
                         VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
                         VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT | VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
                         VK_DEPENDENCY_BY_REGION_BIT,
                         0,
                         nullptr,
                         0,
                         nullptr,
                         1,
                         &color_barrier);
}
////////////////////////////// Second render pass done.

Another thing I am a bit confused about: when using VkRenderPass, I had dependencies set up to handle SYNC-HAZARD-WRITE_AFTER_WRITE as described in the Synchronization Examples page of the KhronosGroup/Vulkan-Docs wiki on GitHub. I also had similar dependencies for SYNC-HAZARD-READ_AFTER_WRITE to handle depth tests and reading of the depth buffer in the second render pass. When migrating to dynamic rendering, I could not see how to translate these to image barriers. But since everything seemed to work without any warnings from the validation layer, I hoped things were still OK. Everything worked as expected until I added MSAA and tried it on the Nvidia GPU. Not sure if it is relevant but… maybe it is?

Originally posted over at Khronos Discourse forum:

But figured I should post here as well since it may be a Nvidia specific problem.

Updated to 530.41.03. Problem still persists.

Hello @Martin4096 and welcome to the NVIDIA developer forums!

I am not able to help with this myself, but I will try to bring this to the attention of some of our Vulkan experts.

Thanks!

Hello @MarkusHoHo,

Any update on this? Any help would be much appreciated.

Hi Martin4096,

unfortunately no, I am sorry. I will reach out again internally.

@Martin4096 Hi, could you provide a minimal, simple application with a Visual Studio solution (it should compile and run)? It could help us fix the problem.

I do not have a machine with Windows and a Nvidia GPU available. Not even sure it would reproduce in Windows. Could try to create a minimal example. Would require… some work though.

That being said, I tried again today on a Linux machine with an Nvidia GPU and am unable to reproduce the issue. The driver version has been updated to 535.54.03. Did something change in the driver recently that could explain why the artifacts are no longer visible? Hard to know, I guess, if it is a “race” of some form. Or?

@AndreyOGL_D3D You do not spot anything wrong with the barriers in the code I pasted in the original post? I have not been able to find any code online that uses dynamic rendering, resolves the depth buffer, and samples it in a later render pass. As mentioned, I do not get any warnings from the Vulkan validation layer, but who knows if it checks barriers correctly in this case.

Did you set up the validation layers using vkconfig? Try ticking the “synchronization” check box, running the application, and checking the error messages from the validation layers.

No, I have never used vkconfig. I enable it by setting VK_VALIDATION_FEATURE_ENABLE_SYNCHRONIZATION_VALIDATION_EXT in code when initializing Vulkan.

And yes, it works. :) I get errors when I set up barriers wrong, here and elsewhere. The validation layer may not catch all errors though, I am thinking.

But I also do not see what I could be doing wrong in the pasted code, if anything. Since it is gone with 535.54.03, it may just have been an Nvidia driver bug after all… ?