Possible cache invalidation bug in KHR_fragment_shading_rate attachments

Hi,

This is a somewhat complex scenario so I’ll try to describe it the best I can.

I have a frame with multiple render passes. Three of those passes use VRS (VK_KHR_fragment_shading_rate); more precisely, they use shading rate attachments. The frame looks roughly like this:

  • Random work
  • Pass A (Uses shading rate attachment #1)
  • Random work
  • Pass B (Uses shading rate attachment #2)
  • Random work
  • Pass C (Uses shading rate attachment #1 again)
  • Random work

What I observe is that Pass C uses a shading rate attachment that is a combination of attachments #1 and #2. In the image above, gl_ShadingRateEXT is used to visualize the rate picked up by Pass C. The shading rate is as expected except for the bottom-left quarter of the screen, which shows a stripe pattern. This stripe pattern is the shading rate from attachment #2.

Some notes:

  • Validation is clean
  • Attachment #2 is 1/4 the resolution of #1 (that’s why only one quarter of the screen is wrong)
  • Nsight confirms that attachments #1 and #2 hold the correct values, and that Pass C is bound to the correct attachment
  • Adding an execution barrier (ALL_COMMANDS to ALL_COMMANDS) before Pass C doesn’t help. But if that barrier also carries a VkMemoryBarrier with both src and dst access masks set to VK_ACCESS_FLAG_BITS_MAX_ENUM, the issue goes away. This most likely points to a cache-related problem

Let’s go into more detail on what the frame looks like (I’ll omit the random work that happens in between):

  • Pipeline barrier that transitions #1 from compute dispatch write to VRS attachment read (#1 is populated in the previous frame)
  • Pass A (Uses shading rate attachment #1)
  • Pipeline barrier that transitions #2 from undefined to storage image write
  • Compute dispatch that populates #2
  • Pipeline barrier that transitions #2 from storage image write to VRS attachment read
  • Pass B (Uses shading rate attachment #2)
  • Pass C (Uses shading rate attachment #1 again)
  • Pipeline barrier that transitions #1 from VRS attachment read to storage image write
  • Compute dispatch that populates #1

The barriers appear correct to me, with the right pipeline stages and access masks. But maybe I’m wrong.

System specs:

  • Driver: 510.60.02
  • GPU: NVIDIA RTX 2070
  • OS: Ubuntu 21.10

Also tested on:

  • Driver: 512.59
  • GPU: NVIDIA RTX 2080 Super
  • OS: Windows 10

If you want a reproducer, please let me know so I can share how to build the AnKi 3D Engine (godlikepanos/anki-3d-engine on GitHub) and which patch to apply.

Hi there @godlike_panos and welcome to the NVIDIA developer forums.

Thank you for bringing up this issue. From the information here I am unable to confirm if this is a bug or not, so I will ask some experts to take a look.

Get back to you soon!

Thanks for the response. Here are instructions on how to build and test the affected application on Linux. If you want pre-built binaries (Windows or Linux), please let me know. The problem takes a few seconds to manifest (the initial frames have more passes and for some reason the problem doesn’t trigger there).

git clone https://github.com/godlikepanos/anki-3d-engine.git anki
cd anki
git checkout -b gi_vrs origin/gi_vrs
# Apply the vrs_debug.diff patch attached in this reply 
mkdir build
cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j
cd ..
# Run the application, wait for a few seconds for shaders to compile and then wait 
# some more for the issue to manifest
./build/Bin/Sponza WindowFullscreen 1 RPreferCompute 0 GrVrs 1

vrs_debug.diff (1.6 KB)