vkCmdFillBuffer bugged on ada

GPU: ADA, RTX4080, DRIVER: 31.0.15.2802
CPU: AMD Ryzen 7 7800X3D 8-Core Processor 4.20 GHz
WINDOWS VERSION: 22621.1992 (Windows 11 Home)

I am encountering a bug in the Vulkan API when using the command vkCmdFillBuffer.
The behavior is consistent regardless of which features are enabled or which API version I request.
Scenario:
I give an offset and a size to the fill, offset + size is equal to the total buffer size.
Expected behavior:
Buffer is filled from offset up to offset + size
Actual behavior:
Buffer is filled from 0 (no matter the offset given) up to offset + size

This only happens when offset + size = buffersize.

I can reproduce this on a 1080ti on driver 537.13. Exact same situation and behavior. Additionally, this happens with VK_WHOLE_SIZE as the size.

I can provide a full repro if needed, but the easiest way to see this is through a renderdoc capture. Captured on my 1080ti on windows 11. EID 202 is a clear buffer with offset 8 and size 8192 on a buffer of size 8204. If you view the buffer contents before and after, you see the data written to the buffer in EID 134 disappears, as it is cleared.

Note to self: rend3 cbf091dad1f4302bced6e99c47fd90a84cb6cad8 object::multi_frame_add repros this.

nvidia-bug.rdc (75.2 KB)

Hello and welcome to the NVIDIA developer forums @patrick.ahrens.32369 and @cwfitzgerald.

Thank you for bringing this to our attention, I will file an internal issue for this.

And thank you for the renderdoc capture, I hope that will speed up debugging.

That’s great! Let me know if you need any more information!

In trying to find a workaround for this issue within my own codebase, I think I have narrowed down the bug precondition. If the following code returns true, vkCmdFillBuffer will work as expected. If it returns false, the clear window will be “slid” down until the offset is a multiple of 16. The size of the buffer does not matter, so long as it is big enough.

let succeeds = 'b: {
    if copy_length >= 4096 {
        if start_offset % 16 != 0 {
            if copy_length == 4096 {
                break 'b true;
            }
            if copy_length % 16 == 0 {
                break 'b false;
            }
        }
    }
    true
};

So a clear of 0…4096 is fine, as the start offset is a multiple of 16. A clear of 4…4100 is fine as the copy length is exactly 4096. A clear of 4…4104 is fine, as the copy length (4100) isn’t a multiple of 16. 4…4116 is a problem, as the start offset (4) is not a multiple of 16, and the copy length is a multiple of 16 (4112). Instead of 4…4116 being cleared, 0…4112 will be cleared (slid down to the previous multiple of 16).

I hope this is at least marginally helpful - I may have missed some conditions, but it seems the answer will be splitting clears.
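
For anyone wanting to try the "splitting clears" idea, here is a minimal sketch. It assumes the precondition above is complete, which may not be the case. The names `fill_succeeds` and `split_fill` are hypothetical helpers, not part of Vulkan or any library. The idea is to peel off the unaligned head so that the large remainder starts at a multiple of 16.

```rust
/// Mirrors the precondition from the post: returns true when a fill with
/// this offset and length is expected to behave correctly.
fn fill_succeeds(start_offset: u64, copy_length: u64) -> bool {
    if copy_length >= 4096 && start_offset % 16 != 0 {
        if copy_length == 4096 {
            return true;
        }
        if copy_length % 16 == 0 {
            return false;
        }
    }
    true
}

/// Hypothetical workaround: returns one or two (offset, length) pieces
/// covering the same range, each of which passes `fill_succeeds`.
fn split_fill(start_offset: u64, copy_length: u64) -> Vec<(u64, u64)> {
    if fill_succeeds(start_offset, copy_length) {
        return vec![(start_offset, copy_length)];
    }
    // Bytes from the unaligned start up to the next multiple of 16.
    // This is always shorter than 16 bytes, so the head piece is far
    // below the 4096-byte threshold in the predicate.
    let head = 16 - start_offset % 16;
    vec![
        (start_offset, head),
        // The remainder now starts at a multiple of 16.
        (start_offset + head, copy_length - head),
    ]
}
```

With the problem case from above, `split_fill(4, 4112)` yields `(4, 12)` and `(16, 4100)`. Each piece would then be recorded as its own vkCmdFillBuffer call.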

Thanks!

I added that to the bug description and the bug is in progress.

Thanks for reporting the bug and all the details. This driver bug should be fixed in our latest Vulkan beta drivers 537.54 for Windows and 535.43.10 for Linux found here: https://developer.nvidia.com/vulkan-driver. Let us know how it works out for you. Thanks again.

Great to hear! I’m not entirely sure how the NVIDIA driver versioning scheme works. Is there a single value I can compare VkPhysicalDeviceDriverProperties::driverInfo against to see if the bug is fixed (and disable our internal workaround)?

Unfortunately it’s a bit tricky because the version numbers aren’t linear, especially for Windows, and the presence of the special Vulkan beta branch also disrupts the version sequence. The fix will be generally available to the public with the next driver release family, so checking for “major” version of 550 or greater from the “major.minor” in VkPhysicalDeviceDriverProperties.driverInfo should be safe. The “major” value is also packed in ((VkPhysicalDeviceProperties.driverVersion >> 22) & 0x3FF).
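
As a rough sketch of the check described above, the following assumes the caller has already read `vendorID` and `driverVersion` from VkPhysicalDeviceProperties (and optionally the `driverInfo` string). The function names are hypothetical, and 0x10DE is NVIDIA's PCI vendor ID. Note that this is conservative: the 537.54 beta driver contains the fix but would still be treated as needing the workaround.

```rust
/// NVIDIA's PCI vendor ID, as reported in VkPhysicalDeviceProperties.vendorID.
const NVIDIA_VENDOR_ID: u32 = 0x10DE;

/// Extracts the "major" driver version from the packed driverVersion value,
/// using the shift and mask given in the reply above.
fn nvidia_driver_major(driver_version: u32) -> u32 {
    (driver_version >> 22) & 0x3FF
}

/// Alternative: parses the "major" part out of a "major.minor" driverInfo
/// string. Returns None if the string does not start with a number.
fn major_from_driver_info(driver_info: &str) -> Option<u32> {
    driver_info.trim().split('.').next()?.parse().ok()
}

/// Hypothetical helper: true when the vkCmdFillBuffer workaround should
/// stay enabled, that is, on NVIDIA drivers with a major version below 550.
fn needs_fill_workaround(vendor_id: u32, driver_version: u32) -> bool {
    vendor_id == NVIDIA_VENDOR_ID && nvidia_driver_major(driver_version) < 550
}
```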

Amazing, thank you so much!