Our application uses large render targets that are only partially filled. We load them with VK_ATTACHMENT_LOAD_OP_DONT_CARE and manually clear the used regions with a quad.
We are observing a cost when beginning the render pass (vkCmdBeginRenderPass) that is proportional to the dimensions of the render target. Even if we only use a 100x100 region, it costs more in a 2000x2000 render target than in a 1000x1000 one (4x as many milliseconds).
This is not happening when using D3D11 or D3D12.
This performance cost is unexpected, and it gets even worse with multisampling. Are we doing something fundamentally wrong?
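To make the numbers in the question concrete, here is a small sketch in C. It assumes the cost at vkCmdBeginRenderPass scales with the full target area, which is the behaviour being reported and not something taken from the Vulkan spec. The function names are hypothetical and only serve the illustration.

```c
#include <stdint.h>

/* Number of pixels in a width x height render target. */
static uint64_t pixel_count(uint32_t width, uint32_t height)
{
    return (uint64_t)width * (uint64_t)height;
}

/* Ratio of full-target areas between target A and target B.
 * Under the assumption that the cost follows the whole target
 * (not the used region), this is the expected slowdown from A to B. */
static double area_cost_ratio(uint32_t wa, uint32_t ha,
                              uint32_t wb, uint32_t hb)
{
    return (double)pixel_count(wb, hb) / (double)pixel_count(wa, ha);
}
```

With these helpers, area_cost_ratio(1000, 1000, 2000, 2000) evaluates to 4.0, which matches the 4x difference described above, while the 100x100 region actually used is 10,000 pixels in both cases (0.25% of the 2000x2000 target).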
Hello @jesus.de.santos and welcome to the NVIDIA developer forums!
I am not familiar enough with Vulkan to be able to answer this, but I will try and bring this to the attention of some experts.
Did you double-check the Vulkan spec for any caveat on using render targets, for example a mention that they will be automatically initialized or cleared at the beginning of the render pass? Again, this is me guessing, but maybe it gives you a “line of attack” while waiting for more feedback.
You can just attach the executable or a compressed package here. Please make it as compact as possible and independent of unrelated third-party tools, as long as it reproduces the issue. Thanks
The FAST example renders 2 quads to an offscreen target and then uses that texture.
The SLOW example is exactly the same, but uses a 16384x16384 offscreen target with the load op set to DONT_CARE.
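For a sense of scale, the sketch below estimates how much memory a full-surface pass over the 16384x16384 target would touch. The thread does not state the image format, so the 4 bytes per pixel used in the usage note (for example an RGBA8 format) is an assumption, and the function name is hypothetical.

```c
#include <stdint.h>

/* Bytes covered by one sample per pixel of a width x height target.
 * bytes_per_pixel is supplied by the caller; it is not taken from the thread. */
static uint64_t target_bytes(uint32_t width, uint32_t height,
                             uint32_t bytes_per_pixel)
{
    return (uint64_t)width * height * bytes_per_pixel;
}
```

Under that assumption, target_bytes(16384, 16384, 4) is 1073741824 bytes, i.e. 1 GiB for a single-sampled target, so any implicit whole-surface work would dwarf the cost of drawing 2 quads.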
It seems this was related to using DONT_CARE and transitioning layout from VK_IMAGE_LAYOUT_UNDEFINED. This combination was always clearing the whole surface.
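As an illustration of the combination described above, here is a minimal sketch of two attachment descriptions using the standard VkAttachmentDescription fields. The first reproduces the reported setup (DONT_CARE plus an initial layout of VK_IMAGE_LAYOUT_UNDEFINED). The second is only one variant to experiment with, not a fix confirmed in this thread: it keeps DONT_CARE but names a defined initial layout. The format, sample count, final layout and variable names are assumptions made for the example.

```c
#include <vulkan/vulkan.h>

/* Reported slow combination: loadOp DONT_CARE with initialLayout UNDEFINED. */
static const VkAttachmentDescription slow_attachment = {
    .format         = VK_FORMAT_R8G8B8A8_UNORM,        /* assumed format */
    .samples        = VK_SAMPLE_COUNT_1_BIT,           /* assumed: no MSAA */
    .loadOp         = VK_ATTACHMENT_LOAD_OP_DONT_CARE,
    .storeOp        = VK_ATTACHMENT_STORE_OP_STORE,
    .stencilLoadOp  = VK_ATTACHMENT_LOAD_OP_DONT_CARE,
    .stencilStoreOp = VK_ATTACHMENT_STORE_OP_DONT_CARE,
    .initialLayout  = VK_IMAGE_LAYOUT_UNDEFINED,
    .finalLayout    = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL, /* assumed */
};

/* Variant to try (assumption, not confirmed in the thread): same load op,
 * but the initial layout is the one the previous pass left the image in.
 * The image must really be in this layout when the render pass begins,
 * so the very first use needs a one-time barrier from UNDEFINED. */
static const VkAttachmentDescription variant_attachment = {
    .format         = VK_FORMAT_R8G8B8A8_UNORM,
    .samples        = VK_SAMPLE_COUNT_1_BIT,
    .loadOp         = VK_ATTACHMENT_LOAD_OP_DONT_CARE,
    .storeOp        = VK_ATTACHMENT_STORE_OP_STORE,
    .stencilLoadOp  = VK_ATTACHMENT_LOAD_OP_DONT_CARE,
    .stencilStoreOp = VK_ATTACHMENT_STORE_OP_DONT_CARE,
    .initialLayout  = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL,
    .finalLayout    = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL,
};
```

Timing vkCmdBeginRenderPass with each description on the 16384x16384 target would show whether the initial layout alone changes the cost on a given driver.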