Unexpectedly slower performance (-25%) when replacing VK_FORMAT_B8G8R8A8_UNORM with VK_FORMAT_R8G8B8A8_UNORM

I switched from BGRA8 to RGBA8 to simplify some complex format conversions that we needed. However, I was surprised to see a performance slowdown after the switch, and it only shows up on NVIDIA GPUs. Note that before this change, the code emulated RGBA8 formats using BGRA8 images with image view swizzles, so I expected the driver's native RGBA8 path to at least match my hackish emulation.
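To make the old emulation concrete, here is a minimal CPU-side sketch of what the swizzle does. It is not my actual code: the `Swizzle`, `ComponentMapping`, and `Texel` types and all function names are hypothetical stand-ins that model Vulkan's `VkComponentMapping` (set through `VkImageViewCreateInfo::components` with `r = VK_COMPONENT_SWIZZLE_B` and `b = VK_COMPONENT_SWIZZLE_R`), so it compiles without the Vulkan headers. It assumes the texel bytes were uploaded in R, G, B, A order into a B8G8R8A8 image.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for VK_COMPONENT_SWIZZLE_R/G/B/A. */
typedef enum { SWZ_R, SWZ_G, SWZ_B, SWZ_A } Swizzle;

/* Models the r/g/b/a fields of VkComponentMapping. */
typedef struct { Swizzle r, g, b, a; } ComponentMapping;

/* One texel as a shader would see it after format decoding. */
typedef struct { uint8_t r, g, b, a; } Texel;

/* Decode four bytes as B8G8R8A8: byte 0 = B, byte 1 = G, byte 2 = R, byte 3 = A. */
static Texel decode_bgra8(const uint8_t bytes[4]) {
    assert(bytes != NULL);
    Texel t = { bytes[2], bytes[1], bytes[0], bytes[3] };
    return t;
}

/* Select one source channel of a decoded texel. */
static uint8_t pick(Texel t, Swizzle s) {
    switch (s) {
    case SWZ_R: return t.r;
    case SWZ_G: return t.g;
    case SWZ_B: return t.b;
    default:    return t.a;
    }
}

/* Apply an image-view-style component mapping to a decoded texel. */
static Texel apply_swizzle(Texel t, ComponentMapping m) {
    Texel out = { pick(t, m.r), pick(t, m.g), pick(t, m.b), pick(t, m.a) };
    return out;
}

/* Bytes stored in R,G,B,A order, read through a BGRA8 view with R and B swapped. */
static Texel sample_rgba_via_bgra(const uint8_t bytes[4]) {
    const ComponentMapping swap_rb = { SWZ_B, SWZ_G, SWZ_R, SWZ_A };
    return apply_swizzle(decode_bgra8(bytes), swap_rb);
}
```

Calling `sample_rgba_via_bgra` on bytes laid out as R, G, B, A returns the channels in their intended order, which is why the swizzled BGRA8 view behaved like an RGBA8 image from the shader's point of view.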