Nsight Systems is great. I see the queuing of Swap on the CPU queue, the execution of swap on the GPU HW queue, and I see VBlank (VSync) clock ticks for the connected display (relevant when rendering with flip mode – e.g. fullscreen borderless with focus).
However, one thing I’m missing is…
Q: For a specific image “A” produced by a specific Swap execution on the GPU HW queue, how do I determine at which VSync (VBlank) clock tick it will first be displayed (i.e. the swap chain behavior)?
Q: How do I see which image is being displayed (scanned out) between two consecutive VSync clock pulses?
Not only would this provide a clear picture of end-to-end latency, it would also make it trivial to observe what PresentMon calls MsBetweenDisplayChange (the time an image is actively being displayed), which is > 1 VSync interval if the CPU and GPU together blow their time budget.
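
To make the two questions concrete, here is a minimal Python sketch of the mapping I have in mind. It is not an Nsight Systems or PresentMon API: the function names and the input lists (timestamps copied by hand from the timeline, in milliseconds) are hypothetical. It also assumes a simplified FIFO flip model, where each image flips at the first VSync tick after its Swap finishes on the GPU and at most one image flips per tick. Real swap chains (mailbox, tearing, queue depth limits) may behave differently.

```python
from bisect import bisect_right

def first_display_ticks(swap_done_ms, vsync_ticks_ms):
    """For each finished Swap, return the index of the VSync tick at which
    its image is first scanned out (None if no tick is left in the capture).

    Assumption: simplified FIFO flip model, one image per tick.
    """
    ticks = []
    next_free = 0  # earliest tick index not yet taken by a previous image
    for done in swap_done_ms:
        # First tick strictly after the GPU finished the Swap,
        # but never earlier than the tick after the previous image's flip.
        idx = max(bisect_right(vsync_ticks_ms, done), next_free)
        if idx >= len(vsync_ticks_ms):
            ticks.append(None)
            continue
        ticks.append(idx)
        next_free = idx + 1
    return ticks

def displayed_durations(tick_indices, vsync_ticks_ms):
    """Time each image stays on screen until the next image replaces it
    (comparable in spirit to PresentMon's MsBetweenDisplayChange)."""
    shown = [vsync_ticks_ms[i] for i in tick_indices if i is not None]
    return [later - earlier for earlier, later in zip(shown, shown[1:])]

# Hypothetical capture: VSync every 16 ms, third Swap misses the tick at 48 ms.
vsync = [0, 16, 32, 48, 64]
swap_done = [5, 20, 50]
ticks = first_display_ticks(swap_done, vsync)
durations = displayed_durations(ticks, vsync)
```

In this made-up capture the second image stays on screen for two VSync intervals because the third Swap finishes just after the tick at 48 ms. That is exactly the situation I would like to be able to read directly off the timeline.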