Problems finding an explanation for an additional frame of latency in a setup with a Quadro Sync

Hello,

We are currently testing whether it is possible to reduce the latency, measured from user input to the start of scanout of the rendered frame. In our case we expect the latency to be no greater than a single frame.

This is how our setup looks:

We have a hardware component that triggers both the rendering of a frame and the Quadro Sync. The output is a custom hardware unit that can register the scanout of individual lines. We measure the latency from the trigger of the frame rendering up to the scanout of the first line.
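
To make the measurement explicit, the timeline we expect looks roughly like this (T is one vsync interval; the times are illustrative, not measured values):

    t0       : external trigger -> frame rendering starts, Quadro Sync is triggered
    t0 .. t1 : GPU renders the frame, with t1 - t0 < T
    t0 + T   : expected start of scanout of the frame's first line
    latency  = (time of first scanned-out line) - t0, expected <= T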

The application that renders the frames is configured with a maximum frame latency of one. It uses D3D12 as the rendering API together with the DXGI_SWAP_CHAIN_FLAG_FRAME_LATENCY_WAITABLE_OBJECT feature. The application runs in exclusive fullscreen mode, preventing the DWM from buffering and compositing frames. All of this was verified with Nsight Systems. We also ensured that the application renders each frame within the vsync interval.
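
For reference, here is a minimal sketch of how the swap chain is set up, not our actual code: the factory, command queue, and window handle are assumed to be created elsewhere, and error handling as well as the exclusive-fullscreen transition (SetFullscreenState) are omitted.

    #include <windows.h>
    #include <d3d12.h>
    #include <dxgi1_3.h>
    #include <wrl/client.h>

    using Microsoft::WRL::ComPtr;

    // Creates the swap chain with the frame-latency waitable object and caps
    // the queue depth at one frame. Returns the waitable handle.
    HANDLE CreateLowLatencySwapChain(IDXGIFactory2* factory,
                                     ID3D12CommandQueue* queue,
                                     HWND hwnd,
                                     ComPtr<IDXGISwapChain2>& swapChainOut)
    {
        DXGI_SWAP_CHAIN_DESC1 desc = {};
        desc.BufferCount      = 2;                              // flip model needs >= 2 buffers
        desc.Format           = DXGI_FORMAT_R8G8B8A8_UNORM;
        desc.BufferUsage      = DXGI_USAGE_RENDER_TARGET_OUTPUT;
        desc.SwapEffect       = DXGI_SWAP_EFFECT_FLIP_DISCARD;  // required for D3D12
        desc.SampleDesc.Count = 1;
        desc.Flags            = DXGI_SWAP_CHAIN_FLAG_FRAME_LATENCY_WAITABLE_OBJECT;

        ComPtr<IDXGISwapChain1> swapChain1;
        // For D3D12 the command queue, not the device, is passed here.
        factory->CreateSwapChainForHwnd(queue, hwnd, &desc,
                                        nullptr, nullptr, &swapChain1);
        swapChain1.As(&swapChainOut);

        swapChainOut->SetMaximumFrameLatency(1);                // at most one frame in flight
        return swapChainOut->GetFrameLatencyWaitableObject();
    }

    // Per frame: block until DXGI is ready to accept a new frame, then render
    // and present with a sync interval of 1 (vsync-locked).
    void RenderFrame(HANDLE latencyWaitable, IDXGISwapChain2* swapChain)
    {
        WaitForSingleObjectEx(latencyWaitable, 1000, TRUE);
        // ... record and execute command lists for this frame ...
        swapChain->Present(1, 0);
    }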

The problem is that we measure a latency of two frames. We cannot find an explanation for the one additional frame of latency that is added somewhere between the GPU and the output unit. This extra frame of latency seems to disappear (or at least decrease somewhat) when we run a variant of the test in which the vsync interval is halved (i.e. rendering runs at frequency F and the Quadro Sync at F*2).
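
To put concrete numbers on this, assume for illustration that F = 60 Hz (our actual rate may differ): one frame is 1/60 s ≈ 16.7 ms, so we would expect an input-to-scanout latency of at most ~16.7 ms, yet we measure ~33.3 ms. In the halved-interval test (rendering at F, Quadro Sync at 2F = 120 Hz), the extra delay shrinks toward one sync period, 1/120 s ≈ 8.3 ms, which would be consistent with one stage of buffering that is quantized to the sync clock rather than to the render clock.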

Do you have an idea what could cause the additional frame of latency? Is there some buffering happening on the Quadro Sync? Could it be eliminated by adjusting the Quadro Sync or driver configuration?

Here are the details of our test system:

  • OS: Windows 10, Build 19044 (version 10.0.19044)
  • GPU: NVIDIA RTX A5000
  • GPU driver: 537.70 (default configuration)
  • Quadro Sync II firmware: 2.02

Best regards,

Ilja