Low performance when calling NvAPI_D3D1x_Present on an A6000

Hi all,

I’m developing cluster rendering software and want to synchronize all outputs using the so-called swap group / swap barrier functions in NVAPI (NVAPI R275-NDA-developer).

In my setup, two graphics workstations, each equipped with an A6000 and a Quadro Sync II card, are connected in a daisy chain. Through the NVIDIA Control Panel, all screens attached to the workstations are configured to synchronize properly. Everything looks good so far: in full-screen mode, the rendered output on all screens is synchronized as expected.

But I noticed that when I call NvAPI_D3D1x_Present instead of IDXGISwapChain::Present (my program is written in DirectX 11), the frame rate drops to almost half. For example, at 4K my program runs at 22~24 fps with IDXGISwapChain::Present, but only 11~12 fps with NvAPI_D3D1x_Present. When I switch back to windowed mode, both run at about 22~24 fps, but out of sync. In full-screen mode the frame rate drops by half even when the daisy chain is not connected.
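To narrow down where the time goes, the present call can be timed separately from the rest of the frame. The following is only a minimal sketch under assumptions: `CallTimer` is a hypothetical helper, not part of NVAPI or DXGI, and it measures CPU-side wall-clock time around whatever callable it is given, so it shows how long the call blocks rather than what the GPU is doing.

```cpp
#include <chrono>
#include <cstddef>
#include <utility>

// Hypothetical helper (not part of NVAPI or DXGI): accumulates the
// wall-clock time spent inside the callable passed to time(), for example
// a lambda that wraps the present call.
struct CallTimer {
    double totalMs = 0.0;   // summed duration of all timed calls, in ms
    std::size_t calls = 0;  // number of timed calls

    template <typename F>
    void time(F&& fn) {
        const auto start = std::chrono::steady_clock::now();
        std::forward<F>(fn)();  // run the wrapped call
        const auto end = std::chrono::steady_clock::now();
        totalMs += std::chrono::duration<double, std::milli>(end - start).count();
        ++calls;
    }

    // Average time per call in ms; 0 if nothing has been timed yet.
    double averageMs() const {
        return calls == 0 ? 0.0 : totalMs / static_cast<double>(calls);
    }
};
```

In the render loop, the existing present call would be wrapped in a lambda and passed to `time()`, once for the IDXGISwapChain::Present path and once for the NvAPI_D3D1x_Present path. Comparing `averageMs()` over a few hundred frames would show whether the extra time is spent blocking inside the present call itself.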

I thought it might be a driver bug and upgraded the driver to the latest version, 496.49 (NFE), but without luck.

I also tested the program on another workstation (equipped with an RTX A5000 but without a Quadro Sync II card). There the frame rate was the same in both windowed and full-screen mode, around 17~18 fps.

I searched around but found nothing about this.

Any ideas? Thank you in advance!

Here is some extra information about the workstations:
OS: Windows 10 20H2
CPU: AMD Ryzen 7 5800X 8-core @ 3.8 GHz
Motherboard: Gigabyte X570 AORUS MASTER
RAM: G.SKILL 128 GB DDR4-2666 (32 GB x 4)

Hi all, I’m still looking for a solution. If anybody knows something about this problem, could you please provide some hints? Any suggestions would be appreciated. Thank you in advance.