I am currently measuring FPS and frame time in a scene with about 80k vertices that implements soft shadow mapping.
The frame time is measured with the GL_ARB_timer_query extension, while FPS is measured with a CPU timer.
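For context, the GPU time is obtained the usual GL_ARB_timer_query way. A minimal sketch of what I mean (names and structure are mine, not the exact code; assumes a current OpenGL context):

```cpp
// Per-frame GPU timing with GL_ARB_timer_query (core since OpenGL 3.3).
GLuint query;
glGenQueries(1, &query);

glBeginQuery(GL_TIME_ELAPSED, query);
renderFrame();                       // hypothetical: all draw calls of one frame
glEndQuery(GL_TIME_ELAPSED);

// Reading the result here stalls until the GPU has finished; in a real loop
// one would double-buffer queries and read the previous frame's result instead.
GLuint64 elapsedNs = 0;
glGetQueryObjectui64v(query, GL_QUERY_RESULT, &elapsedNs);
double gpuTimeMs = elapsedNs / 1.0e6; // nanoseconds -> milliseconds
```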
During these measurements I came across some interesting results that I do not fully understand so far:
Single GPU: ~0.3 ms GPU time, ~2500 FPS
SLI (force AFR 1): ~0.8 ms GPU time, ~1200 FPS

Single GPU: ~2.2 ms GPU time, ~600 FPS
SLI (force AFR 1): ~1.4 ms GPU time, ~900 FPS
The issue should not be related to render-to-texture: I am using an FBO, and I requested the WGL_SWAP_EXCHANGE_ARB swap method in the pixel format. WGL_SWAP_COPY_ARB also results in bad performance at a higher resolution in fullscreen.
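For completeness, this is roughly how the swap method is requested via WGL_ARB_pixel_format. A sketch under my assumptions (wglChoosePixelFormatARB already loaded via a dummy context, `hdc` a valid device context; attribute values are illustrative):

```cpp
// Request a double-buffered RGBA format with an exchange-style swap
// (page flip instead of a back-to-front blit).
const int attribs[] = {
    WGL_DRAW_TO_WINDOW_ARB, GL_TRUE,
    WGL_SUPPORT_OPENGL_ARB, GL_TRUE,
    WGL_DOUBLE_BUFFER_ARB,  GL_TRUE,
    WGL_SWAP_METHOD_ARB,    WGL_SWAP_EXCHANGE_ARB,
    WGL_PIXEL_TYPE_ARB,     WGL_TYPE_RGBA_ARB,
    WGL_COLOR_BITS_ARB,     32,
    WGL_DEPTH_BITS_ARB,     24,
    0 // terminator
};

int  pixelFormat = 0;
UINT numFormats  = 0;
wglChoosePixelFormatARB(hdc, attribs, nullptr, 1, &pixelFormat, &numFormats);
```

Worth noting: per the WGL_ARB_pixel_format spec the swap method is only a selection criterion, not a guarantee, so the driver may still hand back a format that copies.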
Unfortunately, NVIDIA Nsight crashed when I tried to capture GPU frames with AFR 1 enabled. I did, however, make two logs with GPUView that give me an idea of what is happening, but I would need some further elaboration on this. The following screenshots show a comparison of 800x800 windowed vs. 2560x1600 fullscreen, both with SLI enabled (force AFR 1).
I read the following on the web: "When SLI is enabled, the NVIDIA driver must coordinate the operations of both GPUs when each new frame is swapped (made visible). For most applications, this GPU synchronization overhead is negligible. However, because xxx renders so many frames per second, the GPU synchronization overhead consumes a significant portion of the total time, and the framerate is reduced."
Is this somehow the case here? If so, could someone elaborate a bit more on what exactly happens?
HOWEVER, if I render at 2560x1600 in windowed mode, performance with SLI is again slightly WORSE than, or at best the SAME as, with a single GPU. I am kinda confused :D
Thanks for any suggestions in advance.