It’s unclear to me what you mean when you refer to “dGPU’s buffers” and “iGPU’s buffers.” All of the buffers in this swap chain originate from the dGPU driver, although the buffers that the iGPU flips between for PRIME Synchronization are allocated in system memory in a format that the iGPU can understand.
There is an intermediate composition from the X screen’s primary surface into an intermediate video memory buffer (similarly to ForceFullCompositionPipeline = On) before asynchronously copying from that into the requested system memory back buffer, but that won’t add any additional latency because it all completes before the iGPU’s next vblank. This is done for performance reasons, as the composition step is done in the 3D channel, and we don’t want 3D applications to be blocked behind a relatively slow copy into system memory as they would if we composited directly into the system memory back buffer. Fermi GPUs lack this asynchronous copy support, so they composite directly into system memory at the expense of 3D performance for lack of a better option.
One issue may arise from the fact that OpenGL swaps after each composition, potentially adding latency to the swap chain.
If you want to minimize input lag as much as possible today, your best bet is to set __GL_SYNC_TO_VBLANK=0. Naturally, the application won’t sync to vblank, but due to an implementation detail in PRIME, it will not tear. Composition is essentially atomic, and incomplete frames are dropped rather than torn. Under PRIME, GL Sync to VBlank has more to do with throttling the application to vblank while maintaining smoothness than with avoiding tearing.
Disabling GL Sync to VBlank should eliminate the potential additional input lag from the GL->PRIME->iGPU swapchain while remaining tear-free, at the expense of power at framerates much higher than the refresh rate, or of smoothness at framerates closer to the refresh rate.
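For example, a quick way to try this (glxgears is just a stand-in here for whatever GL application you actually care about):

    # disable GL sync-to-vblank for a single application
    __GL_SYNC_TO_VBLANK=0 glxgears

    # or export it to make it the default for everything started from this shell
    export __GL_SYNC_TO_VBLANK=0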
Thanks for the explanations! The way the copies happen is clearer to me now; I hadn’t thought of the different buffer modes and the slow copy to system memory.
Are there any predictions as to when the 1.19 ABI freeze will happen?
First of all, thank you very much for making this beta driver available.
After lots of testing and debugging I finally made this work. So here are my findings to help anyone trying this too.
This won’t work without glamor compiled into the xserver.
Without glamor compiled in (it only needs to be built into the server, not actually used), no framebuffer/pixmap can be created and the modesetting driver dies with ‘No space left on device’.
This won’t work with just the newer ‘Load module modesetting’ style of section in xorg.conf.
With only that specified, the modesetting driver grabs all devices and the nvidia driver gets none.
It only works if modesetting Device/Screen sections in xorg.conf are defined before nvidia Device/Screen sections.
Otherwise, the modesetting driver just gets initialised and then unloaded.
Glamor has to be disabled at runtime.
As before, use Option "AccelMethod" "none" in the modesetting Device section (see the sketch below).
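Putting these findings together, a minimal xorg.conf would look roughly like the following. This is a sketch based on the layout from the NVIDIA README’s RandR 1.4 offloading chapter plus the ordering note above, not my literal config; the BusID is a placeholder you have to replace with what lspci reports for your dGPU:

Section "ServerLayout"
    Identifier "layout"
    Screen 0 "nvidia"
    Inactive "intel"
EndSection

# modesetting Device/Screen defined before the nvidia ones, per the ordering note above
Section "Device"
    Identifier "intel"
    Driver "modesetting"
    Option "AccelMethod" "none"
EndSection

Section "Screen"
    Identifier "intel"
    Device "intel"
EndSection

Section "Device"
    Identifier "nvidia"
    Driver "nvidia"
    BusID "PCI:1:0:0"
EndSection

Section "Screen"
    Identifier "nvidia"
    Device "nvidia"
    Option "AllowEmptyInitialConfiguration"
EndSection

After X is up, the outputs still have to be wired together as described in the README:

    xrandr --setprovideroutputsource modesetting NVIDIA-0
    xrandr --auto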
PPS:
compiled packages:
libdrm-2.4.70 <-- probably not needed but doesn’t hurt either
libXfont2-2.0.1
xf86-video-intel-2.99.917_p20160812 <-- just needed to switch back to plain iGPU
xf86-input-evdev-2.10.3
xorg-server-1.18.99.1_p2a79be9
It only works if modesetting Device/Screen sections in xorg.conf are defined before nvidia Device/Screen sections.
Otherwise, the modesetting driver just gets initialised and then unloaded.
This is the tip that made my PRIME work after hours upon hours of changing configs!
Thanks
@YStar
Please post your configure options and your gcc version. I’ll post mine when I find some time. Will have to extract them from the Gentoo ebuild I made.
Regarding power management, I noticed that, in its current form, the NVIDIA driver won’t actually turn the dGPU off when it’s set to use the iGPU. Instead, it leaves the dGPU on but idling while the iGPU does the work.
When testing on my Vostro 5470 with an onboard GT 740M, it consumed around 10 W when idling with the standard PRIME solution, but only about 6 W when using bbswitch to turn the dGPU off (after unloading the NVIDIA modules). I wonder why the default configuration doesn’t do that, since automatically disabling the dGPU when using the iGPU proved quite reliable for me and almost doubled my battery life in common usage scenarios (such as surfing the internet and programming). It hasn’t caused any issues with suspend and hibernation, and I even managed to turn the dGPU back on to run CUDA programs, etc.
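For anyone who wants to try the same thing, the sequence is roughly the sketch below; exactly which nvidia modules are loaded depends on your driver packaging, so unload whichever of them are present:

    # stop anything using the dGPU, then unload the NVIDIA kernel modules
    sudo modprobe -r nvidia_drm nvidia_modeset nvidia_uvm nvidia

    # power the dGPU off through bbswitch
    echo OFF | sudo tee /proc/acpi/bbswitch

    # later, power it back on and reload the driver (e.g. before running CUDA)
    echo ON | sudo tee /proc/acpi/bbswitch
    sudo modprobe nvidia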
The new 375 beta is out; does it have PRIME Sync capabilities? If so, which xorg commit is it built against?
I’m seeing some minor regressions with 370.28 compared to .23, so I’d like to try it.
Yes, it and all drivers going forward should have PRIME Sync capabilities. As of this release it’s still built against commit 2a79be9, as shown here: Chapter 32. Offloading Graphics Display with RandR 1.4. If and when the commit changes, it will be reflected on that page in the respective README.
Configuration (hardware):
iGPU (Intel)
dGPU (NVIDIA)
two monitors attached to each card

Why do I need PRIME?
To run kmscon. kmscon doesn’t run on the NVIDIA blob, but it does on Intel’s driver.

What is the problem now?
kmscon runs on tty1-5 and X on tty6 with PRIME synchronisation. The problem is that the monitors are mirrored. I have tried TwinView, but it doesn’t seem to work. Here is my xorg.conf: