Good news! The latest 370.23 beta driver release (http://www.nvidia.com/Download/driverResults.aspx/105855/en-us) contains initial, experimental support for PRIME Synchronization! For reasons explained below the functionality can’t be officially supported yet, but if you’re brave enough, all the pieces are there to try it out.
I noticed that there is some confusion about what exactly PRIME is and how it works. In addition to explaining how to set up PRIME Synchronization, I’m taking this opportunity to clear up some questions and confusion. If you have any more questions, ask away.
What is PRIME?
PRIME is a collection of features in the Linux kernel, X server, and various drivers to enable GPU offloading with multi-GPU configurations under Linux. It was initially conceived to allow one GPU to display output rendered by another GPU, such as in laptops with both a discrete GPU and an integrated GPU (e.g., NVIDIA Optimus-enabled laptops).
Why is PRIME necessary?
When you imagine how Optimus works, you probably envision something like “GPU switching,” where there would be two GPUs and a hardware switch. The switch, or multiplexer (mux), would allow you to change which GPU drives the screen. When you start an intensive game, it would switch the display to the higher power discrete NVIDIA GPU and use it until you are done playing.
When it comes to GPU switching on Macs or older GPU switching PCs, you would be more or less correct, but modern Optimus-enabled PCs use something known as a “muxless” configuration: there is no switch. Instead, only the integrated GPU is connected to the display, and the NVIDIA GPU is floating, connected only to the system memory. Without a software solution, there would be no way to display the output from the NVIDIA GPU. The goal of PRIME is to allow the NVIDIA GPU to share its output with the integrated GPU so that it can be presented to the display.
Because PRIME requires the integrated and discrete GPUs to work in tandem to display the intended output, it cannot simply be a feature of any one driver; it has to be a feature of the Linux graphics ecosystem as a whole.
How does PRIME work?
At a high level, features in the Linux kernel’s Direct Rendering Manager (https://en.wikipedia.org/wiki/Direct_Rendering_Manager#DMA_Buffer_Sharing_and_PRIME) enable drivers to exchange system memory buffers with each other in a vendor-neutral format. Userspace can leverage this functionality in a variety of ways to share rendering results between drivers and their respective GPUs.
The X server presents two methods for sharing rendering results between drivers: “output,” and “offload.” If you use the proprietary NVIDIA driver with PRIME, you’re probably most familiar with “output.”
“Output” allows you to use the discrete GPU as the sole source of rendering, just as it would be in a traditional desktop configuration. A screen-sized buffer is shared from the dGPU to the iGPU, and the iGPU does nothing but present it to the screen.
“Offload” attempts to mimic more closely the functionality of Optimus on Windows. Under normal operation, the iGPU renders everything, from the desktop to the applications. Specific 3D applications can be rendered on the dGPU, and shared to the iGPU for display. When no applications are being rendered on the dGPU, it may be powered off. NVIDIA has no plans to support PRIME render offload at this time.
How do I set up PRIME with the NVIDIA driver?
The best way to think about PRIME “output” mode is that it allows the iGPU’s displays to be configured as if they belonged to the dGPU.
With that in mind, there are a few steps that need to be taken to take advantage of this functionality:
- If you’re setting up PRIME on an Optimus laptop, there are likely no displays available on the dGPU. By default, the NVIDIA X driver will bail out if there are no displays available. In order to ensure that the X server can start with no heads connected directly to the dGPU, you must add ‘Option “AllowEmptyInitialConfiguration”’ to the “Screen” section of xorg.conf.
- By default, your xorg.conf will be set up such that only the NVIDIA driver will be loaded. In order to ensure that the iGPU’s heads are available for configuration, the ‘modesetting’ driver must be specified. See the README link below for an example.
- The X server must be told to configure iGPU displays using PRIME. This can be done using the ‘xrandr’ command line tool, via ‘xrandr --setprovideroutputsource modesetting NVIDIA-0’. If this fails, you can verify the available graphics devices using ‘xrandr --listproviders’.
More detailed instructions can be found in the README (Chapter 32. Offloading Graphics Display with RandR 1.4).
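As a rough illustration of the first two steps above, here is a minimal sketch that prints an xorg.conf along the lines of the README's example. The function name, the identifiers ("layout", "nvidia", "intel"), and the default BusID are placeholders chosen for this sketch, not values taken from your system; look up the real BusID with lspci and treat the README as the authoritative reference.

```shell
#!/bin/sh
# Sketch only: prints an xorg.conf for PRIME "output" mode to stdout.
# $1 is the dGPU's BusID. "PCI:1:0:0" is a placeholder default (an assumption),
# not a value for your machine; find yours with `lspci`.
prime_xorg_conf_sketch() {
    busid="${1:-PCI:1:0:0}"
    cat <<EOF
Section "ServerLayout"
    Identifier "layout"
    Screen 0 "nvidia"
    Inactive "intel"
EndSection

Section "Device"
    Identifier "nvidia"
    Driver "nvidia"
    BusID "$busid"
EndSection

Section "Screen"
    Identifier "nvidia"
    Device "nvidia"
    # Lets X start even though no display is wired to the dGPU.
    Option "AllowEmptyInitialConfiguration"
EndSection

Section "Device"
    Identifier "intel"
    # Makes the iGPU's heads available for configuration.
    Driver "modesetting"
EndSection

Section "Screen"
    Identifier "intel"
    Device "intel"
EndSection
EOF
}
```

Something like ‘prime_xorg_conf_sketch "PCI:1:0:0" > xorg.conf.sketch’ lets you review the result before copying it into place.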
Once PRIME is configured as described above, the iGPU’s heads may be configured as if they were the dGPU’s with any RandR-aware tool, be it ‘xrandr’ or a distro-provided graphical tool. The easiest way to get something to display is ‘xrandr --auto’, but more complicated configuration is possible as well.
Most likely, you’re going to want a script to do the xrandr/RandR configuration automatically on startup of the X server. If you’re using ‘startx’ this can be done simply by adding the xrandr commands to your .xinitrc. If you’re using a display manager such as LightDM, there are more specific instructions to get it to run the xrandr commands at startup. The Arch Linux Wiki (https://wiki.archlinux.org/index.php/NVIDIA_Optimus#Display_Managers) has a good section on the topic. If you’re using Ubuntu, Canonical provides a set of scripts enabled by the ‘nvidia-prime’ package that allow you to easily switch PRIME on and off using an added menu in ‘nvidia-settings’, but these scripts are neither provided nor officially supported by NVIDIA.
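For the ‘startx’ case, the startup commands could be wrapped up roughly like this. It is a sketch, not an official script: the function name and the XRANDR override (handy for a dry run with XRANDR=echo) are made up for this example, and the provider names are the ones used earlier in this post, so yours may differ.

```shell
#!/bin/sh
# Sketch of the xrandr steps for ~/.xinitrc or a display-manager setup script.
# XRANDR is a hypothetical override so the commands can be dry-run (XRANDR=echo).
prime_output_setup() {
    xr="${XRANDR:-xrandr}"
    # Route the iGPU's outputs to the NVIDIA X screen.
    if ! "$xr" --setprovideroutputsource modesetting NVIDIA-0; then
        # List the providers so the names above can be corrected.
        "$xr" --listproviders
        return 1
    fi
    # Enable every connected output at its preferred mode.
    "$xr" --auto
}
```

In .xinitrc you would call ‘prime_output_setup’ before starting your window manager or desktop session.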
What is “PRIME Synchronization” about?
A lack of synchronization at a critical step in the pipeline has resulted in ugly artifacts under PRIME configurations for years, and I’ve been working to fix it. Let me explain:
Simplified Pipeline with OpenGL VSync
In a normal desktop configuration, games and other applications render into the GPU’s video memory, and the dGPU display engine pipes the result to the display, most commonly refreshing at 60 Hz. Without something commonly known as “vsync” to synchronize the application’s rendering to the screen’s refresh interval, you can’t be sure that the screen won’t refresh to half of one frame, and half of another. When this happens, you get an ugly artifact known as tearing (Screen tearing - Wikipedia). Fortunately, it’s a solved problem.
Simplified Pipeline with OpenGL VSync and PRIME Synchronization
With PRIME, there’s an extra step. Games and other applications continue to render into the dGPU’s video memory, but the final result needs to be placed into the shared buffer in system memory so that it can be scanned out by the iGPU’s display engine. Traditional vsync can synchronize the rendering of the application with the copy into system memory, but there needs to be an additional mechanism to synchronize the copy into system memory with the iGPU’s display engine. Such a mechanism would have to involve communication between the dGPU’s and the iGPU’s drivers, unlike traditional vsync.
Up until recently, the Linux kernel and X server lacked the required functionality to allow the dGPU and iGPU drivers to communicate and synchronize the copy with the scanout. Because of this limitation, there was virtually nothing any one driver could do to provide the necessary synchronization; it required improvements to the greater ecosystem.
Over the past many months, I’ve been working to implement and upstream the necessary improvements to the X server and iGPU kernel and userspace drivers so that we could leverage them from within our driver. Finally, they have landed (PRIME Synchronization & Double Buffering Land In The X.Org Server - Phoronix). Unfortunately, the changes required breaking the binary interface (ABI) between the X server and its drivers, so it may be a while before it propagates to mainstream distros.
How does PRIME Synchronization work?
It’s not much different from vsync. Rather than sharing just one screen-sized buffer from the dGPU driver to the iGPU driver, we share two (https://en.wikipedia.org/wiki/Multiple_buffering#Double_buffering_in_computer_graphics). The iGPU driver asks the dGPU driver to copy the dGPU’s current X screen contents into whichever of the iGPU’s buffers is hidden from view. When the copy operation completes, the iGPU flips to display the updated buffer. When the iGPU driver notices that the screen has refreshed and the updated buffer is being displayed, it starts the whole process again with the now-hidden buffer. This way, we can ensure that the dGPU driver never has to copy into a buffer that is currently being displayed, eliminating the chance of tearing.
Of course nothing is ever that simple to implement in practice, but conceptually it’s nothing new.
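To make the flip loop concrete, here is a toy model of it. This is purely illustrative and has nothing to do with the actual driver code: the two buffers are just the numbers 0 and 1, and the function name is invented for this sketch.

```shell
#!/bin/sh
# Toy model of the double-buffered flip loop; not driver code.
simulate_prime_sync() {
    frames=$1
    front=0   # buffer the iGPU is currently scanning out
    back=1    # hidden buffer the dGPU is allowed to copy into
    n=1
    while [ "$n" -le "$frames" ]; do
        # The copy always targets the hidden buffer, never the displayed one.
        echo "frame $n: copy->$back scanout=$front"
        # Copy finished: flip, so the buffers swap roles for the next frame.
        tmp=$front; front=$back; back=$tmp
        n=$((n + 1))
    done
}

simulate_prime_sync 4
```

On every line the copy target and the scanout buffer differ, which is the whole point of the scheme.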
How do I set up PRIME Synchronization?
If all of the requirements for PRIME Synchronization are fulfilled, it is enabled automatically.
To support PRIME Synchronization, the system needs:
- Linux kernel 4.5 or higher
- An X server with ABI 23 or higher (as yet officially unreleased, use commit 2a79be9)
- Compatible drivers
The “modesetting” driver tracks the X server, so the driver shipped with an X server of ABI 23 or higher will be compatible when run against Intel iGPUs.
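If you want a quick sanity check of the first two requirements, something like the sketch below may help. The function names are invented here, and the ABI lookup assumes your X server log contains a line of the form "X.Org Video Driver: 23.0" and that you pass the right log path (often /var/log/Xorg.0.log, but that varies by distro).

```shell
#!/bin/sh
# Returns 0 if a kernel release string ($1, e.g. "4.4.0-31-generic")
# is at least major $2, minor $3.
kernel_at_least() {
    release=$1 want_major=$2 want_minor=$3
    major=${release%%.*}
    rest=${release#*.}
    minor=${rest%%[!0-9]*}
    [ "$major" -gt "$want_major" ] ||
        { [ "$major" -eq "$want_major" ] && [ "${minor:-0}" -ge "$want_minor" ]; }
}

# Prints the major video driver ABI recorded in an X server log ($1).
# Assumes the log has a line like "X.Org Video Driver: 23.0".
xorg_video_abi() {
    sed -n 's/.*X\.Org Video Driver: \([0-9][0-9]*\)\..*/\1/p' "$1" | head -n 1
}

if kernel_at_least "$(uname -r)" 4 5; then
    echo "kernel is 4.5 or newer"
else
    echo "kernel is older than 4.5"
fi
```

With a log at hand, ‘xorg_video_abi /var/log/Xorg.0.log’ should print 23 or higher on a suitable X server.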
The latest 370.23 beta driver release (http://www.nvidia.com/Download/driverResults.aspx/105855/en-us) contains an initial implementation of PRIME Synchronization. Because the X ABI has yet to be frozen, however, it is subject to change. Any change to the ABI will break compatibility with the NVIDIA driver, so we cannot officially support the new functionality until the ABI is frozen. If you wish to test the new features despite them being experimental, the latest r370 driver release supports X servers built from Git commit 2a79be9. X servers with video driver ABI 23 built from other commits may or may not work. It’s best to stick to the commit that the driver has been built against.
To start X with an unofficially supported ABI (ABI 23 included), add the following section to your xorg.conf:
Section "ServerFlags"
    Option "IgnoreABI" "1"
EndSection
The NVIDIA driver’s PRIME Synchronization support relies on DRM-KMS, which is disabled by default due to its current incompatibility with SLI. To enable it, run ‘sudo rmmod nvidia-drm; sudo modprobe nvidia-drm modeset=1’. In other words, load the nvidia-drm module with the parameter modeset=1.
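To confirm that the parameter took effect, you can look at what the loaded module reports. The sketch below assumes the module exposes the parameter at /sys/module/nvidia_drm/parameters/modeset and that an enabled value reads back as "Y"; the function name is invented for this example.

```shell
#!/bin/sh
# Returns 0 if the given parameter file reports modesetting as enabled ("Y").
# The default path is an assumption about where the loaded module exposes it.
nvidia_drm_modeset_enabled() {
    param="${1:-/sys/module/nvidia_drm/parameters/modeset}"
    [ -r "$param" ] && [ "$(cat "$param")" = "Y" ]
}

if nvidia_drm_modeset_enabled; then
    echo "nvidia-drm modeset is enabled"
else
    echo "nvidia-drm modeset is not enabled (or the module is not loaded)"
    echo "reload with: sudo rmmod nvidia-drm; sudo modprobe nvidia-drm modeset=1"
fi
```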
If the above requirements are fulfilled, PRIME Synchronization should be enabled. The functionality is still experimental, so it’s possible that there may be kinks to work out.
If PRIME Synchronization is enabled, OpenGL applications can synchronize to the iGPU’s heads as they would with a normal dGPU head, and the names of PRIME heads can be specified via __GL_SYNC_DISPLAY_DEVICE.
If for whatever reason you have support for PRIME Synchronization but wish to disable it, you may disable it via ‘xrandr --output <output-name> --set "PRIME Synchronization" 0’ and re-enable it via ‘xrandr --output <output-name> --set "PRIME Synchronization" 1’, where <output-name> is the name of the PRIME head as listed by ‘xrandr’.
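A small wrapper can make the toggle less error-prone. This is only a sketch: the function name and the XRANDR override (for dry runs with XRANDR=echo) are invented here, and "eDP-1-1" in the usage note is a hypothetical output name, so substitute whatever ‘xrandr’ lists for your PRIME head.

```shell
#!/bin/sh
# Usage: set_prime_sync <output-name> <0|1>
# XRANDR is a hypothetical override so the command can be dry-run (XRANDR=echo).
set_prime_sync() {
    output=$1 value=$2
    # Check the output name first, then accept only 0 (off) or 1 (on).
    [ -n "$output" ] || { echo "missing output name" >&2; return 2; }
    case "$value" in
        0|1) ;;
        *) echo "value must be 0 or 1" >&2; return 2 ;;
    esac
    "${XRANDR:-xrandr}" --output "$output" --set "PRIME Synchronization" "$value"
}
```

For example, ‘set_prime_sync eDP-1-1 0’ would turn synchronization off for that head, and ‘xrandr --prop’ should show the current value of the property. The same output name is what you would pass in __GL_SYNC_DISPLAY_DEVICE when starting an OpenGL application.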