PRIME render offloading on Nvidia Optimus

As per aplattner’s suggestion from https://devtalk.nvidia.com/default/topic/957814/linux/prime-and-prime-synchronization/post/4953175/#4953175, I’ve taken the liberty of opening a new thread to discuss the PRIME GPU render offload feature on Optimus-based hardware. As you may know, Nvidia’s current official support only allows GPU “Output” instead of GPU “Offload”, which may be unsatisfactory as it translates into higher power consumption and heat production in laptops.

In the PRIME and PRIME Sync thread, I suggested how PRIME render offload on Nvidia Linux could, if possible, achieve feature parity with Optimus on Windows.

In response, aplattner mentioned the following:

It would be really helpful if someone from Nvidia could at the very least tell us how much work would be needed should Nvidia plan to implement PRIME render offload, and which components are needed to attain render offload functionality on Linux with Optimus hardware.

Thank you in advance,
liahkim112

I agree. Please don’t leave us in the dark about this.

It seems difficult to clearly make the point to NVIDIA developers: this is the single biggest defect in the NVIDIA driver that prevents NVIDIA cards from being usable on Linux. Most folks have laptops these days. People are starving for render offload, and it’s simply absurd that it still has not been completed. The fact that there isn’t a timeline for this is quite disheartening. Not many users find their way to this forum, which is unfortunate, but believe this: people want this functionality, and they want it bad.

Please, get it together, and implement this. The year is 2016. It’s starting to become ridiculous.

I agree. Also, there is a big difference between running a game in the current situation and with PRIME render offload. Compiz (for instance) always uses the dGPU to render everything, so there is an FPS drop in games.

Not to mention that nouveau does this perfectly.
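
For readers unfamiliar with how this looks on the open-source stack: with Mesa drivers such as nouveau, render offload is usually requested per application through the DRI_PRIME environment variable. A minimal sketch, where offload_run is a hypothetical wrapper name and glxinfo is only an example program:

```shell
# offload_run is a hypothetical helper name, not part of Mesa or nouveau.
# It runs the given command with DRI_PRIME=1, which asks Mesa to render
# on the secondary (offload) GPU instead of the primary one.
offload_run() {
    DRI_PRIME=1 "$@"
}

# Example usage (assumes glxinfo is installed):
#   offload_run glxinfo | grep "OpenGL renderer"
# On older DRI2 setups an offload sink may first need to be configured with
# `xrandr --setprovideroffloadsink <provider> <sink>`.
```

The wrapper only sets the variable for the one command it launches, so the rest of the desktop keeps rendering on the integrated GPU.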

By implementing this functionality, we could also fix Optimus completely.

The driver could then have profiles for games and applications, similar to Windows, to render them on the NVIDIA GPU automatically.

For me, PRIME offloading is a “nice to have” kind of thing but not really a pressing need. Personally, I’ll probably just use PRIME to always use the nvidia GPU once PRIME sync is fixed for the following reasons:

  • nvidia's OpenGL implementation is generally speaking better and faster than Mesa
  • The dedicated nvidia GPU in my laptop is faster than the Intel GPU
  • I'm lazy and don't find myself in a spot where there are no power outlets often, so battery life isn't much of a concern

It certainly would be nice though in situations where there is a graphics-intensive application running in windowed mode. That way, the compositing would be done by the integrated GPU, whereas the application is offloaded, so if the offloaded application drops a frame or two, the compositor won’t be affected in how well the rest of the desktop behaves.

I registered just to have a say on this. There are currently a few different solutions, none of which are satisfactory!

  • Bumblebee - one of few solutions that works, mostly. But it's old.
  • Bumblebee + primus. Except primus is outdated, barely functional on newer distros
  • Output mode - Not a good solution for obvious reasons. I'm also completely unable to get this functioning on my GS70 Stealth unless it's to an external display
  • xrandr offload - same as above.

The only way for nvidia PRIME support to move forward is if Nvidia itself implements it! Nvidia have the source for their drivers, the specs, the know-how; so why the heck isn’t it done already? It’s been far, far too long. Linux is not a marginal OS anymore and deserves better support.

    I disagree. VirtualGL is broken, but primus still works very well with the latest stack.

    Regardless of what current hacked solution more or less works, they’re all really subpar in performance and each one has a modicum of issues. They’re quite obviously not the right way to be doing things. @luke-nukem is spot on: the only way this is going to happen is if NVIDIA implements it.

    Quite offensively, @aplattner told us to start a new thread about this, and then has completely neglected it. Is this how NVIDIA operates? Sequester & silence?

    Sorry I haven’t been able to reply earlier. I mostly asked for this thread to keep render offload discussion out of the thread about display offload so people trying to get display offload to work could use that thread.

    Render offload is quite complicated, so I don’t want to set any false expectations. It’s something we’re looking into, but I can’t promise anything or comment on it beyond that.

    Thanks for your response @aplattner. It’s nice to hear that you guys are interested in implementing it. I believe you will be successful!

    In case it wasn’t already obvious, the overhead of bumblebee/virtualgl/primus isn’t really acceptable.

    Intel Card:
    36639 frames in 5.0 seconds = 7327.653 FPS

    NVIDIA Card w/VirtualGL:
    19152 frames in 5.0 seconds = 3830.227 FPS

    NVIDIA CARD w/Primus:
    29772 frames in 5.0 seconds = 5954.200 FPS
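
To put the three runs above side by side, the NVIDIA figures can be expressed as a percentage of the Intel figure. A minimal sketch using only the FPS values quoted above; relative_fps is a hypothetical helper name, and taking the Intel run as the baseline is an assumption made purely for illustration:

```shell
# relative_fps is a hypothetical helper, not part of any benchmark tool.
# $1 = baseline FPS, $2 = measured FPS; prints measured/baseline as a percentage.
relative_fps() {
    awk -v base="$1" -v fps="$2" 'BEGIN { printf "%.1f\n", 100 * fps / base }'
}

# FPS values copied from the runs quoted above.
intel=7327.653
virtualgl=3830.227
primus=5954.200

echo "VirtualGL relative to Intel: $(relative_fps "$intel" "$virtualgl")%"
echo "Primus relative to Intel:    $(relative_fps "$intel" "$primus")%"
```

With these numbers the VirtualGL run comes out at roughly 52% of the Intel figure and the Primus run at roughly 81%.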

    lspci|grep VGA

    00:02.0 VGA compatible controller: Intel Corporation HD Graphics P530 (rev 06)
    01:00.0 VGA compatible controller: NVIDIA Corporation GM107GLM [Quadro M2000M] (rev a2)

    I came to this thread only after reading the one regarding display offload.
    Don’t get me wrong, we really, really appreciate the efforts put into the drivers for us. But we’re not okay with being unequal or second-class to Windows in implementation support. Perhaps Windows makes offloading easier? I don’t know, I don’t pretend to. But I do know that the Linux situation isn’t acceptable.

    Let me attempt to describe the absurdity of this situation to you: I just spent 2.5 days fiddling with a few distros and their implementations (Ubuntu, openSUSE, Manjaro, Sabayon).

    • Ubuntu - Using their own solution which is made up of a few python scripts, a binary program (for gfx detection), some variables in various files in /etc/, and it’s a log in/out situation. Basically we’d call it Ubuntu-PRIME.

      • You need to log out and back in if you switch between Intel/Nvidia.
      • No vsync using Nvidia (since fixed in git, thanks guys!).
      • No power off for powersaving.
      • Bumblebee breaks Ubuntu-PRIME.
      • No easy offloading or dynamic switching.
    • openSUSE - A distro somewhat hostile to proprietary drivers. DKMS isn’t standard and they don’t see a need for it (Grrr). There are a few hacked-together solutions: bumblebee through a 3rd-party repo, an Ubuntu-PRIME-like solution (which I’ve failed to get working), or nvidia-xrun, which requires switching to a tty and running a script to start a new X server using the nvidia drivers. None, and I mean none, of these are satisfactory.

      • Bumblebee as always, bad performance. Unmaintained (mostly), primusrun is ancient and fails with Tumbleweed (Leap seems okay due to older libs).
      • Bumblebee on Tumbleweed required installing a 3rd party build of the Mesa libs to be able to run Steam with it.
      • suse-prime, same as ubuntu-prime.
      • nvidia-xrun - try running a Unity-built game with it: no mouse cursor, bare X server, far too much work needed to get a satisfactory environment up. If, however, you use a window manager and start your app from that, it seems okay. But still, you need to switch to a tty etc.
    • Manjaro - By far the easiest to use of the bunch, uses bumblebee - cuts performance.

    • Sabayon - Bumblebee problems as above.

    • primusrun and bumblebee no longer play nice with Steam due to library problems. See openSUSE above.

    Seriously, it’s a freaking big mess. About two years ago, when I first got my laptop (an MSI GS70 Stealth), using bumblebee and primusrun was fine; it worked, and it worked okay-ish with performance at about half of what it could be. Then they became unmaintained and fell behind the changes happening in the Linux world, such as new GCC and glibc versions. I’m lucky if I get bumblebee and primus working acceptably across all use cases, and that is getting harder (try using Steam with it on a modern/rolling distro).

    There is no way in hell I’m using Windows to get decent use out of my laptop. That would kill my productivity (I’m a comp-sci & soft-eng student), not to mention that Windows itself is atrocious with its UI (and I can only run W8+ on this).

    I, and likely very many others, a growing number, use Linux exclusively and also use it for gaming. This number will definitely continue to grow, but only if things such as driver installation for playing games are painless. Distribution installation and setup itself is relatively painless and likely even easier than on Windows; this has improved in leaps and bounds over the last decade. Granted, a basic nvidia-only installation is as easy as next, next, next. It’s just Optimus support that is almost entirely lacking, especially muxless, with the external output connected to the nvidia chip. Heck, even output using intel-virtual-out relies on bumblebee.

    Literally the only way this is going to improve is if Nvidia itself improves it. Hackers don’t have the knowledge needed, often have to rely on reverse engineering etc.

    Sorry to reiterate my points from earlier; I felt I hadn’t really gotten them across adequately. I really don’t know how to impress upon Nvidia, and the lovely folks working on the Nvidia drivers, how important proper PRIME and easy (bumblebee-like) offloading is. There’s a huge number of top-notch gaming laptops out there, and it sucks to be chained to Windows if you need to use the Nvidia GPU for anything requiring decent performance.

    In short, Nvidia needs to make a promise, a commitment to supporting Linux well beyond the bare essentials. It’s situations like this that hold Linux back, and there is bugger-all even very well intentioned and skilled hackers can do when Nvidia is the one holding all the cards.

    There are many, many more people around who are just endlessly frustrated, and have had the same experiences I have; See here…

    glxgears has never been an accurate benchmark to measure performance with bumblebee. You will see completely different results when you play games.

    @aplattner wrote:

    Any status update on this?

    Having support for this remains as crucial and important as ever.

    Thanks!

    Any update aplattner? It has been a few months. Thanks!

    I am running Bumblebee on Fedora 24 and I play Steam games all the time. My launch commands are:

    PRIMUS_SYNC=1 primusrun %command%
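
The same launch line can be wrapped so that it degrades gracefully on a machine without primus installed. A minimal sketch, assuming a POSIX shell; run_on_nvidia is a hypothetical function name, and the fallback to running the command directly is an assumption about the desired behavior rather than anything bumblebee provides:

```shell
# run_on_nvidia is a hypothetical wrapper, not part of bumblebee or primus.
# If primusrun is available, run the command through it with PRIMUS_SYNC=1
# (the setting from the launch line above); otherwise run it unchanged.
run_on_nvidia() {
    if command -v primusrun >/dev/null 2>&1; then
        PRIMUS_SYNC=1 primusrun "$@"
    else
        "$@"
    fi
}

# Example usage: run_on_nvidia glxgears
```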
    

    If you have problems running Steam games, make sure your system has all the appropriate libraries. You can open a terminal and run

    ldd <steam_game> | grep 'not found'
    

    to see if you are missing anything. 99% of the time, this is the root cause… not bumblebee.
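
The ldd check above can be narrowed to print only the names of the unresolved libraries. A minimal sketch; missing_libs is a hypothetical helper name, and it assumes the usual ldd output layout in which the library name is the first field on each line:

```shell
# missing_libs is a hypothetical helper, not a standard tool.
# It reads `ldd` output on stdin and prints the first field (the library
# name) of every line that ldd marked as "not found".
missing_libs() {
    awk '/not found/ { print $1 }'
}

# Example usage (the path is a placeholder for the game's binary):
#   ldd /path/to/game_binary | missing_libs
```

An empty result means ldd resolved every library the binary links against directly.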

    I think this is also important because it seems it would fix the issue where the laptop freezes if you are in nvidia mode and go into sleep mode. It also isn’t really nice to always have to switch between nvidia and Intel mode and then log out and back in. I would definitely agree that this is quite important! I would say it is the biggest issue at this time.

    Any news on this topic?

    I would really love to see render offloading finally land in the nvidia drivers.

    Bump. Currently I’m using the nouveau drivers to get Optimus working on my system and to benefit from multi-monitor support, but the performance is really bad.

    Any official statement would be highly appreciated.