nvidia-smi works, but no GLX or CUDA available for GTX 960

I’m running Linux Mint 18 KDE5 64bit on a PC desktop.

It has an onboard Intel GPU and a discrete MSI Geforce GTX 960.

For about 3 months I used the Intel GPU for display and the Nvidia GPU for Blender CUDA rendering; after some initial hassle I managed to get it working stably.

I had some trouble lately, had to recover the system from backup and I can’t get it to work again.

I’m unable to use the Nvidia GPU - either for display or CUDA.

I’m using Primus to switch between the Intel and Nvidia GPUs. Display works fine on Intel, but if I switch to Nvidia without restarting X, newly started programs report no GLX extension.
What is strange: nvidia-smi still works fine then, reporting the GPU’s temperature and other information.
If I restart the X session, Plasma Desktop reports no OpenGL 2 support and is unusable.
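For reference, here is roughly how I check which GL stack the session is actually running on - a sketch; glxinfo comes from the mesa-utils package and prime-select from nvidia-prime (package names assumed from the Ubuntu/Mint convention):

```shell
# Which driver is providing GLX right now?
if command -v glxinfo >/dev/null 2>&1; then
  glxinfo | grep -E "OpenGL (vendor|renderer) string"
else
  echo "glxinfo not found (try: sudo apt-get install mesa-utils)"
fi
# Which GPU has prime currently selected?
if command -v prime-select >/dev/null 2>&1; then
  prime-select query
else
  echo "prime-select not found (nvidia-prime not installed?)"
fi
glx_checks_done=1
```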

After running

sudo prime-select nvidia

I am able to run nvidia-smi and this is its output:

$ nvidia-smi 
Thu Dec 29 10:08:11 2016       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.57                 Driver Version: 367.57                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 960     Off  | 0000:01:00.0     Off |                  N/A |
|  0%   35C    P0    24W / 120W |      0MiB /  1996MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

I installed the Nvidia (and Intel) proprietary drivers through the Driver Manager GUI program available in Linux Mint - it only showed me the 367.57 version of the driver.

I installed nvidia-modprobe and nvidia-cuda-toolkit packages, but Blender still doesn’t detect the Nvidia GPU for CUDA rendering.
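For what it’s worth, CUDA talks to the kernel driver directly rather than through X, so checks like these (a sketch; the nvidia-modprobe flags are my assumption - see its --help) show whether the pieces CUDA needs are in place:

```shell
# Is the proprietary kernel module loaded at all?
lsmod | grep -q '^nvidia' && echo "nvidia kernel module loaded" \
  || echo "nvidia kernel module NOT loaded"
# Do the device nodes CUDA opens exist? nvidia-modprobe can create them
# without X running (flags assumed - verify with nvidia-modprobe --help):
ls -l /dev/nvidia* 2>/dev/null \
  || echo "no /dev/nvidia* device nodes (perhaps: sudo nvidia-modprobe -u -c 0)"
cuda_node_check_done=1
```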

The nvidia-bug-report.log doesn’t contain anything but uname output:

uname -a
Linux hostname 4.4.0-57-lowlatency #78-Ubuntu SMP PREEMPT Sat Dec 10 01:37:35 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

What can I do?

Beyond clearing the motherboard’s CMOS I have no idea how to approach solving the main problem you describe. But here’s an easy way to install the latest nVidia driver:

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update

(Source)
Proprietary GPU Drivers : “Graphics Drivers Team” team
https://launchpad.net/~graphics-drivers/+archive/ubuntu/ppa
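After adding the PPA and running apt-get update, something like this should list the driver versions it offers - a sketch; the nvidia-NNN package naming is the usual Ubuntu/Mint convention:

```shell
# List versioned nvidia driver packages visible to apt (names only):
if command -v apt-cache >/dev/null 2>&1; then
  apt-cache search --names-only '^nvidia-[0-9]+$' | sort
else
  echo "apt-cache not available on this system"
fi
# Then install a specific one, e.g.:
#   sudo apt-get install nvidia-375
ppa_list_done=1
```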

A tip for Linux Mint users who find it desirable or necessary to change from one nVidia driver version to another:

  • Via Driver Manager install xserver-xorg-video-nouveau

  • Reboot

  • Then install the desired version of the nVidia binary driver

  • Reboot

Related:

“First version that includes the fix: 375.26”

Updated 12/14/2016
Security Bulletin: Multiple vulnerabilities in the NVIDIA Windows GPU Display Driver kernel mode layer (nvlddmkm.sys) handler for DxgDdiEscape and a vulnerability in the Linux GPU Display Driver kernel mode layer (nvidia.ko) | NVIDIA
http://nvidia.custhelp.com/app/answers/detail/a_id/4278

(Source)
Nvidia Support | NVIDIA
http://www.nvidia.com/object/support.html

I installed the PPA and updated the driver to the newest one (375.26) using the Mint-friendly procedure.
The GPU is still unusable, though.

I booted to a black desktop with “Plasma failed to run” message box.

I had to run sudo prime-select intel and reboot, because the machine hung after running sudo service sddm restart (sddm is the display manager for KDE5).

I wonder if this could be a hardware problem?

Did you try clearing the CMOS to eliminate it as a possible cause?

I’ve no experience with Linux Mint 18 KDE5, but I do have Linux Mint 18.1 MATE installed on one of my PC’s HDDs for evaluation. So far I’ve found 18.1 to be a tad sluggish and slightly buggy. IME the much more polished Linux Mint 17.3 MATE, upgraded to kernel 4.2.x (4.4.x is also available) via Synaptic Package Manager, is superior in terms of stability and speed.

Do you have an external USB-based HDD or flash drive upon which you could clean install in succession Linux Mint 17.3 KDE and if need be Linux Mint 17.3 MATE for further testing?

BTW. One new ‘feature’ introduced with Linux Mint 18 / Ubuntu 16.04 LTS is a daemon and its accompanying files which IIRC can automatically download and install motherboard UEFI updates. Since I won’t tolerate such a reach-around on my machine I delete this potential security threat immediately following a Linux Mint 18.x install.

Launch Synaptic Package Manager and search for and ensure the complete removal of the following 10 packages (which ones are actually installed depends upon how the UEFI was configured at OS installation time):

fwupd
fwupdate
fwupdate-amd64-signed
fwupdate-signed
libdfu-dev
libdfu1
libfwup-dev
libfwup0
libfwupd-dev
libfwupd1

NOTE:

The above package-removal approach does not work correctly in Ubuntu 16.04 LTS because libfwupd1 and ubuntu-software (which allows one to install additional applications and utilities from the Ubuntu repositories) are co-dependent.

One more thing. I avoid ‘Secure Boot’ and its (IME) needless complexities which I suspect are complicit in a range of anomalous PC misbehaviour.

How I partition a Linux Mint HDD (of greater than 2TB in capacity; in gparted, choose ‘gpt’ as the partition table to be applied to the drive prior to partitioning it) in a manner which saves time later on.

Via gparted create four partitions on a GPT drive (CSM / Compatibility Support Module enabled in the UEFI, no alleged *‘Secure Boot’):

  1. bios_grub = 1MB (the 1MB bios_grub partition is there to prevent older, pre-GPT-era disk repair utilities from attempting to ‘repair’ a GPT-era partition)

  2. / = (1024MB x 40 = 40960MB or 40GB, ext4)

  3. swap = 1GB more than the m/b’s max. RAM capacity (1024MB x 33 = 33792MB or 33GB, the required swap partition size in the case of the Sabertooth 990FX R2.0) to ensure a functioning resume from suspend and hibernate.

  4. /home (ext4) = the rest of the drive.
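The sizes above can be sanity-checked with a bit of shell arithmetic (assuming a 32GB max RAM capacity, as in the Sabertooth 990FX R2.0 example):

```shell
# Partition sizes from the scheme above, in MiB:
root_mib=$((1024 * 40))        # 40960 MiB = 40 GiB for /
swap_mib=$((1024 * (32 + 1)))  # 33792 MiB = max RAM + 1 GiB, for suspend/hibernate
echo "/ = ${root_mib} MiB, swap = ${swap_mib} MiB"
```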

The above straightforward approach has not only supported resume-from-suspend on every AMD machine and the one Intel machine (an i5 750 / H57 Express, Gateway DX4831-07c) I’ve used it with, it also allows me to clean install an OS without deleting /home and without being forced to restore personal files from a backup disk for hours on end: I format the / partition, delete the invisible .name preference files and folders from the /home partition to keep them from mucking up a new OS, and during the OS install mount /home as ext4 but don’t format it.

Further, I always install an OS’ GRUB on the same disk (or SSD) that the OS resides upon, and then use the UEFI / BIOS boot menu to boot between the various HDDs or SSDs (I have an OS on every drive in my machine to ensure that I can always boot the PC if an OS install, or the drive it’s installed on, fails). That way, if one or more drives are removed from the machine, the remaining drives will continue to be bootable and function normally.

What’s more, drives prepared in the above manner can be transplanted into a different machine (which also supports GPT drive capacities) and still boot and work correctly. This works best when going from AMD to AMD or from Intel to Intel but AMD to Intel or vice versa nearly always works as well thanks to the Linux kernel having an abundance of firmware and drivers for various constituent chipsets of both hardware platforms. (Caveats re nVidia vs AMD graphics drivers apply.)

*Secure Boot hacked
https://duckduckgo.com/?q=Secure+Boot+hacked&t=hu&ia=web

10 Aug 2016
*Bungling Microsoft singlehandedly proves that golden backdoor keys are a terrible idea • The Register
http://www.theregister.co.uk/2016/08/10/microsoft_secure_boot_ms16_100/

How I partition a Linux Mint SSD boot drive (of less than 2TB in capacity; in gparted, choose ‘msdos’ as the partition table to be applied to the drive prior to partitioning it) in a manner which prevents premature wear.

  1. / = (1024MB x 40 = 40960MB or 40GB, ext4)

  2. /home (ext4) = the rest of the drive.

It is unnecessary to have a swap partition on an SSD boot drive as Linux Mint will automatically use the swap partition(s) on any connected HDD(s) partitioned in the manner described above.

I tried clearing the CMOS - there’s progress: I can use the GPU for display and CUDA rendering but it’s only detected by Blender when I’m using it for display too.

When I select Intel GPU in prime and restart X - Blender can’t see the Nvidia card.

What I need is Intel for display and Nvidia for CUDA, and it was working before, so I’ll be trying still to get that running.
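My working guess (an assumption, not something I’ve verified) is that prime-select intel unloads nvidia.ko and/or removes the device nodes, leaving CUDA nothing to attach to. A sketch of how to check, while prime is set to intel:

```shell
# If the module is gone, CUDA cannot possibly find the card:
lsmod | grep -q '^nvidia' && echo "nvidia.ko loaded - CUDA should be possible" \
  || echo "nvidia.ko NOT loaded - CUDA cannot work in this mode"
# The device nodes CUDA applications open:
ls /dev/nvidia0 /dev/nvidiactl 2>/dev/null || echo "NVIDIA device nodes missing"
prime_mode_check_done=1
```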

I’ll try installing the older drivers (nvidia-367.57) that worked before, as I’m now using nvidia-375.26.

PS: This is a production machine, so I want to avoid downtime as much as possible; hence installing a blank system is the least favorable option. That’s also why I had to wait to try clearing the CMOS, because I was afraid the machine wouldn’t start at all (it’s almost new, but it has scared me quite a bit lately).

UPDATE:

The older drivers still don’t work. I can use CUDA only when the card is used for display too.

All the more reason to consider using a secondary internal or an external drive as a freshly installed OS guinea pig to test prospective remedies. If you do so just be sure to assign that OS’ GRUB to its own drive so that you don’t compromise the GRUB on your PC’s primary / production drive.

Is the version of CUDA you’re using newer or older than this one?

CUDA 8.0 Downloads | NVIDIA Developer
https://developer.nvidia.com/cuda-downloads

In Linux Mint MATE there’s a control panel called ‘Users and Groups’ whose features include ‘User Privileges’ and ‘Manage Groups’, in which the user can assign him- or herself to various groups to gain the permissions those groups grant. The user is usually assigned to the ‘video’ group automatically, but since you are trying to use two different video systems simultaneously you may not currently have permission to do so. (It’s a long shot.)

BTW. Could you please add your rig’s system specs. to your forum signature? Someone else who happens upon this thread may have similar hardware and goals and thus some insight.

One more thing. Do Blender’s preferences allow assigning functions to different video systems?

I’ve updated CUDA to the latest version: I installed the 8.0 toolkit following what’s written on the page you linked (added a PPA and installed the “cuda” package).

I also went back to the newest nvidia driver so they can play together nicely, but no change - I can use CUDA only when the nvidia GPU is used for display.

Blender allows selecting the rendering device, either the CPU or a CUDA-enabled GPU. But it doesn’t detect my nvidia GPU when I’m using the Intel GPU for display.
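One more thing I’m checking (again an assumption about how nvidia-prime behaves): whether switching to intel hides libcuda from the dynamic linker, which would explain Blender not seeing the card. The alternatives group name below is my guess for Ubuntu 16.04-era packaging:

```shell
# Can the dynamic linker find the CUDA driver library at all?
ldconfig -p | grep -i 'libcuda' || echo "libcuda.so not in the linker cache"
# Which GL/CUDA config alternative is active (group name assumed):
update-alternatives --list x86_64-linux-gnu_gl_conf 2>/dev/null \
  || echo "no such alternatives group (name may differ on this system)"
libcuda_check_done=1
```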

How much of a comparative performance hit does Blender suffer when the GTX 960 handles everything vs the previous arrangement of having the integrated Intel graphics handle display duties?

The reason I keep harping on doing a fresh (preferably 17.3) install on to an external drive as an experiment is to bypass whatever has changed for the worse in your current primary OS.

It would still be helpful if you were to post your rig’s system specs., model numbers and revision numbers (where applicable) are good, in your forum signature since doing so would allow myself and others to conduct a wider search of perhaps similar problems experienced by others who may be running the same hardware.

To edit your forum signature you may have to log out and then log in again.

I didn’t notice too much performance impact on Blender when Nvidia GPU handles both display and CUDA, but the whole system becomes unresponsive and I can’t do any work while rendering is in progress, the display drops to roughly 5 FPS at times. That’s why I used Intel GPU for display.

Here are the machine’s specs (sorry for not including this earlier):

MB: ASRock Z97 Extreme6
CPU: Intel Core i5-4690K
GPU: MSI GeForce GTX 960 2GB
RAM: HyperX Fury Black 8GB [2x4GB 1600MHz DDR3 CL10 DIMM]
PSU: Chieftec GPS-500A8 [500W]
Display: 2x AOC I2276VWM [IPS, 1920x1080]
HDD: WD Blue 3TB

(I can’t change the signature - my password is rejected)

I’m not eager to install a new system for comparative tests because it’s very time-consuming, and I’d like to avoid downtime as much as possible. I have limited performance right now, but I can still do my work; that’s why I’ll try to do this when I can, but I need to find the right timeframe, with as little work pending as possible, when the downtime won’t be a big problem.

5 FPS? That would almost be like watching paint dry.

Two primary upgrades and one secondary upgrade worth considering to speed up your work flow.

  1. According to this the Quadro M4000 is recommended for use with Blender:

PNY - PICK YOUR QUADRO
http://pnyus.quadro-selector.com/en/all-software

nvidia quadro m4000 by pny datasheet_eng_web.pdf
http://www.pny-europe.com/data/products/brochures/nvidia%20quadro%20m4000%20by%20pny%20datasheet_eng_web.pdf

With its 256bit memory bus, 8GB of VRAM, 1664 CUDA cores, a faster GPU and pro graphics capable firmware, the Quadro M4000 offers a decisive upgrade over your current 128bit, 2GB, 1024 CUDA core, slower GPU and non-pro graphics capable firmware-equipped GTX 960.

As comparatively hefty as the Quadro M4000’s price is, the next model up is the 8GB of ECC VRAM-equipped Quadro M5000 which sells for more than twice as much. Thus within the context of your Z97 Extreme6 motherboard and Core i5-4690K (neither of which supports ECC RAM), the Quadro M4000 supplies the best overall value in terms of expected reliability, greater suitability for 3D content creation and performance matching with the rest of your rig at a price which should be measurably offset by increased productivity.

BTW. From what I’ve read on the differences between nVidia’s Quadro and GeForce product lines, the former’s firmware supports pro-level graphics features at a mild cost in speed while the latter’s firmware is deliberately knee-capped re pro graphics (which aren’t needed for games or general desktop duties) in favour of greater speed.

I’ve also read somewhere that even though the Quadro and GeForce cards both use the same drivers and GPUs (although the Quadro cards’ *lower failure rate suggests the use of cherry-picked components) it’s not a good idea to employ both card types in the same machine due to potential conflicts arising from driver reactions to the respective differences in the two cards’ firmware features. But I’d expect there’s other forum members here who know more about this than I do.

*"…As a whole, Quadro cards are about three times more reliable than GeForce cards…"

December 30, 2016
Most Reliable PC Hardware of 2016 - Puget Custom Computers
https://www.pugetsystems.com/labs/articles/Most-Reliable-PC-Hardware-of-2016-872/

(Thermal images)
Configure PC w/ PNY Quadro M4000 PCI-E 8GB Video Card
https://www.pugetsystems.com/parts/Video-Card/PNY-Quadro-M4000-PCI-E-8GB-11244

NVIDIA Quadro M4000
https://www.pny.com/nvidia-quadro-m4000?sku=VCQM4000-PB&type=m

PNY Technologies | SD Cards, USB Flash Drives, Memory Modules, SSDs, Graphics Cards
http://www.pny.eu/company/contact-us

The PNY Quadro M4000 ships with only a single DisplayPort-to-DVI-D SL adapter (1920 x 1200). However, supplemental adapters are also available as a **single unit or as a pack of four (some of the prices being charged for these are quite outrageous; shop around, or better yet get the M4000 + a DP-DVI-QUADKIT-PB as a package deal):

**030-0173-000
DP-DVI-QUADKIT-PB

Reviews:

PNY Quadro M4000 VCQM4000-PB 8GB 256-bit GDDR5 PCI Express 3.0 x16 Full Height Workstation Video Card - Newegg.com
http://www.newegg.com/Product/Product.aspx?Item=N82E16814132051

As well the Quadro M4000 is also manufactured by:

NVIDIA Quadro M4000 - HP Store UK
http://store.hp.com/UKStore/Merch/Product.aspx?id=M6V52AT&opt=&sel=ACC

Nvidia Quadro M4000 8 GB GDDR5 DP x 4 by Lenovo | 4X60K59926 | Lenovo | UK
http://shop.lenovo.com/gb/en/itemdetails/4X60K59926/460/3E71FB3EB22D4A308B517832F7D2264A

  2. This RAM kit for channel two of your ASRock Z97 Extreme6 would substantially increase the amount of RAM Linux Mint can use as a buffer:

FURY Memory Black - 16GB Kit* (2x8GB) - DDR3 1600MHz CL10 DIMM
Part Number: HX316C10FBK2/16
Specs: DDR3 , 1600MHz , CL10, 1.5V, Unbuffered,
Timings: 1600MHz, 10-10-10, 1.5V
http://www.kingston.com/dataSheets/HX316C10FBK2_16.pdf

ValueRAM for ASRock Motherboard Z97 Extreme6
http://www.kingston.com/us/memory/search?DeviceType=8&Mfr=ASR&Line=Motherboard&Model=89174

  3. A secondary upgrade option which can speed up your system in most regards: an NVMe card (but there’s a ***catch)

ASRock > Z97 Extreme6
http://www.asrock.com/mb/Intel/Z97%20Extreme6/?cat=Storage

June 6, 2016
***Samsung 950 Pro M.2 Throttling Analysis - Puget Custom Computers
https://www.pugetsystems.com/labs/articles/Samsung-950-Pro-M-2-Throttling-Analysis-776/

June 6, 2016
***Samsung 950 Pro M.2 Additional Cooling Testing - Puget Custom Computers
https://www.pugetsystems.com/labs/articles/Samsung-950-Pro-M-2-Additional-Cooling-Testing-795/

Re editing a forum signature:

On this forum’s Windows-centric flip-side, forums.geforce.com, I have to log out and then log back in to edit my forum signature. Since everything I do over there shows up here I’d expect that devtalk.nvidia.com works the same way.

Wow. Thank you for that detailed analysis - I didn’t know that there is a specific GPU unit that’s recommended for Blender.

BTW: This rig is only about 5 months old now, so I guess I’ll have to wait for a major upgrade, but thank you for bringing these things to my attention.

BTW: There are sometimes power outages during weekends in the facility this machine operates in. I usually suspend the machine to RAM, yet at times I find it powered off; I wonder if I should equip it with a UPS unit.


I have restored my system from backup to a state from 70 days ago, and all works fine. I didn’t restore the user home directory, so something’s wrong in the system files.

I did the same thing a month ago, but when I installed a bunch of audio software and other tools - things got broken again very quickly.

I guess there’s some software interference, but I don’t know how to track that down yet. I’ll stay alert and hopefully finally narrow down what is causing the problems here.

Good on-line UPS units are quite pricey, and the off-line (standby) variety don’t always switch the PC from AC to battery in time to prevent a crash. It’s cheaper and better just to power the PC off for the weekend–and save your dough for a Quadro M4000. :-) As well, those power outages while your machine was suspended-to-RAM may have corrupted some files, thus contributing to the recent problem.

Are you still partitioning your HDD with just / and swap?

If so, then the next time you have a need or opportunity to partition a drive, try the / swap /home scheme I outlined in post #4. It will segregate your various preference files in the /home partition from the system files in the / partition, making it easy and hassle-free to cleanse the system of corrupted preference files.

I once buggered up the MATE UI so badly that I thought I’d have to re-install the OS. But before I did I tried deleting all the .name invisible preference files and folders in the /home partition and then rebooted. The result? I got the stock MATE desktop one gets after doing a fresh OS install sans the mess I had made of the UI.
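The cleanup I describe above can be sketched like so - moving rather than deleting, so it’s reversible (the backup path is illustrative):

```shell
# Run from the affected user's home directory:
cd "$HOME"
# Review the hidden preference files/folders first - don't delete blindly:
ls -d .[!.]* 2>/dev/null || echo "no hidden files found"
# When you're sure, stash them somewhere instead of deleting outright:
#   mkdir -p "$HOME/dot-backup" && mv .[!.]* "$HOME/dot-backup/"
dotfile_sketch_done=1
```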

I’ve been using 3 partitions:

/ - for the system
swap
data - for all project files, plus binaries in specific versions that I need or that aren’t installable from the system repos.

Now, after restoring, I’m moving /home to its own partition too, and I made another empty partition to install a secondary system for testing in the future.

The power outages leading to corrupt system files sound like a quite probable cause, but that doesn’t explain why, after a fresh backup restore, I was able to break it again so quickly in one day (no power outage happened).

I’ll be switching it off for weekends, stay vigilant, and see what happens.

“Display: 2x AOC I2276VWM [IPS, 1920x1080]”

According to its manual the I2276VWM only has a D-Sub and an HDMI input.

I2276VWM AOC Monitor - AOC
http://aoc-europe.com/en/products/i2276vwm#support-download

Yet the ASRock Z97 Extreme6’s on-board video has DVI-I, HDMI and Display Port 1.2 outputs:

August 14, 2014
ASRock Z97 Extreme6 Motherboard
http://www.tomshardware.com/reviews/enthusiast-z97-motherboard-overclock,3893-2.html

How are you connecting your monitors to your system?

You’re right.

The first monitor is connected via an HDMI-to-HDMI cable.

The second monitor used a DisplayPort-to-HDMI adapter cable.
It was cheap and broke after 3 months of operation - don’t ask me how that is even possible.

I replaced it with a less-cheap DVI-to-HDMI adapter cable.

Have you tried swapping the cables to see if doing so makes a difference?

I prefer DVI-I DL and DVI-D DL myself because of their mechanically more robust design with integrated strain relief, although electrically HDMI has also proven to be reliable. But as for Display Ports? I’ve encountered too many help threads like these to be bothered abandoning older and tried & true technology:

04-03-2015
GTX 970/980 BIOS update for DisplayPort issues
https://rog.asus.com/forum/showthread.php?59850-GTX-970-980-BIOS-update-for-DisplayPort-issues

07/01/2016
GTX 1070 no signal when using displayport - GeForce Forums
https://forums.geforce.com/default/topic/948525/geforce-1000-series/-gtx-1070-no-signal-when-using-displayport/1/

Yup. I’m staying away from DisplayPort.

I swapped the cables once the DP-to-HDMI adapter failed, to make sure it was the cable that was faulty, not a monitor.

I’ve just installed a lowlatency kernel (the linux-lowlatency package).

All works fine, and I no longer have the audio stutter when rendering with the Nvidia GPU that I had when running the generic kernel.

The user is not in the audio group, but still no audio dropouts in JACK, even with full GPU load.
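For anyone checking the same thing, a sketch of the group and realtime-limits checks (the limits file path is an assumption - jackd2 usually installs one):

```shell
# Which groups is the current user in, and is 'audio' among them?
id -nG
id -nG | grep -qw audio && echo "in audio group" || echo "NOT in audio group"
# Realtime priority limits commonly granted to @audio (file path assumed):
grep -h '@audio' /etc/security/limits.d/*audio*.conf 2>/dev/null \
  || echo "no @audio realtime limits file found"
# To join the group (takes effect at next login):
#   sudo usermod -aG audio "$USER"
audio_check_done=1
```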

So the lowlatency kernel alone is not the cause of the problem.

Yeah, I prefer linux-lowlatency too ever since I encountered some Realtek ALC892 on-board audio glitching under Linux Mint 17.3’s generic 3.19 (IIRC) kernel.

BTW. Ubuntu 16.04.2 LTS is scheduled to be released on Jan. 19th. That may mean the impending availability of kernel 4.8.x in Linux Mint 18.1’s Synaptic Package Manager. It’ll be interesting to see if and how that shakes out.

Edit:

“February 2nd: Ubuntu 16.04.2 point release”

XenialXerus/ReleaseSchedule - Ubuntu Wiki
https://wiki.ubuntu.com/XenialXerus/ReleaseSchedule