Fan control for GTX 970 on Ubuntu 16.04

I am using Ubuntu 16.04 (Linux-x86_64) with an NVIDIA GTX 970 running driver 367.57, cuDNN 5.1, and NV-CONTROL version 1.29.

I previously enabled fan control on the GPU by setting Coolbits to 4 in xorg.conf. A small script at startup then initiates several things for me, including these GPU settings:
nvidia-settings -a [gpu:0]/GPUFanControlState=1
nvidia-settings -a [fan:0]/GPUCurrentFanSpeed=15
Lately, though, this seems to have no effect.

With the current setup, NVIDIA X Server Settings allows GPU fan adjustments: the ‘Enable GPU Fan Settings’ box is checked and the slider changes the fan speed. However, the fan starts at 22% on boot and does not respond to any other control mechanism.

The problems are that changing the fan speed from the terminal no longer works and that the fan seems unresponsive to the driver itself. Entering the two commands above returns:
nvidia-settings -a [gpu:0]/GPUFanControlState=1 —> Attribute ‘GPUFanControlState’ (jakiUbuntu:0[gpu:0]) assigned value 1.
nvidia-settings -a [fan:0]/GPUCurrentFanSpeed=15 —> ERROR: The attribute ‘GPUCurrentFanSpeed’ specified in assignment ‘[fan:0]/GPUCurrentFanSpeed=15’ cannot be assigned (it is a read-only attribute).

I read many posts on the subject and settled on one that said the driver does in fact increase the fan speed once the temperature rises above 60C. However, when I ran some GPU computations that pushed utilization above 95% and power draw above 150W, the temperature rose above 72C and the fan stayed at its initial speed of 22%, indicating that the driver is not controlling the fan either.

It would be nice to be able to trust the driver to control the fan speed, and even nicer to be able to rely on a script to customize the settings. As it stands, it seems unacceptable for a high-end GPU to be able to overheat so easily. (I have not tested whether the power or clock is reduced at temperatures close to shutoff.)

Any suggestions on getting the fan to behave better? Is this an issue that is known to Nvidia?

Found this:

Attribute ‘GPUTargetFanSpeed’ (jakiUbuntu:0[fan:0]): 19.
Attribute ‘GPUCurrentFanSpeed’ (jakiUbuntu:0[fan:0]): 19.
Attribute ‘GPUCurrentFanSpeedRPM’ (jakiUbuntu:0[fan:0]): 844.
Attribute ‘GPUFanControlType’ (jakiUbuntu:0[fan:0]): 2.
Attribute ‘GPUFanTarget’ (jakiUbuntu:0[fan:0]): 0x00000007.

So the assignable attribute is ‘GPUTargetFanSpeed’.

Using:
nvidia-settings -a [fan:0]/GPUTargetFanSpeed=15
works as intended. Not sure when it changed.
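For anyone updating a startup script, here is a minimal sketch of how the two commands could be wrapped. It assumes Coolbits is already enabled in xorg.conf and that an X session is running. The helper names valid_speed and set_fan are made up for illustration; only the two attributes shown above come from this thread.

```shell
#!/bin/sh
# Sketch only: valid_speed and set_fan are hypothetical helper names.

# valid_speed: succeeds only for an integer from 0 to 100.
valid_speed() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty or contains a non-digit
    esac
    [ "$1" -le 100 ]
}

# set_fan: switch GPU 0 to manual fan control and set fan 0 to the given
# percentage. The quotes keep the shell from treating the brackets as a glob.
set_fan() {
    if ! valid_speed "$1"; then
        echo "fan speed must be an integer from 0 to 100" >&2
        return 1
    fi
    nvidia-settings -a "[gpu:0]/GPUFanControlState=1" \
                    -a "[fan:0]/GPUTargetFanSpeed=$1"
}
```

Calling set_fan 15 from the startup script would then replace the two original lines. The range check is a precaution of mine, not something the driver requires.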

For testing purposes try seeing how your card’s fans behave with the recommended nVidia driver version under a clean OS install sans any modifications.

Ubuntu 16.04.1 LTS (Xenial Xerus)
http://releases.ubuntu.com/16.04/

Valley | UNIGINE: real-time 3D engine
https://unigine.com/products/benchmarks/valley/

Heaven | UNIGINE: real-time 3D engine
https://unigine.com/en/products/benchmarks/heaven

According to MATE Sensors Applet 1.12.1 in 64-bit Linux Mint 17.3 MATE, my GTX 960’s fan speeds are auto-regulating and cap the GPU’s temperature under load at ~ 60C - 61C without any intervention on my part regardless of which supported version of the nVidia driver I have used. The same should be true for your GTX 970 (within a few degrees C since it’s a more powerful card).

BTW:

“Maximum GPU Temperature (in C): 98 C”

GeForce GTX 970 | Specifications | GeForce
http://www.geforce.com/hardware/desktop-gpus/geforce-gtx-970/specifications

Tip:

To make Unity more responsive, install CompizConfig Settings Manager and turn off Animations and Fading Windows.

Thanks for the input. Not really interested in a fresh install of the operating system, but I’ll keep the option in mind.

I made a little ‘GPU load indicator’ and placed it on the Ubuntu menu bar. It shows the temperature, fan speed, wattage, and utilization. After I ran some GPU-intensive computations, I watched the temperature climb to >70C without any change in the fan speed.
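The indicator itself is not shown here, but the four readings can be pulled from nvidia-smi. The following is a rough sketch, not the actual app: read_gpu_stats and format_gpu_stats are hypothetical names, and the label format is just one possible choice.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical.

# read_gpu_stats: query GPU 0 once. Prints one CSV line such as
# "72, 22, 151.23, 96" (temperature C, fan %, power W, utilization %).
read_gpu_stats() {
    nvidia-smi -i 0 \
        --query-gpu=temperature.gpu,fan.speed,power.draw,utilization.gpu \
        --format=csv,noheader,nounits
}

# format_gpu_stats: turn one CSV line into a short label for a menu bar.
format_gpu_stats() {
    # Split on commas and spaces so the fields carry no padding.
    IFS=', ' read -r temp fan power util <<EOF
$1
EOF
    # Drop the decimals from the power reading to keep the label short.
    printf '%sC %s%% %sW %s%%\n' "$temp" "$fan" "${power%.*}" "$util"
}
```

On a machine with the driver installed, format_gpu_stats "$(read_gpu_stats)" would print the current label.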

I added an unsophisticated fan speed algorithm to the indicator app to control the fan based on temperature (what else?). So the problem is worked around, but that should not be necessary. Fan speed seems like a fundamental attribute that should be easily controlled by the driver.
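A temperature-based fan algorithm of this kind could look roughly like the sketch below. It is not the code from the indicator app: the function names are hypothetical, and the breakpoints (20% at or below 40C, 100% at or above 80C, linear in between) are illustrative assumptions rather than values from the driver.

```shell
#!/bin/sh
# Sketch only: function names and breakpoints are assumptions.

# fan_speed_for_temp: map a temperature in C to a fan percentage.
fan_speed_for_temp() {
    t=$1
    min_temp=40; max_temp=80      # assumed temperature range of the ramp
    min_speed=20; max_speed=100   # assumed fan range of the ramp
    if [ "$t" -le "$min_temp" ]; then
        echo "$min_speed"
    elif [ "$t" -ge "$max_temp" ]; then
        echo "$max_speed"
    else
        # Linear interpolation using integer arithmetic.
        echo $(( min_speed + (t - min_temp) * (max_speed - min_speed) / (max_temp - min_temp) ))
    fi
}

# fan_control_loop: poll the temperature every 5 seconds and apply the curve.
# Needs Coolbits enabled and a running X session; stop it with Ctrl-C.
fan_control_loop() {
    nvidia-settings -a "[gpu:0]/GPUFanControlState=1"
    while true; do
        temp=$(nvidia-smi -i 0 --query-gpu=temperature.gpu --format=csv,noheader,nounits)
        nvidia-settings -a "[fan:0]/GPUTargetFanSpeed=$(fan_speed_for_temp "$temp")"
        sleep 5
    done
}
```

With these assumed breakpoints, the 72C seen under load maps to a fan speed of 84%. A simple ramp like this has no hysteresis, so the fan speed will follow every small temperature change.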

“Fan speed seems like a fundamental attribute that should be easily controlled by the driver.”

It is on my system. But then I just let the driver do its job minus any interference on my part.

BTW, years ago I started partitioning HDDs with separate /, swap and /home partitions, which makes doing a clean OS install a breeze. I boot from a loop-mounted install .iso via *grml-rescueboot residing on another HDD (or from an install DVD), format the / partition, and delete all of the invisible preference files and folders from the /home partition of the target drive.

*Grub2/ISOBoot - Community Ubuntu Documentation
https://help.ubuntu.com/community/Grub2/ISOBoot

The payoff from doing so is that I can do a clean OS install, update and configuration in about 40 minutes without having to spend hours copying terabytes of files from a back-up drive.

After much time-consuming experimentation I hit upon the following partition sizes, which have served me well for the past three years or so:

/ = 40GB (1024MB x 40). This leaves plenty of room for temp, log and image files etc.

swap = a motherboard’s max. DRAM capacity + one additional GB (on mine it’s 1024MB x 33) to accommodate resume from suspend.

/home = the remainder of the drive.

Is your GTX 970 made by nVidia or is it made by another company? Either way, have you checked for any possible firmware / BIOS updates? Asus still offers one for their STRIX GTX 970, for example.