New here, and I’m sure I have this post in the wrong thread, as I couldn’t find where it would fit.
I’m just trying to find help on how to manually set my fan speed on my Nvidia GPU.
nvidia-settings -a "[gpu:0]/GPUFanControlState=1" -a "[fan:0]/GPUCurrentFanSpeed=n"
gives a read-only error.
What can I type in terminal?
I’m running Lubuntu 18.10 with Nvidia 418.56 drivers.
Outside my area of expertise, but the various write-ups one can find on the internet about controlling the fan under Linux typically state that you need to enable Coolbits first before you can set the fan speed with nvidia-settings. I see different idioms being used. A fairly common one seems to be
nvidia-xconfig -a --cool-bits=28 --allow-empty-initial-configuration
Proceed at your own risk. Why do you want to change the fan speed from the default?
I believe Coolbits=4 is enough to enable fan speed editing, but Coolbits=28 works as well.
However, the GUI is needed for that.
I guess I was hoping someone could post a terminal command to modify the fan speed.
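For what it’s worth, here is a minimal sketch of how the fan request could be issued from a terminal instead of the GUI. It assumes Coolbits is already enabled and an X server is running for the GPU. The helper name fan_speed_cmd is made up for illustration. It uses GPUTargetFanSpeed rather than GPUCurrentFanSpeed, since on recent drivers the latter appears to be read-only, which would match the error reported above. The function only prints the command so it can be checked first.

```shell
#!/bin/sh
# Hypothetical helper: build the nvidia-settings command that switches a GPU
# to manual fan control and requests a target fan speed in percent.
# Assumes Coolbits (bit value 4) is enabled and an X server is running.
fan_speed_cmd() {
    gpu=$1; fan=$2; pct=$3
    # Reject anything that is not a non-negative integer.
    case $pct in
        ''|*[!0-9]*) echo "speed must be an integer 0-100" >&2; return 1 ;;
    esac
    # Reject values above 100 percent.
    if [ "$pct" -gt 100 ]; then
        echo "speed must be an integer 0-100" >&2
        return 1
    fi
    printf "nvidia-settings -a '[gpu:%s]/GPUFanControlState=1' -a '[fan:%s]/GPUTargetFanSpeed=%s'\n" "$gpu" "$fan" "$pct"
}

# Dry run: print the command for GPU 0, fan 0, 75 percent.
fan_speed_cmd 0 0 75
```

Once the printed line looks right, it can be pasted into a terminal inside the X session. Cards with more than one fan would need the same request for [fan:1] and so on.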
We in the deep learning and folding communities need a headless (terminal-based) variant of nvidia-settings.
I hope Nvidia could support us with a terminal command in one of their next driver updates.
Something as simple as:
sudo nvidia-smi -i 0 -f 75
Just like adjusting the power limit, the line above would tell nvidia-smi to set the fan speed of card 0 (the main card) to 75% (with values from 0% to 100%).
I’m not sure where you can enter those values, but it doesn’t work in my terminal.
Depending on why you use the GPU, if you have constant high GPU usage (like heavy gaming, folding, or crunching), the best thing you can do is lower the power limit (TDP) of the card.
It’ll still throttle the speed, but the card will run much cooler.
For reference, I have an open test bench, with an environment temperature of somewhere between 60-75F, and I can get 90-95% performance out of any RTX card running them at the following settings:
I raise the value by x watts, if the ambient temperature gets closer to 85-90F.
I lower the value by y watts, if the ambient temperature gets closer to 55F.
Lowering the power reduces heat. When the temperature drops, the driver can raise the boost frequency.
Together with a fine-tuned overclock, you can get close to 95% of the performance at 60-75% of the power consumption. I’m not sure why Nvidia tunes for maximum performance instead of highest efficiency on the RTX line, as these cards run very hot (inside a case) and consume a lot of power.
The performance penalty (~5-10%) is well worth the reduction in heat (20-40C lower temps) and power consumption (25-40% lower per card).
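To make the arithmetic concrete, here is a small sketch that turns "run at N% of the default power" into a wattage. The helper name scaled_power_limit and the 250 W figure are just illustrative assumptions, not values for any particular card.

```shell
#!/bin/sh
# Hypothetical helper: compute a reduced power cap in whole watts as a
# percentage of the card's default power limit.
scaled_power_limit() {
    # Drop any fractional part (nvidia-smi reports values like "250.00").
    default_watts=${1%.*}
    percent=$2
    echo $(( default_watts * percent / 100 ))
}

# Example with an assumed 250 W default limit, capped to 70 percent.
scaled_power_limit 250 70
```

The default limit can be read with nvidia-smi -i 0 --query-gpu=power.default_limit --format=csv,noheader,nounits, and the result applied with sudo nvidia-smi -i 0 -pl followed by the computed wattage. The driver will refuse values outside the card’s minimum and maximum power limits.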
If you have an intermittent performance demand on your card, water cooling offers a better solution.
I am having a similar issue on CentOS 7.
[root@XX ~]# nvidia-xconfig -a --cool-bits=28 --allow-empty-initial-configuration
Using X configuration file: "/etc/X11/xorg.conf".
Option "AllowEmptyInitialConfiguration" "True" added to Screen "Screen0".
Option "AllowEmptyInitialConfiguration" "True" added to Screen "Screen1".
Option "AllowEmptyInitialConfiguration" "True" added to Screen "Screen2".
Option "AllowEmptyInitialConfiguration" "True" added to Screen "Screen3".
Backed up file '/etc/X11/xorg.conf' as '/etc/X11/xorg.conf.backup'
New X configuration file written to '/etc/X11/xorg.conf'
[root@XX ~]# nvidia-settings -a '[fan:0]/GPUTargetFanSpeed=100'
ERROR: Unable to load info from any available system
Has anyone been successful in pulling up the GUI? I cannot get that to work either:
[root@XX ~]# nvidia-settings
ERROR: Unable to load info from any available system
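One thing that may be worth checking here, though I can’t confirm it is the cause in this case: nvidia-settings talks to a running X server, so from a root shell or an SSH session with no DISPLAY set it may have nothing to connect to. A new xorg.conf also only takes effect after X is restarted. The sketch below picks a display to try. The helper name pick_display and the fallback :0 are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: decide which X display nvidia-settings should use.
# Falls back to :0, which is only an assumption about where the local
# X server is listening.
pick_display() {
    if [ -n "${DISPLAY:-}" ]; then
        echo "$DISPLAY"
    else
        echo ":0"
    fi
}

# Print the query command instead of running it, so it can be inspected.
echo "DISPLAY=$(pick_display) nvidia-settings -q '[fan:0]/GPUCurrentFanSpeed'"
```

If the query works with the display set explicitly, the same prefix can be used for the -a assignments. When running as root, the X authority file of the user who owns the session may also be needed.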
I would say those are temperatures in the normal range for a GPU operating under full load. Looking at my GPU’s temperature right now, it shows 85 deg C, while running at 92% load. Your CPU is likely running cooler because it is not anywhere close to fully loaded (or maybe it’s a very efficient CPU). The six-core Intel CPU in my workstation here is under 100% continuous load from 14 tasks and currently shows a temperature of 86 deg C.
If those are two identical Titans, it seems they may not be getting equal air flow. You might want to try spacing them further apart if the PCIe slot assignments allow for this.
Under full load, the drivers will try to not fry your GPU, but the GPU certainly runs too hot at anything over 75-80C!
In a terminal you can set the power limit: sudo nvidia-smi -i 0 -pl 150
Where ‘0’ is your primary GPU (and 1 would be your secondary, and so on…)
and ‘150’ is your capped wattage.
Overclocking and fan control are only accessible through a running X server (for example via the nvidia-settings GUI).
To enable fan control, type in a terminal: sudo nvidia-xconfig --enable-all-gpus
(this might crash your desktop on releases newer than Ubuntu 18.04 if you have more than one GPU)
then do: sudo nvidia-xconfig --cool-bits=4
where ‘4’ is the option to just enable fan curve, and 28 is the option to also enable overclocking.
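Coolbits is a bitmask, which is why 28 covers what 4 does: 28 = 4 + 8 + 16. As I understand the driver README, 4 is manual fan control, 8 is clock offsets, and 16 is overvoltage. The sketch below just decodes a value. The function name coolbits_decode and the feature labels are my own, not Nvidia’s.

```shell
#!/bin/sh
# Hypothetical helper: list which features a Coolbits value turns on.
# Bit meanings are taken from my reading of the driver README; the
# labels printed here are informal names, not official ones.
coolbits_decode() {
    value=$1
    out=""
    if [ $(( value & 4 )) -ne 0 ]; then out="$out fan-control"; fi
    if [ $(( value & 8 )) -ne 0 ]; then out="$out clock-offsets"; fi
    if [ $(( value & 16 )) -ne 0 ]; then out="$out overvoltage"; fi
    # Strip the leading space left by the first append.
    echo "${out# }"
}

coolbits_decode 4
coolbits_decode 28
```

So anyone who only wants to touch the fan can stay with 4 and leave the overclocking and voltage controls disabled.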
I have asked Nvidia many times to add a terminal command for the fan curve and overclocking, as it shouldn’t be hard to implement (without accessing the X configuration from the GUI). Alas, Nvidia has never listened to any suggestions or feature requests (really, it’s probably an afternoon of coding for those who know what they’re doing, and it would save users hundreds of hours).