GeForce TITAN throttles itself at 80 °C

I’m trying to run some CUDA code on a shiny new GeForce TITAN under Ubuntu 12.04. The problem is that the card starts throttling itself as soon as it reaches 80 °C, while the fan speed hangs around 55%. How do I get this thing to increase the fan speed rather than throttle itself? This is very unpleasant. I should mention that the card is not connected to a monitor, so attempts to turn on CoolBits and tweak the fan speed manually have failed.

Thanks.

Edit: I should mention I’ve tried various recent drivers. I’m currently running the latest beta, 319.12.

This has been a very well-known issue since the Titan was released, and it is not unique to running CUDA code. I have had success in Windows with a modified BIOS and altered power profiles, which alleviate and perhaps even eliminate the throttling. Since the power settings are not as customizable in Linux, I’m not sure the modded BIOS approach would improve the situation much, but it probably will… give it a shot, and post back your results. I’m eventually looking to test my CUDA code on Linux as well to see how it performs without the WDDM overhead.

Here are a few sources for reading and links to some modded BIOSes that might be useful:
[Official] Nvidia GeForce GTX TITAN Owners' club | Overclock.net
https://forums.geforce.com/default/topic/533422/geforce-drivers/maunelg-why-does-manual-fan-70-cause-downclocking-with-314-09-and-314-14-/1
http://1pcent.com/?p=277

Cool, thanks for the info. Is there any prospect of a fix from Nvidia?

From what I can tell, probably not entirely, but the modded BIOS route seems to have worked for most people.

Such a bizarre bug.

After some hacking around with xorg.conf, I was able to trick the driver into thinking I have a display connected to the Titan. This allowed me to manually set the fan speed to 85%; the temperature went down to around 70 °C, and the card stopped throttling itself. But this also messed up my screen, so I don’t like this solution.

Interesting approach to a solution. Can you expand on how you did that? My Google searches haven’t turned up anything useful on how to accomplish it.

Edit: I came across this: http://forums.freebsd.org/showthread.php?t=34923 One particular (albeit hacky) solution seems to be to physically emulate a VGA screen… perhaps that approach won’t break your X configuration? At any rate, post an acceptable solution when you’ve found one. :)

Basically I followed the advice here:

So I duplicated some sections of xorg.conf to make it think that the Titan is connected to a monitor.
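For anyone wondering what such a duplicated section might look like, here is a minimal sketch; it is not the poster’s actual config. It assumes the NVIDIA driver’s documented `Coolbits` and `ConnectedMonitor` X options. The identifier `TitanFan`, the connector name `CRT-0`, and the helper name `fake_monitor_device` are placeholders for illustration, and the fragment still needs matching Screen and ServerLayout sections around it.

```shell
#!/bin/sh
# Hypothetical helper: print an xorg.conf Device section that asks the
# NVIDIA driver to behave as if a monitor were attached to the given GPU.
# $1 = BusID in X format, e.g. PCI:3:0:0 (decimal, not lspci's hex).
fake_monitor_device() {
    cat <<EOF
Section "Device"
    Identifier "TitanFan"
    Driver     "nvidia"
    BusID      "$1"
    # Coolbits bit 2 (value 4) unlocks manual fan control.
    Option     "Coolbits" "4"
    # Assumed connector name; pretend a display is plugged in here.
    Option     "ConnectedMonitor" "CRT-0"
EndSection
EOF
}
```

Calling `fake_monitor_device PCI:3:0:0 >> xorg.conf.dummy` would append the section to a scratch config, which keeps the experiment away from the xorg.conf that drives the real display.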

This looks promising: https://sites.google.com/site/akohlmey/random-hacks/nvidia-gpu-coolness

But I haven’t been able to get it to work with my setup, which has one real X server.

I had seen that last link you posted a few weeks ago, but couldn’t find it earlier. You mentioned that your screen is “messed up”. Are you able to disable the ‘fake’ screen from within the nvidia-settings GUI, or does that completely nullify being able to set custom fan settings? Just an idea.

I’m tempted to try the VGA emulator from the other link I mentioned (http://www.bononia.it/~renzo/keap/VirXGA.pdf), especially if I do testing and experience the throttling under Linux as well. I suspect it will be no different, as you have found out yourself.

I’m somewhat surprised that raising the fan speed in Linux does not make the card throttle as it does in Windows. That’s one of the issues I experienced myself when I was looking for a solution, and I saw others confirm it in the owners’ thread I mentioned earlier.

If you’re comfortable, try flashing a modded BIOS on your Titan to see if it alleviates the issue without having to mess with fan settings. If you do, make sure you back up the original and use the native DOS/Windows flashing app, though. I’ve flashed mine from Win 7 x64 successfully.

I wasn’t able to disable the fake screen. But I’m no longer running that config so I can’t test things out there now.

I would flash the BIOS, but there are so many to choose from. It seems like some of them also apply an overclock? I’m a bit lost here. I’d like to keep the clocks/voltages the same, just get the damn thing to raise its fan speed when it gets hot.

Yes, some of the BIOSes have other settings altered because people want to overclock their card AND not have it throttle as NVIDIA advertised it… For your case, you might just want to raise the minimum fan speed and perhaps the max power limit. If you don’t care for flashing a BIOS someone else made, you can always back up your BIOS and change it with the Kepler BIOS Tweaker tool; it should be pretty straightforward. Of course, again, that’s a Windows tool, so it might be inconvenient if you’re just running Linux.

All cards are slightly different, which is why there are so many versions floating around: people are attempting to get the maximum clocks out of their particular one.

If you’re inclined, just read through and try a modded BIOS or just cook up your own. I’ve flashed mine with a few different ones with no issues and back to stock as well. It’s just a matter of testing what particular changes will work to keep yours from throttling… and as you have found it might be that the card behaves slightly differently under Linux vs Windows when it comes to throttling.

Thanks for the info. I don’t have easy access to a Windows machine so I’ll pursue this as a last resort when I become desperate.

Found a working solution!

Just a slight modification of Axel’s script to match my system.

The modifications:

  • modify line 67 of cool_gpu so that the regex pattern matches only my Titan.
  • modify line 74 of cool_gpu to use X screen :1 rather than :0, because :0 is used for my real display.
  • comment out the “Files” section of the dummy xorg.conf

Fan is now at 85%, temperature at 70c, Titan is not throttling, and my screen is not messed up.
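The sequence described above can be sketched roughly as follows. This is not the actual cool_gpu script, just an outline under stated assumptions: `GPUFanControlState` and `GPUCurrentFanSpeed` are the nvidia-settings attribute names assumed for 319-era drivers, the `[gpu:0]`/`[fan:0]` indices are a guess for a dummy X server that drives only the Titan, and `set_fan`, `run`, and `DRY_RUN` are made-up names. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: start a throwaway X server on a spare display, then set the fan.
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = spare X display (e.g. :1), $2 = dummy xorg.conf, $3 = fan percent
set_fan() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "Xorg $1 -config $2 &"
    else
        # Background the dummy server so the script can continue.
        Xorg "$1" -config "$2" &
        sleep 5   # crude wait for the server to come up
    fi
    # Take manual control of the fan, then set the target percentage.
    run nvidia-settings -c "$1" \
        -a "[gpu:0]/GPUFanControlState=1" \
        -a "[fan:0]/GPUCurrentFanSpeed=$3"
}
```

For example, `set_fan :1 xorg.conf.dummy 85` prints the two commands for display :1; running with `DRY_RUN=0` as root would actually execute them. Using :1 keeps the dummy server away from the real desktop on :0.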

Glad you found/documented a working solution. I might need to use it myself if I see the same issues later on. How/where do you call the script so that it launches on bootup on Ubuntu?

About 3/4 down the page, there’s an “Installing custom init-scripts” section that seems to make sense:
https://help.ubuntu.com/community/UbuntuBootupHowto

Actually I haven’t yet tried to make it run at boot. Axel speaks of using chkconfig, which doesn’t exist on Ubuntu. But Ubuntu apparently has these equivalents:

http://askubuntu.com/questions/2263/chkconfig-alternative-for-ubuntu-server

Hello,

I am having some problems with my card. Before I send it back to the store for replacement, I would like to try this hack. Which instructions should I follow? Is it going to work for all versions of the drivers?

Haven’t tried it myself, but it’s this page:
https://sites.google.com/site/akohlmey/random-hacks/nvidia-gpu-coolness
with the modifications posted by TheSpoon in post #13 of this thread (assuming you’re running dual GPUs; if not, it might be a bit different).

Hello,
Could you give more details about how you did these steps?

I am running two Titans and am also curious about what you did. When I grep lspci for ‘VGA compatible’ I get

03:00.0 VGA compatible controller: NVIDIA Corporation Device 1005 (rev a1)
0a:00.0 VGA compatible controller: Matrox Graphics, Inc. G200eR2
83:00.0 VGA compatible controller: NVIDIA Corporation Device 1005 (rev a1)

so I changed the regexp on line 67 to

pciid=`lspci | sed -n -e '/VGA compatib.NVIDIA/s/^(…):(…).(.)./print$

(All I did was change nVidia to NVIDIA)
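Since the vendor string differs between systems (nVidia vs. NVIDIA), a case-insensitive match avoids editing the pattern at all. The following is a standalone sketch of that idea, not the original line 67 of cool_gpu; the function name `nvidia_busids` is made up here. It reads lspci-style lines and prints the `PCI:bus:dev:func` form that xorg.conf expects for BusID, converting lspci’s hex to decimal.

```shell
#!/bin/sh
# Sketch: turn lspci-style lines into X "BusID" strings for NVIDIA VGA devices.
# Reads lspci output on stdin; prints one PCI:bus:dev:func line per NVIDIA card.
nvidia_busids() {
    # -i makes the match work for both "nVidia" and "NVIDIA".
    grep -i 'VGA compatible controller: nvidia' |
    while read -r addr desc; do
        bus=${addr%%:*}     # "03" from "03:00.0"
        rest=${addr#*:}     # "00.0"
        dev=${rest%%.*}     # "00"
        func=${rest#*.}     # "0"
        # lspci prints hex; the X BusID wants decimal.
        printf 'PCI:%d:%d:%d\n' "0x$bus" "0x$dev" "0x$func"
    done
}
```

Typical use would be `lspci | nvidia_busids`. With the three controllers listed above, the Matrox device is skipped and the two Titans come out as `PCI:3:0:0` and `PCI:131:0:0`, so a script that expects a single ID would still need to pick one.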

I believe my real display is also :0, so I changed it to :1. Is this what you meant?

I commented out the “Files” section of the xorg.conf (why?). Incidentally, this file should just sit in the same directory as the cool_gpu script, right?

After adding the LSB information block to cool_gpu (see http://wiki.debian.org/LSBInitScripts), I registered the service with

update-rc.d cool_gpu defaults

but when I restart the system, the GPUs are still undercooled. I’m assuming the fan should be high regardless of load. Any tips would be great, thanks.
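For reference, an LSB information block for this case might look like the sketch below. The field names follow the Debian LSBInitScripts page linked above; the dependency facilities, runlevels, and description text are assumptions to adapt, not values taken from Axel’s script.

```shell
#!/bin/sh
### BEGIN INIT INFO
# Provides:          cool_gpu
# Required-Start:    $local_fs $remote_fs
# Required-Stop:     $local_fs $remote_fs
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: Raise NVIDIA GPU fan speed at boot
### END INIT INFO
```

Note that `update-rc.d cool_gpu defaults` only creates the rc?.d symlinks; it expects the script at /etc/init.d/cool_gpu with the executable bit set, and it does not run it. Assuming the script accepts a `start` argument, `sudo service cool_gpu start` is a quick way to test it without rebooting. It may also be worth checking that the script refers to the dummy xorg.conf by absolute path, since a path relative to the script’s directory may not resolve when init runs it.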

What steps did you take after installing your script? I have the same OS, same card, etc., but nothing can raise the Graphics Clock above 575MHz, and the Memory Clock stays at 3004MHz. Were those the numbers you saw before your fix?
Thanks.