nvidia-smi -gtt doesn't work on 535.104.05

Arch Linux Desktop PC
6.4.12-zen1-1-zen
Happens on both open and proprietary versions of 535.104.05
GPU 0: NVIDIA GeForce RTX 3060 Ti (UUID: GPU-ba73bc75-4c91-6012-1365-c8e673737f6b)

Steps to reproduce:

  1. nvidia-smi -gtt 65
  2. run any heavy graphical app

Expected behavior:
The GPU starts to throttle at the set temperature, and the temperature doesn't rise above the set value.
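For anyone reproducing, this is roughly how I watch it (a sketch; the query fields assume a reasonably recent nvidia-smi):

  # set the target temperature (requires root)
  sudo nvidia-smi -gtt 65
  # confirm it was applied
  nvidia-smi -q -d TEMPERATURE | grep -i target
  # watch temperature, SM clock and power draw once per second while the heavy app runs
  nvidia-smi --query-gpu=timestamp,temperature.gpu,clocks.sm,power.draw --format=csv -l 1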

Previously, this setting did work as expected.

Uploading on this forum never works for me, so here is the nvidia-bug-report:
https://github.com/NVIDIA/open-gpu-kernel-modules/files/12440800/nvidia-bug-report.log.gz

Original thread on GitHub:

@ewbteewbte
Thanks for writing to us. I have filed bug 4260165 internally for tracking purposes.
I will try to replicate the issue on my test system first and update on further proceedings.

Setup - Dell Precision T7610 + Genuine Intel(R) CPU @ 2.30GHz + Ubuntu 22.04.1 LTS + kernel 5.19.0-46-generic + NVIDIA GeForce GTX 1650 SUPER + Driver 535.104.05 + Display DELL G3223D
I tried the steps below and am seeing the temperature throttle at 66-67C at maximum. Can you please confirm whether you are seeing a similar range, or whether it increases further in your setup? (A simple logging sketch follows the steps below.)

  1. Run command “nvidia-smi -gtt 65”
  2. Launched 5 instances of the Unigine Heaven benchmark; the GPU temperature throttles at 66-67C at maximum.
  3. Tried the above two steps a couple of times and observed the same behavior.
  4. Later I rebooted the system and then ran 5 instances of the benchmark; the temperature quickly climbs to 74-75C.
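To capture the range, something like this can log the peak temperature over a run (a sketch, not part of the original report; the query fields assume a recent nvidia-smi):

  # log temperature and SM clock once per second during the benchmark
  nvidia-smi --query-gpu=timestamp,temperature.gpu,clocks.sm --format=csv,noheader -l 1 > gtt-log.csv
  # afterwards, show the hottest sample
  sort -t, -k2 -n gtt-log.csv | tail -n 1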

It does increase up to 87C, after which I quit the benchmark app or game because I don't want to risk damaging my GPU.
Normally I use “-gtt 80”, and my GPU had never surpassed 80C since I bought it a year ago.
Anyway, I provided the log file; shouldn't that be enough?

@ewbteewbte
Please confirm which benchmarks you tried.

Unigine Superposition, Elden Ring, Dark Souls 3

@ewbteewbte
Do you know the last passing driver version where the issue doesn't occur?

Hard to tell; the last time I used demanding apps was in April or May.
I run all games with a 60 fps limit, and lately I was only playing games like Project Zomboid, which don't utilize the GPU much, so I couldn't notice the change in behavior until I tried GPU-heavy games again.

@ewbteewbte
Is it possible for you to test with a 530-branch driver, or even a bit older, to see if the problem exists in earlier branches as well?
I am still not able to reproduce the issue on a couple of my test systems.

It is not possible. Could it be 30-series specific?

@amrits is it tied to the Coolbits setting? I have always used “12”.
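For reference, I enable Coolbits the usual way with nvidia-xconfig (a sketch of the command, in case it matters; 12 combines the manual-fan-control and clock-offset bits):

  # writes Option "Coolbits" "12" into the X configuration (requires root)
  sudo nvidia-xconfig --cool-bits=12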

@ewbteewbte
I am seeing similar behavior with Arch Linux + kernel 6.4.12-arch1-1 + NVIDIA GeForce RTX 3080 + Driver 535.104.05, where the GPU temperature peaks around 74C after running the Unigine Superposition benchmark.
I shall check for the cause and update.

Good! Just in case: with 535.113.01 the issue is still present.

With 545.29.02 the issue is still present.

With 545.29.06 the issue is still present.

With 550.54.14 it does attempt to lower the GPU clocks a little, but the temperature is still able to surpass the value set by -gtt
(for instance, it drops from about 2100 MHz to roughly 1900 MHz).
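A quick way to see whether the thermal slowdown actually engages while the load is running (a sketch; the exact wording of the throttle/event reasons section varies between driver branches):

  # watch temperature and SM clock while the game or benchmark runs
  watch -n 1 'nvidia-smi --query-gpu=temperature.gpu,clocks.sm --format=csv,noheader'
  # check the reported slowdown reasons and the configured target temperature
  nvidia-smi -q -d PERFORMANCE,TEMPERATURE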

Hi all,
We have analyzed the issue from our local repro and observed that the thermal policy is functioning as expected; however, the workload is too intense to reduce the temperature further. Accordingly, we need to revise the thermal settings.

@amrits

The issue is still present in 560.35.05.
I guess it isn't fixed in the 570 release either.

Can you please acknowledge the issue, raise its severity, and fix this?
It has been 2 years already, and I find that really unacceptable.

As others have already said, the temperature target is almost disregarded, which makes it useless.
Please look at how the thermal policy works in the Windows drivers and replicate the same behavior.

The GPU temperature in Windows, even with very intensive workloads, doesn't go more than a few degrees over the target.

I use an RTX 3090 in my NAS (Debian Bullseye) for inference, with a very high thermal constraint due to the small form factor.
The target temperature (65C) is not even remotely respected; the GPU goes over it by 15-20C.
This is while doing translations with a 4B Gemma 3 model on Ollama, which doesn't cause more than 40% GPU utilization.
It's not a benchmark; it's not even a very intensive tensor workload.

The temperature target is the only way to reliably limit the temperature without depending on external factors.
Right now I'm forced to use the power limit and, of course, I have to adjust that limit based on ambient temperature,
which means I will only discover it's too high after a crash.
Not only that, the power limit is completely bugged as well.
Right now I have to set 175 W, and below 200 W performance drops massively.
The GPU clock is almost always around 400-500 MHz, with short bursts to 1500 MHz, and despite that the temperatures are still high; it's just plainly horrible.
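For completeness, this is roughly what the power-limit workaround looks like (a sketch; 175 W is the value mentioned above, and the limit usually needs to be reapplied after a reboot):

  # show the current limit and the allowed range
  nvidia-smi -q -d POWER
  # cap the board power at 175 W (requires root)
  sudo nvidia-smi -pl 175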

There is only one option left: using the -lgc switch to limit the clock.
It works perfectly, with a minimal performance drop (the temperature during inference goes up mostly because of the boost clock).
I can set “-lgc 0,1710” and keep decent temperatures even in the worst conditions (my target is 75C max) with excellent performance.
But even the clock limit is bugged!
This is my home NAS and idle power consumption is crucial; I don't like and I don't want to waste energy.
The lower limit at 0 should, as expected, allow the GPU to drop to its idle clock. But it doesn't.
The clock at idle is 200-something MHz, which means 25 W at idle instead of 11 W.
An additional 14 W at idle for absolutely nothing, 24 hours a day; it's a crime.
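For reference, the clock-limit workaround and the check I use for the idle behaviour (a sketch):

  # lock the GPU clock range to 0-1710 MHz (requires root)
  sudo nvidia-smi -lgc 0,1710
  # verify what the GPU actually idles at (clock and power every 5 seconds)
  nvidia-smi --query-gpu=clocks.sm,power.draw --format=csv -l 5
  # undo the lock
  sudo nvidia-smi -rgc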

Please try to fix it. You might think it's irrelevant, but it's crucial for a large part of your customer base.