I am using the nvidia-smi command to read the NVIDIA GPU power-cap value. It works fine on some laptops, but on others, such as the Lenovo Legion, it returns very strange values like the one below.
Command: **nvidia-smi --query-gpu=driver_version,name,power.limit,power.draw --format=csv,noheader**
output: **531.18, NVIDIA GeForce RTX 3060, 4294967.50w .**
There are two scenarios:
When I am on battery power, it returns 80 W for power.limit.
When I connect to mains power, it returns 4294967.50 W.
In my case both values are wrong, because my GPU's actual TDP is 130 W.
Could you please help me understand why it returns these strange values and how I can get the real value?
As you can see, power.limit is returning an impossible number.
Hello @vijay88531 and welcome to the NVIDIA developer forums.
The 80 W might very well be correct, depending on the capacity and discharge settings of your battery and the overall system parameters. A TDP of 130 W is the maximum Total Device Power the GPU can reach; it does not mean that the GPU will use that much all the time, especially not on battery.
The second value is of course incorrect. Many things can cause that: the wrong driver version for your device, an old BIOS, old or wrong chipset drivers, missing Windows updates, etc. I would recommend a clean install of the GPU drivers after checking that everything else in the system is up to date.
I hope that helps!
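A note on the number itself. One plausible reading, not confirmed anywhere in this thread, is that 4294967.50 is a sentinel rather than a measurement. NVML reports power limits as unsigned 32-bit integers in milliwatts, and the largest such integer, 4294967295 mW, is 4294967.295 W, which rounds to exactly 4294967.5 when stored as a single-precision float. The Python sketch below only checks that arithmetic; the idea that the driver handed back an all-ones value, and that it passed through a float32, is an assumption.

```python
import struct

UINT32_MAX_MW = 0xFFFFFFFF  # largest unsigned 32-bit value, read here as milliwatts


def to_float32(x: float) -> float:
    """Round a Python float (double precision) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


watts = UINT32_MAX_MW / 1000.0   # 4294967.295 in double precision
watts_f32 = to_float32(watts)    # float32 values in this range are 0.5 apart

print(f"{watts:.3f} W as double")
print(f"{watts_f32:.2f} W as float32")
```

If that reading is right, the value would mean "no valid limit was returned" rather than a real power cap.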
Hi Markus,
So, per your analysis, 80 W is the correct value because the laptop is on battery. But when I run plain nvidia-smi, which prints all values in a boxed table, I can see 130 W under the power cap even on battery power. That is the value I actually want to retrieve. Am I using the wrong field to read the power cap? For now I am using the power.limit parameter.
Well, there is a much easier way to query power consumption. Did you try nvidia-smi -q -d POWER yet?
==============NVSMI LOG==============
Timestamp : Fri Mar 10 17:59:22 2023
Driver Version : 526.47
CUDA Version : 12.0
Attached GPUs : 1
GPU 00000000:0A:00.0
    Power Readings
        Power Management : Supported
        Power Draw : 30.23 W
        Power Limit : 320.00 W
        Default Power Limit : 320.00 W
        Enforced Power Limit : 320.00 W
        Min Power Limit : 100.00 W
        Max Power Limit : 370.00 W
    Power Samples
        Duration : 16.96 sec
        Number of Samples : 119
        Max : 35.44 W
        Min : 29.71 W
        Avg : 30.04 W
I have a desktop 3080, so my values are higher, but this is the overview that shows all the details. If you use the --query-gpu command line, just try removing power.limit, because I think your original output with the 4294967.50W simply had a formatting error.
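For scripting, the limits that -q -d POWER prints are also available as --query-gpu fields. Below is a minimal Python sketch, assuming the field names power.limit, enforced.power.limit, power.default_limit, power.min_limit and power.max_limit (as listed by nvidia-smi --help-query-gpu) and the csv,noheader,nounits format. The helper names, the 1000 W plausibility ceiling, and the sample line are hypothetical and only there for illustration.

```python
import subprocess

FIELDS = [
    "name",
    "power.draw",
    "power.limit",
    "enforced.power.limit",
    "power.default_limit",
    "power.min_limit",
    "power.max_limit",
]
MAX_PLAUSIBLE_W = 1000.0  # hypothetical ceiling; larger readings are treated as bogus


def parse_line(line: str) -> dict:
    """Parse one csv,noheader,nounits line into {field: value or None}.

    Assumes the GPU name itself contains no comma.
    """
    values = [v.strip() for v in line.split(",")]
    row = {}
    for field, value in zip(FIELDS, values):
        if field == "name":
            row[field] = value
            continue
        try:
            watts = float(value)
        except ValueError:  # non-numeric placeholder such as "[N/A]"
            watts = None
        if watts is not None and not 0.0 <= watts <= MAX_PLAUSIBLE_W:
            watts = None  # e.g. the 4294967.50 reading from this thread
        row[field] = watts
    return row


def query_gpus() -> list:
    """Run nvidia-smi (must be on PATH) and return one parsed row per GPU."""
    cmd = [
        "nvidia-smi",
        f"--query-gpu={','.join(FIELDS)}",
        "--format=csv,noheader,nounits",
    ]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return [parse_line(line) for line in out.splitlines() if line.strip()]


# Made-up sample line so the parser can be tried without a GPU.
sample = "NVIDIA GeForce RTX 3060, 25.10, 4294967.50, 80.00, [N/A], [N/A], [N/A]"
print(parse_line(sample))
```

Calling query_gpus() once on battery and once on mains power, and comparing power.limit with enforced.power.limit and power.default_limit, may show which of them matches the figure in the boxed nvidia-smi view.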
Hi Markus,
Thanks a lot for the reply. I will try this command and update here if I get the proper value in one of the fields.
Also, forgive me for asking a different question in this discussion, but could you please help me get the details of the CUDA, RT, and Tensor cores, if possible?
Thanks a lot for your help.
Glad to be able to help.
You need to be a bit more specific about what you want to know. Is it temperatures? Those are not broken down for the different cores.
If it is about the number of each of them, you will find that in the spec sheets (or on Wikipedia, but don’t tell anyone :-) )
Many of the available third-party HW status tools (HWiNFO, for example) also summarize the different HW capabilities of common GPUs.
I think the nvidia-powerd governor is overwriting the manual settings. So first run:
$ sudo systemctl stop nvidia-powerd
I recommend not disabling it at first: if any problem occurs, a restart will put things back to normal. Once you are sure about your settings, write “disable” instead of “stop”.
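To confirm whether the service is actually running before and after the stop, its state can be read with systemctl is-active. A minimal Python sketch follows, assuming a systemd-based Linux system with a unit named nvidia-powerd as in the post above; the unit_state helper and its injectable run parameter are hypothetical names used for illustration.

```python
import subprocess


def unit_state(unit: str = "nvidia-powerd", run=subprocess.run) -> str:
    """Return systemd's state string for a unit, e.g. 'active' or 'inactive'.

    systemctl is-active exits non-zero for anything but 'active',
    so the exit code is deliberately not checked here.
    """
    result = run(["systemctl", "is-active", unit], capture_output=True, text=True)
    return result.stdout.strip() or "unknown"


# Example (on a systemd machine):
#   print(unit_state())   # state before and after `sudo systemctl stop nvidia-powerd`
```

If the state reads "active" again after a reboot, the service was only stopped, not disabled.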