Anyone know how to monitor the GPU MHz?

I’d really like to monitor the clock speed of the TK1’s GPU.

I think the GPU is topping at around 550 MHz on my device.

Any ideas on how to either monitor the GPU MHz or reach the full 852 MHz?

On desktop Ubuntu you can check the clock speeds in the PowerMizer section of nvidia-settings.
From a console you can get the same info by running: nvidia-settings -q GPUCurrentClockFreqs
Try ‘nvidia-settings -q all’ for a list of all attributes.

Good tip but no nvidia-settings on my TK1. :(

It’s tough to optimize without knowing your approximate clock speed.

/proc/misc is interesting:

One trick is to query the device properties and launch a full grid of threads that spin for “props.clockRate * milliseconds” cycles. The .clockRate property is in kHz.

On the desktop this computed spin duration seems to closely match the desired duration.

But what it shows on the TK1 is that, when idle, the GPU is clocked very low (as it should be).
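The arithmetic behind the spin trick can be sketched in a few lines of shell. This is only an illustration: the function names are made up, and it assumes the kernel's cycle counter advances at the real GPU clock. Since 1 kHz is one cycle per millisecond, clockRate (kHz) times milliseconds gives the cycle count, and dividing that count by the measured time gives a rough effective clock.

```shell
# Sketch only: estimate the effective GPU clock from a spin-kernel run.
# Assumes the kernel spun for (clock_rate_khz * requested_ms) cycles and
# that the cycle counter advances at the real GPU clock.

# spin_cycles CLOCK_RATE_KHZ REQUESTED_MS -> number of cycles to spin
spin_cycles() {
    echo $(( $1 * $2 ))
}

# effective_khz CLOCK_RATE_KHZ REQUESTED_MS MEASURED_MS -> estimated clock in kHz
effective_khz() {
    cycles=$(spin_cycles "$1" "$2")
    echo $(( cycles / $3 ))
}
```

For example, `effective_khz 852000 100 400` prints 213000: a spin sized for 100 ms at 852 MHz that actually took 400 ms suggests the GPU was running at roughly 213 MHz.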

It also shows that the TK1 is reluctant to upclock the GPU when given a long duration kernel like this. Maybe certain GPU counters need to be triggered.

I’m wasting my time trying to deduce stuff like this and really wish there was a paragraph explaining under which situations the TK1 will be up/down-clocked.

Maybe the kernel sources have some hints…

I don’t have one of these boards (yet) but is it possible to compile nvidia-settings for the system, or does the program simply not contain a necessary reference to the TK1?

I’ve hunted for it and am pretty sure the libraries (NVML?) aren’t on the TK1 and that the Ubuntu ARM 6.5 CUDA Toolkit in beta is not compatible.

I guess we have to hold tight until an OS image with the 340.xx driver shows up?

Determine GPU rates available, in kHz:

# cat /sys/kernel/debug/clock/gbus/possible_rates
72000 108000 180000 252000 324000 396000 468000 540000 612000 648000 684000 708000 756000 804000 852000 (kHz)

Set a rate (e.g. 648000 kHz), written in Hz:

# echo 648000000 > /sys/kernel/debug/clock/override.gbus/rate
# echo 1 > /sys/kernel/debug/clock/override.gbus/state

Verify the rate:

# cat /sys/kernel/debug/clock/gbus/rate

Note, those commands are using a debug interface; it is subject to change in future software releases, and may introduce side-effects or instability depending upon use.
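The kHz/Hz mismatch between possible_rates and the override file is easy to trip over, so here is a small wrapper sketch around the commands above. The function names and the CLOCK_DIR variable are hypothetical (CLOCK_DIR only exists so the paths can be pointed elsewhere for a dry run), and the same caveats about the debug interface apply. Run as root.

```shell
# Sketch only: wraps the debugfs commands above. CLOCK_DIR is a
# hypothetical variable so the paths can be redirected for a dry run.
CLOCK_DIR="${CLOCK_DIR:-/sys/kernel/debug/clock}"

# khz_to_hz KHZ -> the same rate in Hz (possible_rates lists kHz, rate takes Hz)
khz_to_hz() {
    echo $(( $1 * 1000 ))
}

# rate_is_listed KHZ -> succeeds only if KHZ appears in gbus/possible_rates
rate_is_listed() {
    tr ' ' '\n' < "$CLOCK_DIR/gbus/possible_rates" | grep -qx "$1"
}

# set_gbus_khz KHZ -> write the override rate in Hz, then enable the override
set_gbus_khz() {
    rate_is_listed "$1" || { echo "rate $1 kHz not listed" >&2; return 1; }
    khz_to_hz "$1" > "$CLOCK_DIR/override.gbus/rate" &&
        echo 1 > "$CLOCK_DIR/override.gbus/state"
}
```

With this loaded, `set_gbus_khz 648000` would do the same as the two echo commands above, and it refuses a rate that is not in the list.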

Thanks @kulve!

I can now accurately benchmark my HotSort comparison sorting kernels.

For 32-bit keys:

That’s very good!

However a single block launched on a single GK208 SMX reaches a throughput of ~320 Mkeys/sec. when sorting 32K 32-bit keys.

The GK208 DDR3 has twice the registers but slightly less bandwidth so I was expecting the K1 to reach ~300 Mkeys/sec. at 16K 32-bit keys.

But I’m happy with these numbers and will try to tune up the kernels as I learn more about the K1’s GK20A GPU.

Added some login tweaks here for USB autosuspend and locking the GPU to max MHz.

Note that you need to have an HDMI monitor attached. The overrides didn’t work for me in headless mode via an SSH session. Maybe someone can explain why that is.

So those clock rates are for the shader clocks?

What about the memory clocks?

@allanmac, nice tweak! :D

You can override the memory clock by substituting “emc” for “gbus”.
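Here is a sketch of that substitution with the clock name as a parameter. It assumes the emc nodes mirror the gbus ones exactly, which is what the substitution implies but which I have not verified beyond that. The function names and CLOCK_DIR are hypothetical. Run as root.

```shell
# Sketch only: the same override steps with the clock name as a parameter.
# CLOCK_DIR is a hypothetical variable, defaulting to the debugfs path.
CLOCK_DIR="${CLOCK_DIR:-/sys/kernel/debug/clock}"

# override_clock NAME HZ -> write the override rate, then enable it
override_clock() {
    echo "$2" > "$CLOCK_DIR/override.$1/rate" &&
        echo 1 > "$CLOCK_DIR/override.$1/state"
}

# current_rate NAME -> print the rate the clock reports right now
current_rate() {
    cat "$CLOCK_DIR/$1/rate"
}
```

Usage would be `override_clock emc <rate in Hz>` followed by `current_rate emc` to check that it took.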

I just tried it. However, I’m seeing no change in my benchmarks.

My assumption was that the MEM clock will scale up/down very quickly and that seems to be the case.

If you’re interested, you can see the entire DVFS table here:

sudo cat /sys/kernel/debug/clock/dvfs_table

Hi q10, the easiest way to get something running on bootup in a Linux system is typically to put it into your “/etc/rc.local” file using root permissions. See my instructions at http://elinux.org/Jetson/Jetson_TK1_Power#How_to_run_a_command_with_root_privileges_temporarily_or_on_every_bootup. This will get it working even without an HDMI monitor and without waiting for the user to log in.

q10, that is strange. If you read those values, have any of them changed? E.g., after booting up in headless mode, run this as the root user:

cat /sys/kernel/debug/clock/override.gbus/rate
cat /sys/kernel/debug/clock/override.gbus/state
cat /sys/devices/system/cpu/cpuquiet/tegra_cpuquiet/enable
cat /sys/devices/system/cpu/cpu0/online
cat /sys/devices/system/cpu/cpu1/online
cat /sys/devices/system/cpu/cpu2/online
cat /sys/devices/system/cpu/cpu3/online
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

Perhaps something resets these values to defaults later in the bootup process, or maybe those settings don’t work without an HDMI monitor. Tell me what those lines output, then I’ll forward it to people at NVIDIA.
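If it helps when pasting the results back here, a small loop can print each path next to its current value. This is just a convenience sketch, and the function name is made up.

```shell
# Sketch only: print each setting next to its current value.
# dump_settings FILE... -> one "path = value" line per file
dump_settings() {
    for f in "$@"; do
        if [ -r "$f" ]; then
            printf '%s = %s\n' "$f" "$(cat "$f")"
        else
            printf '%s = <unreadable>\n' "$f"
        fi
    done
}
```

Call it as root with the sysfs paths listed above as arguments, e.g. `dump_settings /sys/kernel/debug/clock/override.gbus/rate /sys/kernel/debug/clock/override.gbus/state`.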

If you are running headless and reboot and then find that you’re getting an “echo: write error: No such file or directory”, the trick is to prod the GPU at least once with a CUDA application. A simple deviceQuery will do.

To be more clear… if you reboot a headless TK1 you cannot override the MHz and querying the gbus/rate indicates that the GPU is running at 12 MHz.

If you then run a CUDA application you can then go back and override the GPU MHz.
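That sequence can be sketched as a retry: try the override, and only if the write fails, run a CUDA program once and try again. WAKE_CMD and CLOCK_DIR are hypothetical variables, and the default `./deviceQuery` is an assumption about where the sample binary lives on your board. Run as root.

```shell
# Sketch only: try the override, and if the write fails, run a CUDA
# program once and retry. WAKE_CMD is a hypothetical variable; point it
# at wherever your deviceQuery binary lives.
CLOCK_DIR="${CLOCK_DIR:-/sys/kernel/debug/clock}"
WAKE_CMD="${WAKE_CMD:-./deviceQuery}"

# write_override HZ -> write the override rate, then enable it
write_override() {
    echo "$1" > "$CLOCK_DIR/override.gbus/rate" &&
        echo 1 > "$CLOCK_DIR/override.gbus/state"
}

# override_with_wake HZ -> succeeds if the override sticks on either attempt
override_with_wake() {
    write_override "$1" 2>/dev/null && return 0
    $WAKE_CMD > /dev/null || return 1
    write_override "$1"
}
```

For example, `override_with_wake 852000000` would only launch the CUDA program when the first write is rejected.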

@Shervin, this workaround seems like something that can be easily fixed or left as is but explained. :)

Several people at NVIDIA have tried headless mode with those changes and it worked for us, so unfortunately it looks like a behaviour that is only affecting some users. If it is just a matter of needing to run a CUDA or GPU application first, then can one of you test whether running deviceQuery in your “/etc/rc.local” (just before the flags are changed) solves the problem? If so I’ll update the Wiki.

Also, a user, Santyago, suggested changing the “/etc/X11/xorg.conf” file at https://devtalk.nvidia.com/default/topic/755668/embedded-systems/jetson-tk1-l4t-xserver-without-display/, but it sounds like you guys just need to run a CUDA app to fix your problem.

I’m running mine in headless mode -

This is from my rc.local

# Turn on Jumbo Frames

ifconfig eth0 mtu 9000

# System Resources for Video and CPU

echo -1 > /sys/kernel/debug/tegra_hdmi/hotplug
echo 4 > /sys/class/graphics/fb0/blank
echo 0 > /sys/devices/system/cpu/cpuquiet/tegra_cpuquiet/enable
echo 1 > /sys/devices/system/cpu/cpu0/online
echo 1 > /sys/devices/system/cpu/cpu1/online
echo 1 > /sys/devices/system/cpu/cpu2/online
echo 1 > /sys/devices/system/cpu/cpu3/online

Using your settings above, I have created two scripts to set the GPU speed to min and to max. They seem to work with the headless mode described here.

I found the CPU online/offline and Tegra HDMI / fb0 blanking settings in various posts on here and they seem to work quite well.
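For anyone wanting something similar, here is a rough sketch of how the min and max cases could share one script. It is not the actual pair of scripts mentioned above. The function names and CLOCK_DIR are hypothetical, and it relies on the gbus debugfs nodes from earlier in the thread. Run as root.

```shell
# Sketch only: one script covering both the "min" and "max" cases.
# CLOCK_DIR is a hypothetical variable, defaulting to the debugfs path.
CLOCK_DIR="${CLOCK_DIR:-/sys/kernel/debug/clock}"

# pick_rate_khz min|max -> lowest or highest entry in gbus/possible_rates
pick_rate_khz() {
    rates=$(tr ' ' '\n' < "$CLOCK_DIR/gbus/possible_rates" | grep -E '^[0-9]+$' | sort -n)
    case "$1" in
        min) echo "$rates" | head -n 1 ;;
        max) echo "$rates" | tail -n 1 ;;
        *)   echo "usage: pick_rate_khz min|max" >&2; return 1 ;;
    esac
}

# lock_gpu min|max -> override the GPU clock to that rate (the file takes Hz)
lock_gpu() {
    khz=$(pick_rate_khz "$1") || return 1
    echo $(( khz * 1000 )) > "$CLOCK_DIR/override.gbus/rate" &&
        echo 1 > "$CLOCK_DIR/override.gbus/state"
}
```

Then `lock_gpu max` or `lock_gpu min` can go at the end of rc.local or be run by hand.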