Strange performance results when modifying the CPU frequency on the TX1

Hi,

With some colleagues, we have observed some strange performance behaviour on the TX1 during our experiments, in which we study how the performance and energy consumption of parallel programs evolve with the frequency and the number of cores.

With a simple matrix multiplication code running on one core, the execution time (in blue in the following graphs) decreases linearly with the frequency on the TK1, which is the expected result. This is not the case on the TX1, where the resulting curve is broken into three different segments…

http://www.univ-orleans.fr/lifo/Members/Sylvain.Jubertie/tk1.png
http://www.univ-orleans.fr/lifo/Members/Sylvain.Jubertie/tx1.png

The same behaviour occurs when using 1, 2 or 4 threads: the curve always breaks at the same frequencies (around 800 MHz and 1.3 GHz).
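
For anyone who wants to check their own measurements for this kind of break, here is a minimal sketch in Python. It assumes that a purely compute-bound run takes a roughly fixed number of cycles, so that time × frequency stays constant, and it reports how far each point departs from the value at the highest frequency. The function name and the sample numbers are made up for illustration; they are not our measurements.

```python
def scaling_deviation(samples):
    """Relative deviation of time * frequency from its value at the top frequency.

    samples: iterable of (freq_mhz, time_s) pairs, in any order.
    Returns a list of (freq_mhz, deviation) sorted by frequency. A deviation
    near 0 means the point scales as a fixed cycle count would predict.
    """
    ordered = sorted(samples)
    ref_freq, ref_time = ordered[-1]      # highest frequency is the reference
    ref_cycles = ref_freq * ref_time
    return [(f, (f * t - ref_cycles) / ref_cycles) for f, t in ordered]


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    made_up = [(500, 2.4), (1000, 1.0), (2000, 0.5)]
    for freq, dev in scaling_deviation(made_up):
        print(f"{freq} MHz: {dev:+.1%}")
```

Frequencies where the deviation jumps are candidates for a segment boundary like the ones in the TX1 graph.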

Has anyone observed the same behaviour?

Are you taking into account various power saving or reduced power modes? Set to max without any kind of throttling and see if things go up linearly after that (or just set to some non-variable clock speed). See:
http://elinux.org/Jetson/TX1_Controlling_Performance

There should also be a script “jetson_clocks.sh” in user ubuntu’s home directory to work with this.

We set the governor to userspace and modify the frequencies using the same command lines as described in your link. Of course everything is automated, but I double-checked manually…
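
For reference, the kind of automation described above can be sketched as follows. This is not our actual script: it is a minimal Python example that assumes the standard Linux cpufreq sysfs files (scaling_governor, scaling_setspeed, scaling_available_frequencies) and root privileges. The function names and the `root` parameter are hypothetical, and whether these files match the commands on the linked page exactly is also an assumption.

```python
from pathlib import Path

# Standard Linux cpufreq sysfs location; overridable so the code can be
# exercised against a scratch directory instead of the real board.
CPUFREQ_ROOT = Path("/sys/devices/system/cpu")


def cpufreq_dir(cpu, root=CPUFREQ_ROOT):
    """Directory holding the cpufreq files for one core."""
    return Path(root) / f"cpu{cpu}" / "cpufreq"


def available_frequencies(cpu, root=CPUFREQ_ROOT):
    """Frequencies (in kHz) that the driver advertises for this core."""
    text = (cpufreq_dir(cpu, root) / "scaling_available_frequencies").read_text()
    return [int(khz) for khz in text.split()]


def set_fixed_frequency(cpu, khz, root=CPUFREQ_ROOT):
    """Switch the core to the userspace governor and pin it to `khz`."""
    if khz not in available_frequencies(cpu, root):
        raise ValueError(f"{khz} kHz is not an advertised frequency")
    directory = cpufreq_dir(cpu, root)
    (directory / "scaling_governor").write_text("userspace")
    (directory / "scaling_setspeed").write_text(str(khz))
```

A benchmark loop would call set_fixed_frequency(0, khz) for each value returned by available_frequencies(0) before timing a run.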

One thing which may surprise people trying to control performance settings is that there is an init script which steps in and interferes with them (I’m not positive, but I think this is from the Ubuntu init scripts and not necessarily part of what nVidia provides). To remove this automatic scaling:

sudo update-rc.d -f ondemand remove

After checking all the board parameters, we found that the strange behaviour occurs when the memory is set to a low frequency… After raising it to the maximum value, the results are comparable to those obtained on the TK1.
Still, we will need to investigate what happens exactly at low memory frequencies…
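
Since the memory clock turned out to matter, it may help to record it next to the CPU clock for every run. Below is a minimal Python sketch of that idea. The EMC path is an assumption (a debugfs location that needs root and may differ between L4T releases), and the helper names are hypothetical.

```python
from pathlib import Path

# Assumed location of the memory (EMC) clock rate on the TX1; it relies on
# debugfs being mounted and may not exist on every L4T release.
EMC_RATE = Path("/sys/kernel/debug/clock/emc/rate")
# Standard cpufreq file reporting the current CPU frequency in kHz.
CPU_RATE = Path("/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq")


def read_int(path):
    """Read a single integer from a sysfs or debugfs style file."""
    return int(Path(path).read_text().strip())


def clock_snapshot(cpu_path=CPU_RATE, emc_path=EMC_RATE):
    """Return the CPU and memory clocks as reported at the time of the call."""
    return {"cpu_khz": read_int(cpu_path), "emc_hz": read_int(emc_path)}
```

Storing clock_snapshot() with each timing makes it easy to see afterwards whether a break in the curve coincides with a change of memory frequency.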

I can’t tell you what is going on, but there are multiple clocks and voltages all tied together in tables. I suspect that you’re seeing the result of a complex mixture of settings which only looks like a simple frequency change (think of the frequency as being a bookmark into a large number of settings…you’re only seeing the bookmark title, not the content).