Hello,
I am monitoring CPU frequency during a stress test that loads both the CPU and GPU on my Orin NX 8GB in 20W mode, with jetson_clocks applied.
During the test, I record the output of tegrastats.
While it’s running I see the CPU frequencies “dip”, but no overcurrent (OC1/2/3) events are recorded.
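For context, this is a minimal sketch of how the per-core frequencies can be pulled out of a recorded tegrastats log to spot the dips (it assumes the usual "CPU [load%@freqMHz,...]" field format, with offline cores shown as "off"):

```python
import re

# Sketch: extract per-core CPU frequencies (MHz) from one tegrastats line.
# Assumes the usual "CPU [load%@freqMHz,load%@freqMHz,...]" field format.
CPU_FIELD = re.compile(r"CPU \[([^\]]+)\]")

def cpu_freqs_mhz(line):
    m = CPU_FIELD.search(line)
    if not m:
        return []
    freqs = []
    for core in m.group(1).split(","):
        if core == "off":
            freqs.append(None)                     # offline core
        else:
            freqs.append(int(core.split("@")[1]))  # "51%@1497" -> 1497
    return freqs

# Hypothetical sample line; a 339 MHz entry would be one of the "dips"
sample = "RAM 3162/7337MB CPU [51%@1497,13%@1497,off,off,45%@339,7%@1497]"
print(cpu_freqs_mhz(sample))  # [1497, 1497, None, None, 339, 1497]
```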
I also see corresponding events in dmesg saying:
[ 587.529068] cpufreq: cpu0,cur:1639000,set:1497600,set ndiv:117
[ 602.577336] cpufreq: cpu0,cur:339000,set:1497600,set ndiv:117
[ 603.680896] cpufreq: cpu4,cur:1379000,set:1497600,set ndiv:117
[ 615.623390] cpufreq: cpu0,cur:1617000,set:1497600,set ndiv:117
[ 622.356034] cpufreq: cpu0,cur:1368000,set:1497600,set ndiv:117
[ 624.475794] cpufreq: cpu0,cur:245000,set:1497600,set ndiv:117
[ 659.973306] cpufreq: cpu0,cur:1616000,set:1497600,set ndiv:117
These messages seem to come from this function:
static unsigned int tegra194_get_speed(u32 cpu)
{
	struct tegra194_cpufreq_data *data = cpufreq_get_driver_data();
	u32 clusterid = data->phys_ids[cpu].clusterid;
	struct cpufreq_frequency_table *pos;
	unsigned int rate;
	u64 ndiv;
	int ret;

	/* reconstruct actual cpu freq using counters */
	rate = tegra194_calculate_speed(cpu);

	/* get last written ndiv value */
	ret = data->soc->ops->get_cpu_ndiv(cpu, data->phys_ids[cpu].cpuid, clusterid, &ndiv);
	if (WARN_ON_ONCE(ret))
		return rate;

	/*
	 * If the reconstructed frequency has acceptable delta from
	 * the last written value, then return freq corresponding
	 * to the last written ndiv value from freq_table. This is
	 * done to return consistent value.
	 */
	cpufreq_for_each_valid_entry(pos, data->tables[clusterid]) {
		if (pos->driver_data != ndiv)
			continue;

		if (abs(pos->frequency - rate) > 115200) {
			pr_info("cpufreq: cpu%d,cur:%u,set:%u,set ndiv:%llu\n",
				cpu, rate, pos->frequency, ndiv);
		} else {
			rate = pos->frequency;
		}
		break;
	}

	return rate;
}
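As a quick sanity check on the 115200 kHz guard band in that function, every cur value in the dmesg lines above really is more than 115200 kHz away from the 1497600 kHz set point, which is exactly the condition under which the pr_info fires:

```python
# cur values (kHz) taken from the dmesg lines above, all against the
# same set point; GUARD_KHZ is the threshold used in tegra194_get_speed()
SET_KHZ = 1497600
GUARD_KHZ = 115200

for cur in (1639000, 339000, 1379000, 1617000, 1368000, 245000, 1616000):
    delta = abs(cur - SET_KHZ)
    print(f"cur={cur} delta={delta} logged={delta > GUARD_KHZ}")
# every delta exceeds 115200, so each sample gets logged
```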
What I don’t understand yet is whether the CPU is actually throttling/changing frequency, or whether it’s a measurement artifact.
To prove whether or not it’s a measurement artifact, I want to limit scaling_available_frequencies to just 1497600 and re-run the test.
I run jetson_clocks before the test, so CPU scaling should be locked to the max frequency, but I don’t know why I get those “dipping” events.
I see that in JetPack 6 there are per-cluster OPP tables in the device tree, but I don’t see such tables in JetPack 5.1.2.
What would be the best way to limit scaling_available_frequencies so that I can prove whether the CPU frequency is actually scaling or whether it’s just a tegrastats measurement artifact?