Performance counters reset themselves

Dear All,

I am seeing unexpected behavior when accessing the performance counters.

  1. I downloaded the files Tegra210_Linux_R24.1.0_aarch64.tbz2 and Tegra_Linux_Sample-Root-Filesystem_R24.1.0_aarch64.tbz2.
  2. I flashed them to a Jetson X1 board following this guide: http://developer.download.nvidia.com/embedded/L4T/r24_Release_v1.0/24.1_64bit/t210ref_release_aarch64/l4t_quick_start_guide.txt?autho=1464695602_7b8421ecb5e7e30bd746093f403fa143&file=l4t_quick_start_guide.txt
  3. I ran the test code below several times.
    The code sets up and starts a performance counter on one of the processor cores, reads the counter, sleeps for 1 second, and then reads the counter again. The returned values are meaningful only when the sleep() call is commented out. Otherwise the read after sleep() returns 0.
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

/*
Compiled by: gcc test.c -o test -O0
*/
#define ARMV8_PMCR_E            (1 << 0) /* Enable all counters */

int main(int argc, char *argv[])
{
  uint32_t val;
  /* setup */
  
  asm volatile("mrs %0, pmcr_el0" : "=r" (val));
  asm volatile("msr pmcr_el0, %0" : : "r" (val|ARMV8_PMCR_E));
  asm volatile("msr pmcntenset_el0, %0" : : "r" (0x80000003));
  asm volatile("msr pmevtyper1_el0, %0" : : "r" (0x8)); // ins retired
  printf("Starting test...\n");
  /*first read*/  
  asm volatile("mrs %0, pmevcntr1_el0" : "=r" (val)); //read
  printf("Value of counter 1: 0x%x\n",val);
  /* sleep */
  sleep(1);
  /* second read */
  asm volatile("mrs %0, pmevcntr1_el0" : "=r" (val)); //read
  if(val == 0){
    printf("Test failed\n value of counter 1 is: 0x%x -> Counter has been disabled!!!\n",val);
  }else{
    printf("Test successful (Value of counter 1 is: 0x%x)\n",val);
  }  
  return 0;
}

The code can be compiled on the Jetson with

gcc test.c -o test -O0

and run by

./test

or run pinned to a specific core (the argument is a CPU affinity mask, so 2 selects CPU 1) with

taskset 2 ./test

I would like to ask whether anybody has experienced this behavior, and would kindly ask someone to verify it.

Thank you for your time
Premysl Houdek

Hi Premysl,

most probably the counter is overflowing during the 1-second delay.

Is that all of your code, or just a sample for us to view?
Does it work when sleep(1) is replaced by some other dummy code,
or when you sleep or spin for a shorter time, say a few microseconds?

regards
Bibek
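The overflow hypothesis can be sanity-checked with a little arithmetic. The sketch below assumes the ARMv8 event counters are 32 bits wide and uses an assumed rate of about 2e9 retired instructions per second (roughly one instruction per cycle near 2 GHz); neither the rate nor the helper names come from this thread. It also shows that unsigned subtraction gives the right delta across a single wrap, so a wrapped counter would normally not read back as exactly 0.

```c
#include <stdint.h>

/* Seconds until a counter of the given width (bits < 64) wraps
 * at a steady event rate. The rate is an assumption of the caller. */
double seconds_to_wrap(unsigned bits, double events_per_second)
{
    return (double)(1ULL << bits) / events_per_second;
}

/* Events between two reads of a 32-bit counter.
 * Unsigned subtraction is modulo 2^32, so this stays correct
 * across one wrap of the counter. */
uint32_t counter_delta32(uint32_t before, uint32_t after)
{
    return after - before;
}
```

For example, seconds_to_wrap(32, 2e9) is a little over 2 seconds, so under these assumptions a core that is mostly asleep for 1 second would not be expected to wrap the counter.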

Hi Premysl,

We do get counter values after the sleep when we run your test code.
Please check from kernel space (EL1) whether the EN bit (bit 0) is set in the “pmuserenr_el0” register on all CPUs.

$ ./test
Starting test...
Value of counter 1: 0x20e6c3b9
Test successful (Value of counter 1 is: 0x20e8f733)
$ ./test
Starting test...
Value of counter 1: 0x21054bff
Test successful (Value of counter 1 is: 0x21078e84)
$ taskset 2 ./test
Starting test...
Value of counter 1: 0x7b2dba
Test successful (Value of counter 1 is: 0x7d1a37)
$ taskset 3 ./test
Starting test...
Value of counter 1: 0xa366c0
Test successful (Value of counter 1 is: 0xa5716f)

Regards,
Sumit Gupta

Hello Sumit and Bibek

thank you for your attention. Did you run the test on the original kernel? In my case it sometimes runs correctly, but most of the time it does not. Would you mind reflashing the kernel and trying once more?

A read of the PMCNTENSET_EL0 register returns 0x0 after the sleep, which is why the counters are no longer counting. I do not know why they are suddenly disabled and I cannot find anything about it. The same behavior can be reproduced in kernel mode with the same result. I posted the user-space application only because it is a little shorter.

I have tried building my own kernel from the sources provided by NVIDIA. I modified the .config (disabling and enabling the perf API and so on), but the result was always the same. After many attempts and reflashes I have the feeling that it depends on the flashing itself: one version of the kernel worked, but when I reflashed the SAME version the counters stopped counting.

The counters do not overflow. If I use a busy loop instead of sleep(), it works. If I use usleep(), it works for values below 800, so the counters are not reset when the delay is shorter than 800 us.
The posted code and the tar files are all that was flashed onto the Jetson X1. Nothing else was running.

Přemysl

We are checking it. Will get back to you on this.

Hi Premysl,

Please use the perf_event_open() syscall to read counter values from user space.
Below are some example code references:
http://www.carbondesignsystems.com/virtual-prototype-blog/using-the-arm-performance-monitor-unit-pmu-linux-driver
Manpage of PERF_EVENT_OPEN

Please also check whether the CPU id is the same after the sleep.

Regards,
Sumit Gupta
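A minimal sketch of the perf_event_open() approach suggested above, assuming the kernel was built with perf events enabled and that the perf_event_paranoid setting allows unprivileged use. The helper names open_instruction_counter and count_across_sleep are made up for illustration. With pid 0 and cpu -1 the counter is attached to the calling thread rather than to one core, so the kernel is responsible for keeping the count consistent when the thread migrates or the core idles.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* glibc provides no wrapper for this syscall, so go through syscall(). */
static long perf_event_open(struct perf_event_attr *attr, pid_t pid, int cpu,
                            int group_fd, unsigned long flags)
{
    return syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags);
}

/* Open a counter of user-space instructions retired by the calling
 * thread on whichever CPU it runs. Returns a file descriptor or -1. */
int open_instruction_counter(void)
{
    struct perf_event_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof(attr);
    attr.config = PERF_COUNT_HW_INSTRUCTIONS;
    attr.disabled = 1;        /* start stopped, enable explicitly below */
    attr.exclude_kernel = 1;  /* count user space only */
    attr.exclude_hv = 1;
    return (int)perf_event_open(&attr, 0, -1, -1, 0);
}

/* Count instructions retired by this thread across sleep(seconds).
 * Returns 0 and stores the count on success, -1 on any failure. */
int count_across_sleep(unsigned seconds, uint64_t *count)
{
    ssize_t n;
    int fd = open_instruction_counter();

    if (fd < 0)
        return -1;
    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
    sleep(seconds);
    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
    n = read(fd, count, sizeof(*count));  /* default read_format: one u64 */
    close(fd);
    return n == (ssize_t)sizeof(*count) ? 0 : -1;
}
```

Calling count_across_sleep(1, &n) from main() and printing n mirrors the original test. The count is expected to be small, because the thread retires almost no user-space instructions while it sleeps.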

Hi Premysl,

The PM counters are reset after sleep(1) because the CPU enters the power-gating state of the cpuidle driver.

Regards,
Sumit Gupta

If you want to use the PMU counters directly instead of the syscall mentioned above, please disable cpuidle or pass the “nohlt” parameter on the kernel command line.

  1. To disable cpuidle: comment out CONFIG_CPU_IDLE in the defconfig.
  2. To pass nohlt: first make sure “select GENERIC_IDLE_POLL_SETUP” is present in the arm64 Kconfig.
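Before rebuilding the kernel, it may help to see which idle states the running kernel exposes. The sketch below reads the standard Linux cpuidle sysfs directory; whether the R24.1 kernel populates it on this board is an assumption, and the helper names are hypothetical. The same directory normally has a "disable" attribute per state that root can write 1 to, which is a runtime alternative not mentioned in this thread.

```c
#include <stdio.h>
#include <string.h>

/* Build the sysfs path of one attribute of one cpuidle state.
 * Returns what snprintf returns. */
int cpuidle_attr_path(char *buf, size_t len, int cpu, int state,
                      const char *attr)
{
    return snprintf(buf, len,
                    "/sys/devices/system/cpu/cpu%d/cpuidle/state%d/%s",
                    cpu, state, attr);
}

/* Read the name of an idle state. Returns 0 on success,
 * -1 if the state (or cpuidle sysfs itself) is absent. */
int cpuidle_state_name(int cpu, int state, char *name, size_t len)
{
    char path[128];
    FILE *f;

    cpuidle_attr_path(path, sizeof(path), cpu, state, "name");
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (!fgets(name, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    name[strcspn(name, "\n")] = '\0';  /* strip trailing newline */
    return 0;
}

/* Count the idle states exposed for a CPU (0 if none are visible). */
int cpuidle_state_count(int cpu)
{
    char name[64];
    int n = 0;

    while (cpuidle_state_name(cpu, n, name, sizeof(name)) == 0)
        n++;
    return n;
}
```

A count of 0 for a CPU would suggest that cpuidle is already disabled or not exposed, in which case the power-gating explanation above would need a second look.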

Hello,

Is there something similar to perf_event_open() in the QNX environment? I want to do the same thing under QNX.

Regards
Sharat Raj

No, QNX is not supported on the Jetson platform.

Thanks