pgcollect: Couldn't allocate hardware counters...

After my (now-solved) look at pgcollect and the accelerators, I decided to try out pgcollect and OProfile. Since I don’t (currently) have a PAPI kernel, I thought that would be the next logical step. In doing so, though, I find myself unable to use any option other than -hwtime=.

First, my machine is a Core 2, so it should be able to do -dcache and -imisses (according to pgoprun). But when I try either one:

> pgcollect -dcache runsorad-vector32.exe
Couldn't allocate hardware counters for the selected events.
pgprof: could not set oprofile event profiling mode

I’ve also tried defining random events in .pgoprun:

cpu_clock () {
   event[${#event[@]}]=--event=CPU_CLK_UNHALTED:299200:0x00:0:1
   event[${#event[@]}]=--event=MUL:299200
}

and then running it:

> pgcollect -es-function cpu_clock runsorad-vector32.exe
Couldn't allocate hardware counters for the selected events.
pgprof: could not set oprofile event profiling mode
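
For reference, each --event value in the .pgoprun function above follows OProfile's name:count:unitmask:kernel:user layout, and trailing fields may be left off (as in MUL:299200). The sketch below only splits a spec into those fields so they are easier to read. describe_event is a hypothetical helper name, not part of pgcollect or OProfile.

```shell
# describe_event (hypothetical helper): split an OProfile event spec of the
# form name:count:unitmask:kernel:user and print each field by name.
describe_event() {
    spec=${1#--event=}      # drop a leading "--event=" if present
    old_ifs=$IFS
    IFS=:
    set -- $spec            # split on ":" into positional parameters
    IFS=$old_ifs
    echo "name=${1:-} count=${2:-} unitmask=${3:-} kernel=${4:-} user=${5:-}"
}

describe_event --event=CPU_CLK_UNHALTED:299200:0x00:0:1
describe_event --event=MUL:299200
```

Fields that are omitted in the spec print as empty values. The sketch does not say what defaults OProfile applies to them.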

Now, I’m guessing this means there is something hinky with my OProfile setup, but I’m not sure how to tell that apart from a pgcollect issue. I have confirmed I have sudo control of opcontrol (and I doubt -hwtime= would work otherwise), and -list-events does confirm “Core 2”. So I thought I’d ask here first before moving to the OProfile help list.

Thanks,
Matt

Matt,
Sorry you are having trouble with pgcollect. Here are a few things you can try:

(1) Check that the “opcontrol” command is in your path. Presumably it is if -hwtime is working. pgcollect uses the OProfile command “opcontrol”.

Also, there may be some funny business going on with sudo. Try running a bash shell and using ‘type -p opcontrol’, then ‘sudo type -p opcontrol’ and make sure the results are the same.

(2) Run ‘pgcollect -list-events’. The very top of the output from this command should show what OProfile thinks your processor is. If, on your Core 2 system, it doesn’t say something like:

oprofile: available events for CPU type “Core 2”

then there may be a problem with your OProfile installation.

(3) Also check that the list of available events contains “L2_LINES_IN” and “MUL”. If (2) checks out OK, I’d be surprised if (3) didn’t.

(4) Run opcontrol --version

Please post your results here. If the items above don’t reveal anything there are other things we can try.
thanks
–Don

In bash:

> which opcontrol
/usr/bin/opcontrol
> type -p opcontrol
/usr/bin/opcontrol
> sudo type -p opcontrol
Password: 
sudo: type: command not found

I’m not surprised by the type command as I believe my sysadmin just gave me passwordless access to opcontrol (per the example in the Tools Guide) and I have no “generic” sudo password. Is there a way to gain sudo passwordless access to bash internal functions? (Note: for other reasons, I need to use tcsh as my interactive shell.)
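
A note on the failed check: `type` is a shell builtin rather than a program on disk, so sudo has nothing to execute, whatever the sudoers rules say. A minimal sketch of two alternatives follows. resolve_tool is a hypothetical helper name, and the sudo lines are left as comments because they depend on the local sudoers setup.

```shell
# resolve_tool (hypothetical helper): print what the current shell would
# run for a command name, using the POSIX "command -v" builtin.
resolve_tool() {
    command -v "$1"
}

resolve_tool opcontrol || echo "opcontrol not found in PATH"

# With unrestricted sudo, the builtin can be wrapped in a shell:
#   sudo sh -c 'command -v opcontrol'
# With sudo limited to specific commands, listing the permitted commands
# shows the full paths sudo will accept:
#   sudo -l
```

If the path printed by resolve_tool matches the opcontrol path shown by `sudo -l`, both are very likely invoking the same binary.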

For (2) and (3):


> pgcollect -list-events | head -n 5
oprofile: available events for CPU type "Core 2"

See Intel Architecture Developer's Manual Volume 3, Appendix A and
Intel Architecture Optimization Reference Manual (730795-001)
> pgcollect -list-events | grep L2_LINES
L2_LINES_IN: (counter: all)
L2_LINES_OUT: (counter: all)
> pgcollect -list-events | grep MUL
MUL: (counter: all)
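
The same check can be scripted so that each event name is matched exactly; a plain `grep L2_LINES` also matches L2_LINES_OUT. This is only a sketch: has_event is a hypothetical helper, and the sample text is copied from the output above rather than read from a live system.

```shell
# has_event (hypothetical helper): succeed if the event list on stdin
# contains a line that starts with "<name>:".
has_event() {
    grep -q "^$1:"
}

# Sample lines copied from the output above. On a real system, pipe
# `pgcollect -list-events` into has_event instead.
events='L2_LINES_IN: (counter: all)
L2_LINES_OUT: (counter: all)
MUL: (counter: all)'

for name in L2_LINES_IN MUL; do
    if printf '%s\n' "$events" | has_event "$name"; then
        echo "$name: listed"
    else
        echo "$name: missing"
    fi
done
```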

(4) Run opcontrol --version

> opcontrol --version
opcontrol: oprofile 0.9.4 compiled on Jul  9 2009 12:07:00

If there is anything else you want me to try, let me know. Oh, and FYI:

> cat /etc/redhat-release 
Red Hat Enterprise Linux Client release 5.4 (Tikanga)

Matt,

A couple more things:

(1) what’s the output of ‘pgcpuid’?

(2) If you try running the following script, we might get a more detailed error message. I was able to reproduce the behavior you are seeing when I ran on an old Athlon processor that didn’t support the specified events, but not when I try it here on a Core 2.

Replace “./myprog” with the program you are profiling. I would expect opcontrol to print some sort of error message. If you could run it twice, switching which of the ‘sudo opcontrol --event’ lines is commented out, that would be great.

#!/bin/sh
set -x

sudo opcontrol --setup --no-vmlinux
sudo opcontrol --init
#sudo opcontrol --event=CPU_CLK_UNHALTED:299200:0x00:0:1 --event=L2_LINES_IN:299200:0x70:0:1
sudo opcontrol --event=CPU_CLK_UNHALTED:299200:0x00:0:1 --event=MUL:2000
sudo opcontrol --reset
sudo opcontrol --start --separate=thread,library

./myprog

sudo opcontrol --dump
sudo opcontrol --shutdown


thanks
–Don

Sorry for the late reply. Ahh…surprises at work. On to opcontrol…



> pgcpuid 
vendor id       : GenuineIntel
model name      : Intel(R) Xeon(R) CPU            5160  @ 3.00GHz
cpu family      : 6
model           : 0
stepping        : 6
processor count : 2
clflush size    : 8
flags           : acpi apic cflush cmov cplds cx8 cx16 de dtes ferr fpu fxsr
flags           : ht lm mca mce mmx monitor msr mtrr pae pat pdcm pge pse
flags           : pseg36 selfsnoop speedstep sep sse sse2 sse3 ssse3 syscall
flags           : tm tm2 tsc vme xtpr
type            : -tp core2-64



(2) Output of the script, run once with each of the ‘sudo opcontrol --event’ lines:


> ./pgiop.sh 
+ sudo opcontrol --setup --no-vmlinux
+ sudo opcontrol --init
+ sudo opcontrol --event=CPU_CLK_UNHALTED:299200:0x00:0:1 --event=MUL:2000
Couldn't allocate hardware counters for the selected events.
+ sudo opcontrol --reset
+ sudo opcontrol --start --separate=thread,library
Using 2.6+ OProfile kernel interface.
Using log file /var/lib/oprofile/samples/oprofiled.log
Daemon started.
Profiler running.
+ ./runsorad-vector32.exe
Average time in kernel:     2842.328 +/-       11.597 ms.
+ sudo opcontrol --dump
+ sudo opcontrol --shutdown
Stopping profiling.
Killing daemon.

> ./pgiop.sh
+ sudo opcontrol --setup --no-vmlinux
+ sudo opcontrol --init
+ sudo opcontrol --event=CPU_CLK_UNHALTED:299200:0x00:0:1 --event=L2_LINES_IN:299200:0x70:0:1
Couldn't allocate hardware counters for the selected events.
+ sudo opcontrol --reset
+ sudo opcontrol --start --separate=thread,library
Using 2.6+ OProfile kernel interface.
Using log file /var/lib/oprofile/samples/oprofiled.log
Daemon started.
Profiler running.
+ ./runsorad-vector32.exe
Average time in kernel:     2847.079 +/-       10.350 ms.
+ sudo opcontrol --dump
+ sudo opcontrol --shutdown
Stopping profiling.
Killing daemon.
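
In both runs the script carried on after the --event step printed its error, so the profile was collected with whatever event settings were already in place. A stricter variant could stop at the first failing step. The sketch below assumes opcontrol exits with a nonzero status when event setup fails, which this thread does not confirm. run_step is a hypothetical helper.

```shell
# run_step (hypothetical helper): echo a command, run it, and report a
# nonzero exit status so the caller can stop the script.
run_step() {
    echo "+ $*"
    "$@"
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "step failed (exit $status): $*" >&2
    fi
    return "$status"
}

# Intended use (needs sudo rights for opcontrol, so shown as comments):
#   run_step sudo opcontrol --setup --no-vmlinux || exit 1
#   run_step sudo opcontrol --event=CPU_CLK_UNHALTED:299200:0x00:0:1 --event=MUL:2000 || exit 1

# Harmless demonstration with standard utilities:
run_step true
run_step false || echo "stopping here"
```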

Unfortunately, looking at the opcontrol script, it doesn’t look like there’s a way to combine --verbose with the --event option.

Hi Matt,
It looks like there is some problem with your OProfile installation. The opcontrol commands are all pure OProfile. At this point it would probably be most productive for you to work with your IT folks to debug the OProfile installation on your system. Hopefully they will be able to resolve it and you’ll be up and running with pgcollect.
–Don