ARMv8 PMU is not available

Hello,
We are trying to profile with Nsight Systems on our Jetson AGX Orin Developer Kit, but we get the following warning:

WARNING: ARMv8 PMU is not available, enabling sampling-trigger=perf switch, software events will be used for CPU sampling.

CUDA drivers version: 11.4
Tegra release version: # R35 (release), REVISION: 4.1, GCID: 33958178, BOARD: t186ref, EABI: aarch64, DATE: Tue Aug 1 19:57:35 UTC 2023
Nsight Systems CLI: 2022.2.1

How can we enable the PMU?
If anyone can help me I would really appreciate it.

Hello,

Thanks for visiting the NVIDIA Developer forums! Your topic will be best served in the Jetson category.

I will move this post over for visibility.

Cheers,
Tom

I think this may have been transferred too soon.

I am pretty sure that CPU side PMUs have not been enabled in Nsight Systems for this platform, but I am going to defer to @pkovalenko for confirmation.

Hi @pavelg,

NSys uses hardware ARMv8 PMU events by default, if possible.
It looks like hardware PMU events are disabled in the Device Tree of your BSP/JetPack for some reason.

Can you please try the following commands:

cat /proc/device-tree/arm-pmu/{name,compatible,status}
ll /sys/bus/event_source/devices/

For example, when PMU is enabled:

cat /proc/device-tree/arm-pmu/{name,compatible,status}
arm-pmu
arm,armv8-pmuv3
okay                <--- `disabled` means that ARMv8 PMU is disabled in BSP image.
ll /sys/bus/event_source/devices/
...
lrwxrwxrwx 1 root root 0 Mar 15 15:04 armv8_pmuv3 -> ../../../devices/armv8_pmuv3/
...

List of hardware PMU events:

ll /sys/bus/event_source/devices/armv8_pmuv3/events/
-r--r--r-- 1 root root 4096 Mar 15 15:06 br_mis_pred
-r--r--r-- 1 root root 4096 Mar 15 15:06 br_mis_pred_retired
-r--r--r-- 1 root root 4096 Mar 15 15:06 br_pred
-r--r--r-- 1 root root 4096 Mar 15 15:06 br_retired
-r--r--r-- 1 root root 4096 Mar 15 15:06 bus_access
...
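The two checks above can be combined into one small helper. This is only a sketch: the function name `pmu_status` and its messages are made up for illustration, and the default paths are simply the ones quoted in this post. Both paths can be overridden, which makes it easy to try on a machine without a PMU.

```shell
#!/bin/sh
# Sketch (hypothetical helper, not part of NSys): summarize the PMU checks above.
# $1: device-tree node, default /proc/device-tree/arm-pmu
# $2: perf event source, default /sys/bus/event_source/devices/armv8_pmuv3
pmu_status() {
    dt_node="${1:-/proc/device-tree/arm-pmu}"
    pmu_dev="${2:-/sys/bus/event_source/devices/armv8_pmuv3}"

    if [ ! -r "$dt_node/status" ]; then
        echo "unknown: no status property under $dt_node"
        return 2
    fi

    # Device-tree string properties end with a NUL byte; strip it before comparing.
    status="$(tr -d '\0' < "$dt_node/status")"
    if [ "$status" != "okay" ]; then
        echo "disabled: device tree status is '$status'"
        return 1
    fi

    if [ ! -d "$pmu_dev/events" ]; then
        echo "missing: $pmu_dev/events not found"
        return 1
    fi

    # Count the hardware events exposed through sysfs.
    count="$(ls "$pmu_dev/events" | wc -l | tr -d ' ')"
    echo "enabled: $count hardware events exposed"
}
```

Calling `pmu_status` with no arguments on the target checks the default paths; a non-zero return code means one of the checks did not pass.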

Also, if both hardware and software events are available, you can switch between them with the SampleOnHardwareEvent config.ini option, something like this:

NSYS_CONFIG_DIRECTIVES='SampleOnHardwareEvent=<false|true>' nsys profile <app>

or for older NSys:

NSYS_CONFIG_DIRECTIVES='SampleOnHardwareEvent=<false|true>' nsys profile --sampling-trigger=perf <app>
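If you switch between the two modes often, a tiny wrapper can build the command line for you. The sketch below is hypothetical (the name `nsys_sampling_cmd` is invented here) and only prints the command rather than running it. It always passes --sampling-trigger=perf, as in the older-NSys form above, on the assumption that the flag is harmless where it is already the default.

```shell
#!/bin/sh
# Sketch (hypothetical helper): print the nsys command line for the chosen event type.
# Usage: nsys_sampling_cmd <hardware|software> <app and arguments...>
nsys_sampling_cmd() {
    mode="$1"
    shift
    case "$mode" in
        hardware) flag=true ;;
        software) flag=false ;;
        *)
            echo "usage: nsys_sampling_cmd <hardware|software> <app...>" >&2
            return 2
            ;;
    esac
    # Only the SampleOnHardwareEvent value differs between the two modes.
    echo "NSYS_CONFIG_DIRECTIVES='SampleOnHardwareEvent=$flag' nsys profile --sampling-trigger=perf $*"
}
```

For example, `nsys_sampling_cmd software ./app` prints the software-event variant; copy the printed line into the shell to run it.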

The report's Diagnostic Summary shows which events were used:
Software:

Event 'CPU Clock (sw)', with sampling period 2201600, used to trigger process-tree CPU IP sample collection.

Hardware:

Event 'CPU Cycles', with sampling period 2201600, used to trigger process-tree CPU IP sample collection.
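If you copy the Diagnostic Summary text into a file, a quick filter can tell the two cases apart. This is a rough sketch: the function name `classify_sampling_event` is made up, it assumes the text was pasted from the report by hand, and it only recognizes the two event names quoted above. Anything else is reported as unknown.

```shell
#!/bin/sh
# Sketch (hypothetical helper): classify pasted Diagnostic Summary text read from stdin.
classify_sampling_event() {
    # Keep only the first line that describes the CPU sampling trigger.
    line="$(grep 'used to trigger process-tree CPU IP sample collection' | head -n 1)"
    case "$line" in
        *"'CPU Clock (sw)'"*) echo "software" ;;
        *"'CPU Cycles'"*)     echo "hardware" ;;
        *)                    echo "unknown" ;;
    esac
}
```

Usage would look like `classify_sampling_event < summary.txt`, where summary.txt is whatever file you pasted the text into.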

Thanks,
Igor.

Hi @inabirushkin,

Thank you for the detailed response. Yes, in theory the PMU should work, but… Or am I missing something?

user@software-1:~$ cat /proc/device-tree/arm-pmu/name 
arm-pmu
user@software-1:~$ cat /proc/device-tree/arm-pmu/compatible 
arm,armv8-pmuv3
user@software-1:~$ cat /proc/device-tree/arm-pmu/status 
okay
user@software-1:~$ ll /sys/bus/event_source/devices/ | grep armv8
lrwxrwxrwx 1 root root 0 Nov 21  2023 armv8_pmuv3 -> ../../../devices/armv8_pmuv3/
user@software-1:~$ ll /sys/bus/event_source/devices/armv8_pmuv3/events/
total 0
drwxr-xr-x 2 root root    0 Sep  7 06:17 ./
drwxr-xr-x 6 root root    0 Nov 21  2023 ../
-r--r--r-- 1 root root 4096 Sep  9 12:41 br_mis_pred
-r--r--r-- 1 root root 4096 Sep  9 12:41 br_mis_pred_retired
-r--r--r-- 1 root root 4096 Sep  9 12:41 br_pred
-r--r--r-- 1 root root 4096 Sep  9 12:41 br_retired
...
-r--r--r-- 1 root root 4096 Sep  9 12:41 ttbr_write_retired

And then:

user@software-1: ~$ NSYS_CONFIG_DIRECTIVES='SampleOnHardwareEvent=true' /opt/nvidia/nsight-systems-cli/2022.2.1/bin/nsys profile --trace=cuda --output=/home/user/profiling/visionservice_4.qdrep vs.sh
WARNING: ARMv8 PMU is not available, enabling `sampling-trigger=perf` switch, software events will be used for CPU sampling.

What could be the cause of this behavior?

Thanks,
Pavel

Hi @pavelg,

Yes, PMU is OK.

user@software-1: ~$ NSYS_CONFIG_DIRECTIVES='SampleOnHardwareEvent=true' /opt/nvidia/nsight-systems-cli/2022.2.1/bin/nsys profile --trace=cuda --output=/home/user/profiling/visionservice_4.qdrep vs.sh
WARNING: ARMv8 PMU is not available, enabling `sampling-trigger=perf` switch, software events will be used for CPU sampling.

This warning message is not entirely accurate: the ARMv8 PMU is actually available. The message only means that sampling-trigger=perf mode is being enabled; you can still choose the event type, hardware (the default) or software.
Please check the Diagnostic Summary of visionservice_4.qdrep: were hardware or software events used?

Most likely --sampling-trigger=perf alone will be enough to get hardware PMU events. Please try:

nsys profile --sampling-trigger=perf ...

or

NSYS_CONFIG_DIRECTIVES='SampleOnHardwareEvent=true' nsys profile --sampling-trigger=perf ...

Please note that NSys 2022.2.1 is rather obsolete.
You can try an up-to-date NSys, for example 2024.5.XXX, where --sampling-trigger=perf is now the default.

Hardware PMU events (2024.5.XXX):

nsys profile ...

Software events (2024.5.XXX):

NSYS_CONFIG_DIRECTIVES='SampleOnHardwareEvent=false' nsys profile ...

Then please check the type of events in Diagnostic Summary.

Thanks,
Igor.