Performance monitoring unit (PMU)


I’d like to know whether the performance monitoring unit (PMU) is fully implemented in the HiKey’s ARM processor. More specifically, I’m looking for the PMU base address so that I can read the event counters (architectural or micro-architectural).

In the Cortex-A53 technical reference manual, section 12.7 “Memory-mapped register summary” lists the memory-mapped registers of the PMU. However, only offsets are given with no base address. I inferred that the base address is implementation defined.

Thanks in advance,

I have a DragonBoard 410c rather than a HiKey board. However, on the DragonBoard 410c I have been able to access the PMU using the Linux perf command:

$ perf stat true
failed to read counter stalled-cycles-frontend
failed to read counter stalled-cycles-backend
failed to read counter branches

Performance counter stats for 'true':

      1.899635      task-clock (msec)         #    0.074 CPUs utilized
             2      context-switches          #    0.001 M/sec
             0      cpu-migrations            #    0.000 K/sec
            27      page-faults               #    0.014 M/sec
     2,256,371      cycles                    #    1.188 GHz
<not supported>      stalled-cycles-frontend
<not supported>      stalled-cycles-backend
       836,772      instructions              #    0.37  insns per cycle
<not supported>      branches
        12,914      branch-misses             #    0.00% of all branches

   0.025573807 seconds time elapsed

Since the HiKey board is also Cortex-A53 based, I would imagine it is likely to work the same way. It will mainly depend on whether the kernel has the perf infrastructure enabled. The kernel config should have:
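
As a rough sketch, assuming the options meant here are CONFIG_PERF_EVENTS and CONFIG_HW_PERF_EVENTS (the usual pair for hardware counters; that is an assumption, not something stated in this thread), a config file could be checked like this. The helper name check_perf_config is made up for illustration:

```shell
# Report whether the perf-related options are built in for a given kernel config file.
# Assumption: CONFIG_PERF_EVENTS and CONFIG_HW_PERF_EVENTS are the options of interest.
check_perf_config() {
    cfg="$1"
    status=0
    for opt in CONFIG_PERF_EVENTS CONFIG_HW_PERF_EVENTS; do
        if grep -q "^${opt}=y" "$cfg"; then
            echo "$opt: enabled"
        else
            echo "$opt: missing"
            status=1
        fi
    done
    # Non-zero exit status if any option is absent or disabled.
    return "$status"
}
```

On a running board, something like `zcat /proc/config.gz > /tmp/config; check_perf_config /tmp/config` should work if the kernel exposes its config (CONFIG_IKCONFIG_PROC); otherwise point it at the .config from the kernel build tree.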


When the kernel boots, you should see something like the following in the console log output:

hw perfevents: enabled with arm/armv8-pmuv3 PMU driver, 7 counters available
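
To look for that line on a running system, a small filter over the kernel log is enough. This is only a sketch that assumes the message has exactly the wording shown above; the function name pmu_from_log is hypothetical:

```shell
# Read kernel log text on stdin and print "<driver> <number of counters>"
# for the "hw perfevents" line, or nothing if the line is not present.
pmu_from_log() {
    sed -n 's/.*hw perfevents: enabled with \(.*\) PMU driver, \([0-9][0-9]*\) counters available.*/\1 \2/p'
}
```

Typical use would be `dmesg | pmu_from_log`. Empty output suggests the PMU driver did not probe, or that the boot messages have already rotated out of the log buffer.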

Thanks for your answer.

The problem is that the 3.18.0 kernel released with the HiKey board does not seem to support ‘hw perfevents’. I therefore tried writing a kernel module that enables PMU management from userspace, plus an API to use in my code.

Given that I only want to count the number of committed instructions (which is independent of the core implementation), I’m using the ARM Foundation model and the ‘perf’ command line.

PS: my PMU module and API work on the ARM Foundation model, but on the HiKey board I see a lot of instability (i.e., unusable counters). I don’t know the reason.

Fernando A. Endo

The kernels I have been using on the DragonBoard 410c are much newer: 4.2.0 and 4.3.0+. You should verify that the HiKey kernel was compiled with perf support.

In the past, on 32-bit ARM machines, perf support didn’t work because the device tree was missing information about the performance monitoring hardware (bug 741325 – ARM fc14 kernels does not provide hardware perf counter support).

For the DragonBoard 410c there is the following in arch/arm64/boot/dts/qcom/msm8916.dtsi (and similar descriptions in other arm64 device tree files):

    cpu-pmu {
            compatible = "arm,armv8-pmuv3";
            interrupts = <GIC_PPI 7 GIC_CPU_MASK_SIMPLE(4)>;
    };

There doesn’t seem to be a similar description in the HiSilicon device trees. Does the HiKey board use a device tree?
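
One way to check on the board itself is to search the live device tree for a node with that compatible string. A minimal sketch, assuming the kernel exports the tree under /proc/device-tree; the function name find_pmu_nodes is hypothetical:

```shell
# List "compatible" property files that mention the ARMv8 PMU binding.
# Assumption: the live tree is visible under /proc/device-tree (default root).
# -H makes find follow the root if it is a symlink, as it is on newer kernels.
find_pmu_nodes() {
    root="${1:-/proc/device-tree}"
    find -H "$root" -name compatible -type f \
        -exec grep -l 'arm,armv8-pmuv3' {} + 2>/dev/null
}
```

If `find_pmu_nodes` prints nothing, the booted device tree has no PMU node with that compatible string, which would fit with the missing ‘hw perfevents’ line at boot.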

We merged the support for PMU events and perf some time ago. At the time, we did some initial validation but please feel free to report any issues you might encounter.

By the way, accessing the PMU counters from user space via mmap is not recommended (or supported, IIRC). If that is what you want to do, I believe you’ll have to do some kernel work.

I’d recommend using the usual perf infrastructure.
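
For the committed-instruction count mentioned earlier in the thread, the number can be pulled out of the normal `perf stat` report with a small filter. This is just a sketch that assumes the human-readable layout shown in the output above; the helper name perf_count is made up:

```shell
# Read `perf stat` report text on stdin and print the raw count for one event.
# perf prints thousands separators (e.g. 836,772), which are stripped here.
# Lines such as "<not supported> branches" do not match and print nothing.
perf_count() {
    awk -v ev="$1" '$2 == ev { gsub(/,/, "", $1); print $1 }'
}
```

Since perf stat writes its report to stderr, usage would look like `perf stat -e instructions true 2>&1 | perf_count instructions`. Using `-e instructions:u` should restrict the count to user space, in which case the event name passed to the filter needs to match whatever perf prints in the report.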