Performance monitoring unit (PMU)


#1

Hi,

I’d like to know whether the performance monitoring unit (PMU) is fully implemented in the HiKey’s ARM processor. More specifically, I’m looking for the PMU base address so that I can monitor event counters (architectural or micro-architectural).

In the Cortex-A53 technical reference manual, section 12.7 “Memory-mapped register summary” lists the memory-mapped registers of the PMU. However, only offsets are given with no base address. I inferred that the base address is implementation defined.

Thanks in advance,
Fernando


#2

I have a DragonBoard 410c rather than a HiKey board. However, on the DragonBoard 410c I have been able to access the PMU using the Linux perf command:

$ perf stat true
failed to read counter stalled-cycles-frontend
failed to read counter stalled-cycles-backend
failed to read counter branches

Performance counter stats for ‘true’:

      1.899635      task-clock (msec)         #    0.074 CPUs utilized      
             2      context-switches          #    0.001 M/sec              
             0      cpu-migrations            #    0.000 K/sec              
            27      page-faults               #    0.014 M/sec              
     2,256,371      cycles                    #    1.188 GHz                

<not supported>     stalled-cycles-frontend
<not supported>     stalled-cycles-backend
       836,772      instructions              #    0.37  insns per cycle
<not supported>     branches
        12,914      branch-misses             #    0.00% of all branches

   0.025573807 seconds time elapsed                                         

Since the HiKey board is also Cortex-A53 based, I would imagine it is likely to work the same way. It will mainly depend on whether the kernel has the perf infrastructure enabled. The kernel config should have:

CONFIG_HAVE_PERF_EVENTS=y
CONFIG_PERF_EVENTS=y
CONFIG_HW_PERF_EVENTS=y
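
A rough way to check a kernel configuration for these three options is sketched below. It assumes you either have the .config text at hand or that the running kernel exposes it at /proc/config.gz (which only exists when CONFIG_IKCONFIG_PROC is enabled); the helper name missing_perf_options is purely illustrative.

```python
import gzip
import os

# The three options listed in this post.
REQUIRED = ("CONFIG_HAVE_PERF_EVENTS", "CONFIG_PERF_EVENTS", "CONFIG_HW_PERF_EVENTS")


def missing_perf_options(config_text):
    """Return the required options that are not set to 'y' in config_text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Skip comments such as "# CONFIG_FOO is not set" and blank lines.
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value == "y":
            enabled.add(name)
    return [opt for opt in REQUIRED if opt not in enabled]


if __name__ == "__main__":
    path = "/proc/config.gz"  # assumption: present only with CONFIG_IKCONFIG_PROC
    if os.path.exists(path):
        with gzip.open(path, "rt") as f:
            print(missing_perf_options(f.read()) or "all perf options enabled")
    else:
        print(path, "not found; pass the .config text to missing_perf_options()")
```

An empty list means all three options are enabled; anything else names the options to turn on before rebuilding.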

When the kernel boots, you should see something like the following in the console log output:

hw perfevents: enabled with arm/armv8-pmuv3 PMU driver, 7 counters available
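
If you want to check for that message from a script, something like the following could work. The pattern is modeled only on the single line quoted here, so treat it as an assumption; other kernel versions may word the message differently, and find_pmu is a made-up helper name.

```python
import re

# Pattern based on the one boot message quoted in this post (assumption:
# other kernel versions may phrase it differently).
PMU_LINE = re.compile(
    r"hw perfevents: enabled with (\S+) PMU driver, (\d+) counters available"
)


def find_pmu(log_text):
    """Return (driver_name, counter_count) from boot log text, or None if absent."""
    match = PMU_LINE.search(log_text)
    if match is None:
        return None
    return match.group(1), int(match.group(2))
```

Feeding it the output of dmesg should return the driver name and counter count, or None if the PMU driver never registered.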


#3

Thanks for your answer.

The problem is that the 3.18.0 kernel released with the HiKey board does not seem to support ‘hw perfevents’. So I tried to implement a kernel module that enables PMU management from userspace, along with APIs to use in my code.

Given that I only want to count the number of committed instructions (independent from core implementation), I’m using the ARM Foundation model and the ‘perf’ command line.

PS: my PMU module and APIs work in the ARM Foundation model, but I got a lot of instability on the HiKey board (i.e., unusable counters). I don’t know the reason.

Regards,
Fernando A. Endo


#4

The kernels I have been using on the DragonBoard 410c are much newer, 4.2.0 and 4.3.0+. You should verify that the HiKey kernel was compiled with perf support.

In the past with 32-bit arm machines the perf support didn’t work because the device tree was missing information about the performance monitoring hardware (https://bugzilla.redhat.com/show_bug.cgi?id=741325).

For the DragonBoard 410c there is the following in arch/arm64/boot/dts/qcom/msm8916.dtsi (and similar descriptions in other arm64 device tree files):

    cpu-pmu {                                                               
            compatible = "arm,armv8-pmuv3";                                 
            interrupts = <GIC_PPI 7 GIC_CPU_MASK_SIMPLE(4)>;
    };

There doesn’t seem to be a similar description in the HiSilicon device trees. Do the HiKey boards use the device tree?
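
On a running board, one way to see whether the booted device tree contains such a node is to walk /proc/device-tree and look for the compatible string. This is only a sketch: it assumes the kernel was booted with a device tree (otherwise that directory is absent), and find_compatible_nodes is an illustrative name.

```python
import os


def find_compatible_nodes(root, wanted="arm,armv8-pmuv3"):
    """Return node directories under root whose 'compatible' property lists wanted."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if "compatible" not in filenames:
            continue
        with open(os.path.join(dirpath, "compatible"), "rb") as f:
            # The property is stored as a list of NUL-terminated strings.
            entries = f.read().split(b"\0")
        if wanted.encode() in entries:
            hits.append(dirpath)
    return sorted(hits)


if __name__ == "__main__":
    root = "/proc/device-tree"  # assumption: kernel booted with a device tree
    if os.path.isdir(root):
        print(find_compatible_nodes(root) or "no PMU node found")
    else:
        print(root, "not present; the kernel may not have booted from a device tree")
```

An empty result on the HiKey would be consistent with the PMU node missing from its device tree.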


#5

We merged the support for PMU events and perf some time ago. At the time, we did some initial validation but please feel free to report any issues you might encounter.


#6

By the way, accessing the PMU counters from user space via mmap is not recommended (or supported, IIRC). If that is what you want to do, I believe you’ll have to do some kernel work.

I’d recommend using the usual perf infrastructure.
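
For the instruction-count use case mentioned earlier in the thread, a small wrapper around perf stat may be enough. The sketch below assumes perf is installed and that its report looks like the output pasted in post #2 (integer counts with comma separators, or a ‘<not supported>’ marker); the function names are made up for illustration.

```python
import re
import subprocess

# Matches report lines such as "836,772 instructions" or "<not supported> branches".
# Lines with fractional values (task-clock, elapsed time) are deliberately skipped.
COUNT_LINE = re.compile(r"^\s*(<not supported>|<not counted>|[\d,]+)\s+(\S+)")


def parse_perf_stat(text):
    """Map event name -> integer count, or None when perf could not count it."""
    counts = {}
    for line in text.splitlines():
        match = COUNT_LINE.match(line)
        if match:
            raw, event = match.groups()
            counts[event] = None if raw.startswith("<") else int(raw.replace(",", ""))
    return counts


def count_instructions(cmd):
    """Run cmd under `perf stat -e instructions`; return the count or None."""
    # perf stat writes its report to stderr by default.
    proc = subprocess.run(
        ["perf", "stat", "-e", "instructions", "--"] + list(cmd),
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return parse_perf_stat(proc.stderr).get("instructions")
```

For example, count_instructions(["true"]) would return the instruction count for one run of true, or None if the event is not supported on that kernel.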