I am a bit surprised: I was counting on the kernel not controlling the PMU, so that you, as a user, could access its registers from user space instead of going through the recommended route (i.e. perf).
Could you validate perf in your #36 release?
If you are cross-compiling your kernel, you could just do:
server$ cd $KERNEL_TREE
server$ make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- tools/perf
then copy tools/perf/perf to your db410c.
Once done please execute as follows:
- terminal_1: keep perf busy using a high count
db410c$ perf stat -e cycles dd if=/dev/zero of=/dev/null count=10000000
- terminal_2: check whether perf is receiving interrupts
db410c$ cat /proc/interrupts | grep arm-pmu
If perf is indeed working on #36 (which I doubt, since I would expect perf to interfere with your module), you should see an “arm-pmu” interrupt registered in terminal_2 while terminal_1 is busy.
You could then reduce the count given in terminal_1 to get valid output more quickly:
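If you would rather not eyeball two terminals, a small helper along these lines could do the check for you. This is only a sketch: `pmu_irq_total` is a name I made up, and it assumes the usual /proc/interrupts layout (a header with one column per CPU, then the per-CPU counts right after the IRQ number).

```shell
#!/bin/sh
# Hypothetical helper: read /proc/interrupts-style text on stdin and print the
# sum of the per-CPU counts of every line that mentions "arm-pmu".
# Returns non-zero when no such line exists (i.e. no PMU interrupt registered).
pmu_irq_total() {
    awk '
        NR == 1 { ncpu = NF; next }      # header line: one column per CPU
        /arm-pmu/ {
            found = 1
            # field 1 is the IRQ number, fields 2..ncpu+1 are the counts
            for (i = 2; i <= ncpu + 1; i++) total += $i
        }
        END { print total + 0; exit (found ? 0 : 1) }
    '
}

# Usage on the board while terminal_1 is busy (left commented out on purpose):
#   before=$(pmu_irq_total < /proc/interrupts) || echo "no arm-pmu line registered"
#   sleep 5
#   after=$(pmu_irq_total < /proc/interrupts)
#   echo "arm-pmu interrupts in 5 s: $((after - before))"
```

The presence of the line is the main thing to look for; the count is secondary, since PMU interrupts fire on counter overflow and may not tick very often while perf is only counting.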
root@linaro-alip:/home/linaro# perf stat -e cycles dd if=/dev/zero of=/dev/null count=10000
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 0.0816867 s, 62.7 MB/s
Performance counter stats for 'dd if=/dev/zero of=/dev/null count=10000':
17987875 cycles
0.092714040 seconds time elapsed
In any case, maybe you should try recompiling the #144 kernel without PMU support (for instance, remove the PMU device tree entry or disable PMU support in the kernel config); I would then expect userspace to have complete access to the hardware and not collide with the kernel's PMU initialization.
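To confirm which way a given kernel was built, something like the following could help. It is a sketch under assumptions: I am assuming `CONFIG_ARM_PMU` and `CONFIG_HW_PERF_EVENTS` are the relevant symbols for your kernel version (please check the Kconfig files in your tree), and `pmu_config_state` is just an illustrative name.

```shell
#!/bin/sh
# Hypothetical helper: read a kernel config on stdin and report the state of
# the PMU-related options. The two symbol names are assumptions; verify them
# against the Kconfig files of the kernel you are building.
pmu_config_state() {
    awk '
        /^CONFIG_(ARM_PMU|HW_PERF_EVENTS)=/ {
            split($0, kv, "=")           # kv[1] = symbol, kv[2] = value
            print kv[1] ": " kv[2]
            seen[kv[1]] = 1
        }
        END {
            # options that are disabled never appear as SYMBOL=value lines
            if (!("CONFIG_ARM_PMU" in seen)) print "CONFIG_ARM_PMU: not set"
            if (!("CONFIG_HW_PERF_EVENTS" in seen)) print "CONFIG_HW_PERF_EVENTS: not set"
        }
    '
}

# Usage (left commented out on purpose):
#   server$ pmu_config_state < "$KERNEL_TREE/.config"      # in-tree build assumed
#   db410c$ zcat /proc/config.gz | pmu_config_state        # needs CONFIG_IKCONFIG_PROC
```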
Either way, my advice would be to drop your current performance measurement system, write some macros/functions around trace/perf (a first-class Linux subsystem), and use those to access the PMU registers that you need.
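As a first step at the command-line level, a thin wrapper around perf stat might look like the sketch below. The function names `perf_csv_value` and `count_cycles` are made up for illustration, and the parsing assumes the CSV layout that `perf stat -x,` produces in the perf versions I am familiar with (counter value in the first column, event name in the third); please check the output of your own perf build.

```shell
#!/bin/sh
# Hypothetical helper: pick the counter value for one event out of
# `perf stat -x,` CSV output read from stdin.
# Assumed column layout: value,unit,event,... (verify with your perf build).
perf_csv_value() {
    awk -F, -v ev="$1" '
        $3 == ev { print $1; found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Hypothetical wrapper: run a command under perf stat and print only the
# number of cycles it consumed. The command's own output is discarded.
count_cycles() {
    tmp=$(mktemp) || return 1
    # -x, selects CSV output, -o sends the perf report to a file
    perf stat -x, -e cycles -o "$tmp" -- "$@" >/dev/null 2>&1
    perf_csv_value cycles < "$tmp"
    rc=$?
    rm -f "$tmp"
    return $rc
}

# Usage on the board (left commented out on purpose):
#   db410c$ count_cycles dd if=/dev/zero of=/dev/null count=10000
```

Keeping the parsing separate from the perf invocation means the same helper can be reused for other events by changing the `-e` argument and the name passed to `perf_csv_value`.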