Some perf tests fail on Hikey960 with the 4.9 kernel

I noticed some perf test failures on Hikey960 running kernel 4.9.20.

  • After compiling perf and placing it in the target rootfs, I ran the following command on the Hikey960 target:
    {{{
    / # perf test

    17: Test breakpoint overflow signal handler : FAILED!
    18: Test breakpoint overflow sampling : FAILED!

    }}}
  • I then re-ran the failing tests with the -v (verbose) option:
  • 1st case:
    {{{
    / # perf test -v 17
    17: Test breakpoint overflow signal handler :
    --- start ---
    test child forked, pid 1493
    count1 1, count2 11, count3 0, overflow 1, overflows_2 11
    failed: wrong overflow hit
    failed: wrong overflow_2 hit
    failed: wrong count for bp2
    failed: wrong count for bp3
    test child finished with -1
    ---- end ----
    Test breakpoint overflow signal handler: FAILED!
    / #
    }}}
  • In the above case, when I analysed the code, the expected values are:
    {{{

    return count1 == 1 && overflows == 3 && count2 == 3 && overflows_2 == 3 && count3 == 2 ?
    TEST_OK : TEST_FAIL;
    }}}

2nd case:
{{{
/ # perf test -v 18
18: Test breakpoint overflow sampling :
--- start ---
test child forked, pid 1495
count 10101, overflow 101
Wrong number of executions 10101 != 10000
Wrong number of overflows 101 != 100
test child finished with -1
---- end ----
Test breakpoint overflow sampling: FAILED!
/ #
}}}

  • In the above case, the expected values as per ‘tools/perf/tests/bp_signal_overflow.c’ are:
    {{{

    #define EXECUTIONS 10000
    #define THRESHOLD 100
    }}}

3rd case:

  • I also cloned http://github.com/deater/perf_event_tests and compiled its “branches” test, but that test failed as well.
  • As per ‘perf_event_tests/validation/branches.c’, the counter value should be 1500000, but on the Hikey960 I get “100340183”.

I analysed the code flow:
{{{
(tools/perf/tests/builtin-test.c) → test__bp_signal (tools/perf/tests/bp_signal.c)
→ bp_event → sys_perf_event_open (kernel/events/core.c)
→ perf_event_alloc → perf_init_event → pmu->event_init →
hw_breakpoint_event_init (kernel/events/hw_breakpoint.c) → register_perf_hw_breakpoint
→ validate_hw_breakpoint → arch_validate_hwbkpt_settings → monitor_mode_enabled
→ arch_build_bp_info (arch/arm64/kernel/hw_breakpoint.c)
}}}
I am not able to find where exactly the values are being incremented.

Could you please explain why the PMU counter and overflow values do not match the expected ones, or suggest how I can move forward to get the values the code expects?

Thanks,
Murali

Hi @muralidhara_mk,

Thanks for reporting this issue. I tried kernel 4.16 on my side, and I can confirm that tests 17 and 18 pass on Hikey960:

root@linaro-developer:/mnt/arm64/linux-mainline/tools/perf/python# /mnt/arm64/linux-mainline/tools/perf/perf test -v 18
18: 'import perf' in python                    :
--- start ---
test child forked, pid 6173
test child finished with 0
---- end ----
'import perf' in python: Ok
root@linaro-developer:/mnt/arm64/linux-mainline/tools/perf/python# /mnt/arm64/linux-mainline/tools/perf/perf test -v 17
17: Match and link multiple hists              :
--- start ---
test child forked, pid 6178
test child finished with 0
---- end ----
Match and link multiple hists: Ok

I haven’t looked into this in detail, but I think the latest Linux kernel code base is a good reference to compare against the old 4.9 kernel. The changes in the perf tool, the ARM PMU driver, or the Hi3660 PMU DT binding may give some hints.

Hi @leo-yan,

Thanks for looking into the issue mentioned above.

However, I observed that in the 4.16 kernel, perf tests 17 and 18 are different from the ones I described in my earlier comment. They are:
{{{
17: Match and link multiple hists :
18: 'import perf' in python
}}}

  • The above tests also pass on the 4.9 kernel.

In the 4.9 kernel, perf tests 17 and 18 are:
{{{
17: Test breakpoint overflow signal handler : FAILED!
18: Test breakpoint overflow sampling : FAILED!

}}}

  • I also confirmed the same failures on the 4.14 kernel.

Could you please check the above cases?

Thanks,
Murali

Thanks for the correction, @muralidhara_mk.

I retested on my two boards, Hikey620 and Hikey960, and both report failures.

Hikey620 logs:

root@linaro-developer:/mnt/linux-kernel/linux/tools/perf# ./perf test -v 19
19: Breakpoint overflow signal handler :
--- start ---
test child forked, pid 2221
count1 1, count2 11, count3 0, overflow 1, overflows_2 11
failed: wrong overflow hit
failed: wrong overflow_2 hit
failed: wrong count for bp2
failed: wrong count for bp3
test child finished with -1
---- end ----
Breakpoint overflow signal handler: FAILED!
root@linaro-developer:/mnt/linux-kernel/linux/tools/perf# ./perf test -v 20
20: Breakpoint overflow sampling :
--- start ---
test child forked, pid 2223
count 10101, overflow 101
Wrong number of executions 10101 != 10000
Wrong number of overflows 101 != 100
test child finished with -1
---- end ----
Breakpoint overflow sampling: FAILED!

Hikey960 logs:

root@linaro-developer:/mnt/arm64/linux-mainline/tools/perf# ./perf test -v 19
19: Breakpoint overflow signal handler :
--- start ---
test child forked, pid 19972
count1 1, count2 11, count3 0, overflow 1, overflows_2 11
failed: wrong overflow hit
failed: wrong overflow_2 hit
failed: wrong count for bp2
failed: wrong count for bp3
test child finished with -1
---- end ----
Breakpoint overflow signal handler: FAILED!
root@linaro-developer:/mnt/arm64/linux-mainline/tools/perf# ./perf test -v 20
20: Breakpoint overflow sampling :
--- start ---
test child forked, pid 19974
count 10101, overflow 101
Wrong number of executions 10101 != 10000
Wrong number of overflows 101 != 100
test child finished with -1
---- end ----
Breakpoint overflow sampling: FAILED!

Thanks for reporting this; I think it is a very interesting topic. I went through the test case only roughly, and I suspect the failure is related to ARM hardware breakpoints and is a common issue for the arm64 kernel. I will also look into it in my free time.

In the meantime, I’d suggest sending this question to the mailing lists linux-arm-kernel@lists.infradead.org and linux-kernel@vger.kernel.org so that the issue gets a wider review.

Hi @leo-yan,

Thanks for confirming the issue on your end on both the Hikey620 and Hikey960 targets.

Thanks for your input. I will post the same issue to the mailing lists and Bugzilla.

Thanks,
Murali