Comparison between Hikey960 and Hikey970


Recently I have been comparing GPU performance between the Hikey960 and the Hikey970.
The results look strange to me. Any ideas?

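| Model | TinkerBoard (ms) | Hikey960 (ms) | Hikey970 (ms) |
|---|---|---|---|
| alexnet | 135 | 31 | 31 |
| googlenet | 224 | 46 | 52 |
| inception_v3 | 802 | 177 | 195 |
| inception_v4 | n/a | 542 | 525 |
| lenet | 8 | 9 | 14 |
| mobilenet | 124 | 21 | 32 |
| mobilenet_qasymm8 | 121 | 26 | 178 |
| resnet50 | 546 | 96 | 112 |
| resnext50 | 907 | 221 | 252 |
| squeezenet | 132 | 23 | 159 |
| squeezenet_v1_1 | 70 | 19 | 110 |
| vgg16 | 887 | 180 | 188 |
| vgg19 | 1069 | 244 | 232 |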

All test cases are the examples of the Arm Compute Library. Every value is in msec and is the minimum over 100 executions.
Hikey960 and Hikey970 ran on AOSP, TinkerBoard ran on TinkerOS using OpenCL.
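For anyone who wants to reproduce the "minimum over 100 executions" methodology, here is a minimal sketch of that kind of timing loop. It is only an illustration and not the harness used for the numbers above: `min_latency_ms` and `dummy_workload` are hypothetical names, and the workload is a stand-in for one inference run.

```python
import time


def min_latency_ms(run_once, repeats=100):
    """Return the fastest of `repeats` timed calls to `run_once`, in milliseconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        run_once()
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        best = min(best, elapsed_ms)
    return best


def dummy_workload():
    # Hypothetical stand-in for a single inference; replace with the real call.
    sum(i * i for i in range(10_000))


if __name__ == "__main__":
    print(f"min over 100 runs: {min_latency_ms(dummy_workload):.3f} ms")
```

Taking the minimum rather than the mean filters out one-off delays such as first-run kernel compilation or scheduler noise, so it reflects the best case the hardware and driver can reach.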

It is understandable to me that the Mali-G71 in the Hikey960 is much faster than the Mali-T764 in the TinkerBoard. But the Mali-G72 in the Hikey970 is equal to or slower than the Mali-G71 in the Hikey960.
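To make the gap easier to see, the Hikey970/Hikey960 ratio can be computed per model. The sketch below uses the numbers from the table as posted; the helper names `slowdown` and `outliers` and the 2.0 threshold are my own choices for illustration.

```python
# (Hikey960 ms, Hikey970 ms), copied from the table in this post.
RESULTS_MS = {
    "alexnet": (31, 31),
    "googlenet": (46, 52),
    "inception_v3": (177, 195),
    "inception_v4": (542, 525),
    "lenet": (9, 14),
    "mobilenet": (21, 32),
    "mobilenet_qasymm8": (26, 178),
    "resnet50": (96, 112),
    "resnext50": (221, 252),
    "squeezenet": (23, 159),
    "squeezenet_v1_1": (19, 110),
    "vgg16": (180, 188),
    "vgg19": (244, 232),
}


def slowdown(results):
    """Map each model to Hikey970 time divided by Hikey960 time (>1 means slower)."""
    return {name: h970 / h960 for name, (h960, h970) in results.items()}


def outliers(results, threshold=2.0):
    """Return the models whose slowdown ratio is at least `threshold`, sorted by name."""
    return sorted(name for name, r in slowdown(results).items() if r >= threshold)


if __name__ == "__main__":
    ranked = sorted(slowdown(RESULTS_MS).items(), key=lambda kv: -kv[1])
    for name, ratio in ranked:
        print(f"{name:20s} {ratio:5.2f}x")
```

With a threshold of 2.0, only mobilenet_qasymm8, squeezenet and squeezenet_v1_1 are flagged; every other model stays within roughly 0.95x to 1.6x.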

It may just be a matter of software tuning.
I expected the Mali-G72 to be faster because of the number of shader cores.
Does anybody have an idea where these results come from?


Hi no_maddo, I have found a significant performance regression (and its cause!) with ArmCL v18.05 compared to ArmCL v18.03, especially for 1x1 convolutions used e.g. in SqueezeNet and MobileNets. As your results show a big slowdown between HiKey960 and HiKey970 especially for these models, I am wondering if it's due to the different library versions deployed on your platforms?


Thanks. In my case, I only used v18.05.


@no_maddo Could you please share how you built the Compute Library on the Hikey 970? Have you tried any TensorFlow models?