I see, I (again) forgot that -mfpu=neon isn’t needed with aarch64 since NEON is the GCC default there anyway.
I still find the performance numbers somewhat underwhelming, but it doesn’t look worth the effort to dig deeper (at least ‘remotely’). If I were sitting in front of the machine, I would test further whether some budget cooling strategies are active: testing with different core counts while throttling is active (no fan, maybe even no heatsink).
But I think I’d better spend the time playing with server grade SoCs instead (Armada 8040 comes to mind). At least it was interesting! Thanks for providing the numbers!
Edit: It would also be interesting to explore the specific throttling behaviour, especially when an ‘MCU firmware’ is also involved. I played around with a rather beefy octa-core A53 device last year. Throttling there starts at 85°C and is implemented rather crudely: it does not downclock dynamically but always drops straight to a lower cpufreq (800 MHz here). So when testing I discovered that I got way better results limiting max cpufreq to 1300 MHz than allowing the maximum 1400 MHz. With the higher limit, performance dropped much lower because the CPU cores were constantly jumping between 1400 and 800 MHz instead of the throttling trying 1.3 and 1.2 GHz first. Maybe somewhere here: https://forum.armbian.com/index.php?/topic/1285-nanopi-m3-cheap-8-core-35/&do=findComment&comment=13803 (just did a quick forum search).
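To put rough numbers on why the 1300 MHz cap can win, here is a back-of-the-envelope sketch in Python. It assumes the time-averaged clock is a fair proxy for performance and ignores any cost of the frequency switches themselves. The function names and the example duty fractions are made up for illustration, they are not measurements from that device.

```python
def average_mhz(high_mhz, low_mhz, fraction_at_high):
    """Time-averaged clock when a core bounces between two frequencies."""
    return fraction_at_high * high_mhz + (1.0 - fraction_at_high) * low_mhz


def break_even_fraction(high_mhz, low_mhz, capped_mhz):
    """Share of time that must be spent at high_mhz to match a steady capped_mhz."""
    return (capped_mhz - low_mhz) / (high_mhz - low_mhz)


if __name__ == "__main__":
    # Frequencies from the post: 1400 MHz max, 800 MHz throttle step, 1300 MHz cap.
    needed = break_even_fraction(1400, 800, 1300)
    print(f"break-even share of time at 1400 MHz: {needed:.3f}")

    # Hypothetical duty fractions, only to show the trend.
    for share in (0.5, 0.7, 0.9):
        print(f"{share:.0%} at 1400 MHz -> {average_mhz(1400, 800, share):.0f} MHz average")
```

Under these assumptions the cores would have to stay at 1400 MHz for more than 5/6 of the time just to match a steady 1300 MHz, so frequent drops to 800 MHz quickly end up slower than the capped setting.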