Frequency switching stability issues on Q820

Hi,

With this change the cpufreq-bench issue is resolved, and I didn’t observe any crash while running it.

But the boot hang issue is still there. The board sometimes gets stuck at boot time and goes into RAM dump mode.

Thanks,
Hiren

Hi @Loic

It seems that policy->cur is set to 614400 kHz for CPU0 and 19200 kHz for CPU2. Both values are outside the frequency table known to the cpufreq core, so the CPU might become unstable, which could explain the boot freeze.

I am trying to set policy->cur to a frequency that is available in the frequency table. From the code I have looked at, it seems that this frequency comes from the perfcl_smux, pwrcl_smux or perfcl_pmux, pwrcl_pmux clock sources.

Any pointers on how I can change this frequency, or on whether it is appropriate to change it at all, would be helpful.
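
To see which policies are currently off-table, something like the following could be run on the board. It is only a sketch: it assumes the standard cpufreq sysfs attributes (scaling_cur_freq and scaling_available_frequencies) are exposed under /sys/devices/system/cpu/cpufreq/policy*, which depends on the cpufreq driver, and the helper names are made up for illustration.

```python
from pathlib import Path

CPUFREQ_ROOT = Path("/sys/devices/system/cpu/cpufreq")  # standard sysfs location


def off_table(cur_khz, table_khz):
    """Return True if cur_khz is not one of the table frequencies."""
    return cur_khz not in set(table_khz)


def nearest_in_table(cur_khz, table_khz):
    """Return the table frequency closest to cur_khz (hypothetical helper)."""
    return min(table_khz, key=lambda f: abs(f - cur_khz))


def read_policy(policy_dir):
    """Read (current kHz, list of available kHz) for one policy directory."""
    cur = int((policy_dir / "scaling_cur_freq").read_text())
    table = [int(f) for f in
             (policy_dir / "scaling_available_frequencies").read_text().split()]
    return cur, table


def report(root=CPUFREQ_ROOT):
    """Print every policy whose current frequency is outside its table."""
    for policy_dir in sorted(root.glob("policy*")):
        try:
            cur, table = read_policy(policy_dir)
        except (OSError, ValueError):
            continue  # attribute missing or unreadable on this kernel
        if table and off_table(cur, table):
            print(f"{policy_dir.name}: {cur} kHz not in table, "
                  f"nearest is {nearest_in_table(cur, table)} kHz")
```

Calling report() early after boot would list any policy still sitting at a bootloader-set rate. This only reads sysfs; it does not change any frequency.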

Thanks,
Hiren

Good!

These freqs are set by the bootloader, but cpufreq is supposed to adjust the frequencies with the provided table.

Hi @Loic,

Is QCOM CPR support implemented for Q820?

I have checked the 4.14.96 kernel source and found that QCOM CPR is supported for Q410 (drivers/power/avs/qcom-cpr.c), but it is not implemented for Q820.

So is there any chance that adding QCOM CPR support for Q820 would improve the boot freeze issue?

Thanks,
Hiren

I appear to have gotten the Android kernel to boot reliably with a pull up to 4.14.165 from the Android common kernel. The issue I was having appeared to be tied to ufshcd-qcom.

I’ve done about 20 successful boots and have been unable so far to reproduce either the freeze on clock or the freeze on ufs initialization. I was getting a freeze up rate of about 75% (i.e., 3 failed boots to one successful), and of those, about 1/3 of them were on the frequency.
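
As a rough sanity check on those numbers, here is a small sketch. It assumes each boot is independent, which is a simplification, and the variable names are only illustrative.

```python
# Figures quoted above: about 75% of boots froze, about 1/3 of those freezes
# were on the frequency, and 20 boots in a row have now succeeded.
failure_rate = 0.75
freq_share_of_failures = 1 / 3
clean_boots = 20

# Share of all boots that froze on the frequency, and on everything else.
freq_freeze_rate = failure_rate * freq_share_of_failures
other_freeze_rate = failure_rate - freq_freeze_rate

# Chance of 20 clean boots if nothing had changed (independence assumed).
p_clean_if_unfixed = (1 - failure_rate) ** clean_boots

# Chance of 20 clean boots if only the frequency freeze were still present.
p_clean_if_freq_left = (1 - freq_freeze_rate) ** clean_boots

print(f"frequency freezes: {freq_freeze_rate:.0%} of boots")
print(f"other freezes:     {other_freeze_rate:.0%} of boots")
print(f"P(20 clean | nothing fixed)       = {p_clean_if_unfixed:.1e}")
print(f"P(20 clean | freq freeze remains) = {p_clean_if_freq_left:.1e}")
```

Under those assumptions both probabilities come out very small, so 20 clean boots would be an unlikely fluke if either freeze were still occurring at the old rate.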

… although I did manage to break USB in the process.

https://gitlab.com/aosp-automotive/dragonboard-kernel-src/commits/wip/dragonboard-kernel-android-4.14

I recently discovered that UFS is the only client voting for LN_BB, which is used as reference clock for all PHYs. The result was that if UFS was probe deferred with any other PHY operational we would suffer a restart due to clock issues in e.g. PCI or USB.

I’ve gotten the below patch merged for v5.6 to fix this particular problem:
https://lore.kernel.org/linux-arm-msm/20200106080546.3192125-1-bjorn.andersson@linaro.org/

Regards,
Bjorn

No, we unfortunately do not have a driver for the version of CPR found in 820, yet. We also need a regulator driver for the in-core power controller found on this SoC.

What qcom-cpr.c does give us is a conclusion on how CPR should be integrated in the system, so now it’s just a matter of writing the code :wink:

Regards,
Bjorn

Hi @bamse ; that’s great. I’ll try to pull in that patch.

As a matter of fact, I’ve also been trying to get a more recent kernel to run on the board (genuine db820c limited “brown mask” edition), but I’ve been unable to get drm-msm working.

I have started another thread on the issue here; DRM MSM on recent *mainline* kernels (5.4+) – wonder if you could recommend a good starting point?

EDIT: Looks like while the patch can be applied to 4.14, it doesn’t actually compile. Lots changed since then.

I have a question for you @Loic :
The bootloader says this:
B - 415166 - 8996 Pro v1.x detected, Max frequency = 1.8 GHz

But the DTS goes all the way up to 2.15 GHz.
Is this overclocking the chip? Or is the bootloader just using an old string?

Well, msm8996 is supposed to have the big cores clocked at up to 2.16 GHz (Turbo), and AFAIK only the MSM8996 Lite version is limited to 1.8 GHz.

But I’m not sure this frequency really refers to the CPU cores (I don’t have the bootloader source code); maybe it’s just the memory frequency (LPDDR4 SDRAM at a 1866 MHz clock rate).

On my side, I also get 1.8 GHz:

B -    568001 - 8996 v3.x detected, Max frequency = 1.8 GHz

Whereas my SoC (APQ8096 3.1.3 on the PCB) is clearly identified as supporting Kryo Gold at 2.15 GHz in the Qualcomm® Snapdragon™ 820E Processor device specification [1] (4.3).

[1] https://developer.qualcomm.com/download/sd820e/qualcomm-snapdragon-820e-processor-apq8096sge-device-specification.pdf

Yes, it is DDR frequency.

Can we port a CPR driver based on the Android CPR driver? I found that the Android kernel uses drivers/regulator/cpr3-hmss-regulator.c for CPR.

Use kernel 5.4.

All stability problems are fixed there.

Yes, that is possible, but in addition to the CPR doing adaptive voltage scaling we also need a regulator driver for the regulators found in the CPU subsystem, and the necessary clock code. But combined these should do the trick.

Regards,
Bjorn

Are there any updates regarding this issue on the 4.14 kernel?
We tried updating to kernel 5.4 on an 820 uSOM custom board, but the BSP carries a lot of custom patches and there are many conflicting changes. We merged the custom patches onto kernel 5.4 to the best of our knowledge, but the kernel crashes on boot. Could anyone point us in the right direction for updating to 5.4?

Start with my 5.4 kernel here, which boots:

Then implement and test your changes one at a time.

We cloned this 5.4 kernel, put it inside the BSP folder, changed some device tree nodes in

arch/arm64/boot/dts/qcom/apq8096-db820c-pmic-pins.dtsi
arch/arm64/boot/dts/qcom/apq8096-db820c.dtsi
arch/arm64/boot/dts/qcom/msm8996-pins.dtsi
arch/arm64/boot/dts/qcom/msm8996.dtsi
arch/arm64/boot/dts/qcom/pmi8994.dtsi

and it builds without errors. However, when we flash the boot image and rootfs, we get this error on boot:

[830] ERROR: Unable to find suitable device tree for device (291/0x00030001/0x1c01000a/28)
[840] ERROR: Getting device tree address failed
[840] ERROR: Could not do normal boot. Reverting to fastboot mode.

Any idea how to solve this? Did we skip some steps?

I can’t help you if you’re mucking with a heap of vendor junk. Lose this “bsp”.

When you “changed some device tree nodes in”… what changes? I asked you to first build a working kernel and then “test your changes one at a time”, not go in and break everything on the first shot and wonder why it isn’t working.

What did you do to compile the kernel and attach the dtb to it?

I don’t have any previous experience with dragonboards or with deployment on such SoMs, which is why we took the BSP from the vendor and just replaced its kernel with your version of 5.4. The BSP (a script) pulls down the specific kernel and little kernel, together with the prebuilt linaro toolchain, skales, deployment tools (fastboot) and other stuff. It also applies some custom patches to the kernel (some drivers, but mainly device tree) and to lk.

Can we use the same tools (toolchain, lk, skales …) and just replace the linaro boot image that is then flashed on the uSoM? Do we use the default arm64 kconfig or the one from the vendor? Are there any other steps we are missing?

I think little kernel is checking the ID of the hardware vs the device ID present in the DTS, so you may need to keep the same “qcom,msm-id =” as in your original devicetree.

You were right. I added this:

--- a/arch/arm64/boot/dts/qcom/apq8096-db820c.dts
+++ b/arch/arm64/boot/dts/qcom/apq8096-db820c.dts
@@ -10,4 +10,6 @@
 / {
        model = "Qualcomm Technologies, Inc. DB820c";
        compatible = "arrow,apq8096-db820c", "qcom,apq8096-sbc", "qcom,apq8096";
+       qcom,msm-id = <291 0x30001>;
+       qcom,board-id = <10 28>;
 };

and it started to work with release/qcomlt-5.4.
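
For anyone hitting the same error, the values in the patch can be cross-checked against the tuple that little kernel printed. The sketch below assumes the four fields are platform ID, SoC version, target ID and hardware subtype, and that the low byte of the target ID is the board hardware ID. That reading is inferred from the numbers in this thread, not taken from the lk source, and the function name is hypothetical.

```python
import re

# Matches the tuple at the end of lk's "Unable to find suitable device tree" line.
TUPLE_RE = re.compile(r"\((\d+)/(0x[0-9a-fA-F]+)/(0x[0-9a-fA-F]+)/(\d+)\)")


def suggest_ids(lk_line):
    """Derive candidate qcom,msm-id / qcom,board-id cells from the lk error.

    The field meanings are an assumption (see the text above).
    """
    m = TUPLE_RE.search(lk_line)
    if m is None:
        raise ValueError("no (a/0xb/0xc/d) tuple found in line")
    platform_id = int(m.group(1))
    soc_version = int(m.group(2), 16)
    target_id = int(m.group(3), 16)
    subtype = int(m.group(4))
    hw_id = target_id & 0xFF  # assumed: low byte is the board hardware ID
    return {
        "qcom,msm-id": f"<{platform_id} {soc_version:#x}>",
        "qcom,board-id": f"<{hw_id} {subtype}>",
    }


line = ("[830] ERROR: Unable to find suitable device tree for device "
        "(291/0x00030001/0x1c01000a/28)")
for prop, cells in suggest_ids(line).items():
    print(f"{prop} = {cells};")
```

For the error line above this yields the same cells as the patch, <291 0x30001> and <10 28>, which is at least consistent with the assumed field layout.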