Reset/Reboot behavior

custom_board

#1

Hi,

We have several Variscite SD410 modules. I’m not sure if this is also an issue on the Dragonboard.

When executing the ‘reboot’ command on Linux, some modules reboot fine and others do not, although they run the same software version. The kernel is 4.14, as used in Yocto Rocko.

The call stack in kernel is:

machine_restart
arm_pm_restart (mapped to ‘psci_sys_reset’)
psci_sys_reset
invoke_psci_fn (with parameter ‘PSCI_0_2_FN_SYSTEM_RESET’, mapped to ‘__invoke_psci_fn_smc’)
__invoke_psci_fn_smc
arm_smccc_smc (assembler, doing: “SMCCC smc”) // don’t know what that is
===> HANG
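
For reference, the function ID handed to that SMC can be reconstructed from the SMC Calling Convention bit fields. The following is a minimal user-space sketch, not kernel code; the macro names and the helper `smccc_fast_call_id` are made up for illustration. It shows that PSCI 0.2 SYSTEM_RESET (function number 9 in the standard secure service range) encodes to 0x84000009, which is the value placed in the first argument register before `smc` is executed.

```c
#include <stdint.h>

/* Illustrative names, not the kernel's own macros. */
#define SMCCC_FAST_CALL      (UINT32_C(1) << 31) /* bit 31: fast (atomic) call */
#define SMCCC_SMC64          (UINT32_C(1) << 30) /* bit 30: 64-bit convention */
#define SMCCC_OWNER_SHIFT    24                  /* bits 29:24: owning entity */
#define SMCCC_OWNER_STANDARD UINT32_C(4)         /* standard secure services, incl. PSCI */

#define PSCI_FN_SYSTEM_OFF   UINT32_C(8)
#define PSCI_FN_SYSTEM_RESET UINT32_C(9)

/* Compose a fast-call function ID from owner, function number and width. */
static uint32_t smccc_fast_call_id(uint32_t owner, uint32_t fn, int smc64)
{
    uint32_t id = SMCCC_FAST_CALL | (owner << SMCCC_OWNER_SHIFT) | (fn & 0xFFFF);

    if (smc64)
        id |= SMCCC_SMC64;
    return id;
}

/* The ID behind PSCI_0_2_FN_SYSTEM_RESET (32-bit convention). */
static uint32_t psci_system_reset_id(void)
{
    return smccc_fast_call_id(SMCCC_OWNER_STANDARD, PSCI_FN_SYSTEM_RESET, 0);
}
```

Printing `psci_system_reset_id()` in hex from a small test program is a quick way to compare against the value seen in a debugger just before the hang.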

Because the PM8916 is the only device that directly controls RESIN_N of the APQ8016, I assume something has to request a HARD_RESET on the PM8916 through SPMI after the above sequence.

And because the RPM driver is not involved in the reset sequence, I bet this must be done by the Cortex-A53 running the reset sequence. It seems to be a subsystem called SMC (Secure Monitor Call?).

Is this a known issue?
Can anybody confirm that my assumptions are OK?
Can anybody tell me how the PM8916 is triggered exactly?
What could be the reason that the system hangs?

I can’t observe the SPMI bus since we do not have schematics/information of test points to read the bus signals.

In the Android kernel I found direct access to the PMIC through SPMI:

So this seems to be at least a possible workaround.

I also found that the msm-poweroff driver is compiled in and enabled by the device tree.
This driver writes a ‘0’ into a ‘ps-hold’ register:

static int do_msm_restart(struct notifier_block *nb, unsigned long action,
                          void *data)
{
    writel(0, msm_ps_hold);
    mdelay(10000);

    return NOTIFY_DONE;
}

However, this handler is never called, because it is registered as a
restart handler that only runs during ‘do_kernel_restart(…)’.
But since the PSCI firmware driver is enabled, ‘do_kernel_restart(…)’ is never called, because PSCI sets ‘arm_pm_restart’ and:

void machine_restart(char *cmd)
{
...
    /* Now call the architecture specific reboot code. */
    if (arm_pm_restart)
        arm_pm_restart(reboot_mode, cmd);
    else
        do_kernel_restart(cmd); // <- NEVER CALLED
...
}
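
To make the consequence of that if/else explicit, here is a small user-space model of the dispatch. It is only a sketch of the logic quoted above; every name in it (`machine_restart_sim`, the stub functions, the `restart_path` enum) is invented for illustration, and the stubs merely record which path ran instead of touching hardware.

```c
#include <stddef.h>

enum restart_path { PATH_NONE, PATH_PSCI, PATH_PS_HOLD };

typedef void (*restart_fn)(const char *cmd);

/* Stand-ins for the kernel objects; all names are illustrative. */
static restart_fn arm_pm_restart_hook;  /* set by the PSCI driver at init */
static restart_fn restart_handler;      /* e.g. the msm-poweroff handler */
static enum restart_path last_path;     /* records which path actually ran */

static void psci_reset_stub(const char *cmd)
{
    (void)cmd;
    last_path = PATH_PSCI;
}

static void msm_restart_stub(const char *cmd)
{
    (void)cmd;
    last_path = PATH_PS_HOLD;
}

/* Same shape as the quoted if/else: the hook wins whenever it is set. */
static enum restart_path machine_restart_sim(const char *cmd)
{
    last_path = PATH_NONE;
    if (arm_pm_restart_hook)
        arm_pm_restart_hook(cmd);
    else if (restart_handler)
        restart_handler(cmd);  /* stands in for do_kernel_restart() */
    return last_path;
}
```

With both pointers set, `machine_restart_sim()` reports `PATH_PSCI`; the ps-hold path is only reached once `arm_pm_restart_hook` is cleared, which matches the observation that the registered restart handler never runs.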

Further, registering ‘do_msm_restart’ as a reboot notifier instead in the msm-poweroff driver allows
this handler to be called before ‘arm_pm_restart’. But the kernel still hangs.

Best regards
-Carsten


#2

I don’t think this is a known issue for DB410C (it’s not in the bug database and I don’t recall anything similar on the forums to date).

PSCI is a trap-based interface. SMC is the means of trapping into the system firmware to provoke the reset. It directs the system firmware (on the C-A53) to start a reset (which will almost certainly be implemented by asking the PMIC to do it).

Is it possible that you have different system firmware running on your different boards? That would certainly explain different behaviour board to board.

To be honest though, even having asked that, whatever you reply I can’t help all that much. You probably need to discuss this with the board vendor.


#3

So, what you say is that one of the bootloaders reset the PMIC? Or is that rather the TrustZone software?


#4

I’m not sure what you mean. Once the system is fully running, bootloading is complete, and I’d call anything still resident after this point firmware.

I’m afraid I have no idea where Qualcomm has implemented PSCI. Normally PSCI is implemented in Arm Trusted Firmware (running in EL3, which straddles the secure and normal worlds), which leaves the non-secure hypervisor exception level (EL2) free to be used for virtualization (e.g. KVM). However, Qualcomm’s system firmware reserves the hypervisor space for its own use, and this allows a great deal of choice about where it implements various features, including PSCI.


#5

I know that bootloader code just runs on boot. Your first answer sounded to me as if a reboot is triggered and the bootloader then resets the PMIC on reboot, resetting the system entirely.

To me, the term ‘firmware’ means all the different firmware files shipped with the Dragonboard, that is, the two bootloaders sbl1 (Second Level Bootloader) and aboot (Little Kernel), tz (TrustZone, which may be the trapped code you refer to), …

Therefore I asked which one of them implements the code the system traps into.


#6

An SMC can be intercepted by both EL2 and EL3, so it may or may not be delivered to S-EL1 (tz). Assuming you don’t have sight of Qualcomm internal documentation or source code, I don’t think we can reason about where the code is implemented.
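
As a rough illustration of why the final handler cannot be inferred from the outside, here is a toy model in C. Only `hcr_el2_tsc` corresponds to an architectural control (HCR_EL2.TSC traps an SMC executed at non-secure EL1 to EL2); the other two fields are hypothetical firmware policy knobs invented for this sketch and say nothing about what Qualcomm's firmware actually does.

```c
#include <stdbool.h>

enum smc_handler { HANDLED_AT_EL2, HANDLED_AT_EL3, HANDLED_AT_SEL1 };

struct fw_policy {
    bool hcr_el2_tsc;       /* architectural: trap EL1 SMC to EL2 */
    bool el2_handles_call;  /* hypothetical: hypervisor services the call itself */
    bool el3_forwards_sel1; /* hypothetical: monitor hands the call to S-EL1 */
};

/* Where does an SMC issued by the non-secure kernel finally get serviced? */
static enum smc_handler smc_final_handler(struct fw_policy p)
{
    if (p.hcr_el2_tsc && p.el2_handles_call)
        return HANDLED_AT_EL2;
    /* Otherwise the call reaches EL3, either directly or re-issued by EL2. */
    return p.el3_forwards_sel1 ? HANDLED_AT_SEL1 : HANDLED_AT_EL3;
}
```

All three outcomes look identical from the kernel's point of view, since it only sees the `smc` instruction return or not return.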


#7

Perhaps I could be less dogmatic.

The basic assumption I adopt is that all the closed source components should be treated as one “system firmware” and that it is not safe to mix 'n match the components. So whilst I was vague in my original reply, that vagueness was deliberate.


#8

Thanks for your answer. In the end we found that JTAG_PS_HOLD was pulled high by our MOSFET in combination with a missing resistor. On the Variscite VAR-SD410 eval board this is Q1/R3. R3 isn’t assembled there either, but their MOSFET type doesn’t seem to float.

That was the whole problem. No firmware issue.

But to complete the above discussion, let me just describe the reset sequence, which may be important information for anyone else:

As already stated above kernel does this:

You were right. Processor traps to EL3 as described here:

I have two additional Qualcomm NDA documents and code from RPM that tell me which firmware
part runs on EL3, how the several cores interact, and how they are connected
to the PMIC. I can’t give that info here. One has to ask her/his distributor.

BTW, Arrow Europe support is very bad. You have to ask several times until you get an answer.

Best regards
-Carsten