PCIe issue after kexec

Recently I have been able to return to my investigations of using kexec on the Dragonboard 820c.
As the time interval from my last post in Kexec hang still not resolved - #14 by kldixon is so long, I thought I should start a new thread.

./Documentation/PCI/pci.rst opens with:
“The world of PCI is vast and full of (mostly unpleasant) surprises.”
I have come to appreciate, all too well, the accuracy of that statement.:slight_smile:

I put the following pr_debugs into ./drivers/pci/controller/dwc/pcie-designware-host.c:dw_pcie_access_other_conf

static int dw_pcie_access_other_conf(struct pcie_port *pp, struct pci_bus *bus,
				     u32 devfn, int where, int size, u32 *val,
				     bool write)
{
	int ret, type;
	u32 busdev, cfg_size;
	u64 cpu_addr;
	void __iomem *va_cfg_base;
	struct dw_pcie *pci = to_dw_pcie_from_pp(pp);

	pr_debug("%s 0 %d %d %d\n", __func__, bus->parent->number, pp->root_bus_nr, where);
	busdev = PCIE_ATU_BUS(bus->number) | PCIE_ATU_DEV(PCI_SLOT(devfn)) |
		 PCIE_ATU_FUNC(PCI_FUNC(devfn));

	if (bus->parent->number == pp->root_bus_nr) {
		type = PCIE_ATU_TYPE_CFG0;
		cpu_addr = pp->cfg0_base;
		cfg_size = pp->cfg0_size;
		va_cfg_base = pp->va_cfg0_base;
	} else {
		type = PCIE_ATU_TYPE_CFG1;
		cpu_addr = pp->cfg1_base;
		cfg_size = pp->cfg1_size;
		va_cfg_base = pp->va_cfg1_base;
	}

	pr_debug("%s 1 %d %LX %x %x %px\n", __func__, type, cpu_addr, busdev, cfg_size, va_cfg_base);
	dw_pcie_prog_outbound_atu(pci, PCIE_ATU_REGION_INDEX1,
				  type, cpu_addr,
				  busdev, cfg_size);
	if (write)
		ret = dw_pcie_write(va_cfg_base + where, size, *val);
	else
		ret = dw_pcie_read(va_cfg_base + where, size, val);

	pr_debug("%s 2 %x %d %LX %llx %x\n", __func__, *val, pci->num_viewport, pp->io_base, pp->io_bus_addr, pp->io_size);
	if (pci->num_viewport <= 2)
		dw_pcie_prog_outbound_atu(pci, PCIE_ATU_REGION_INDEX1,
					  PCIE_ATU_TYPE_IO, pp->io_base,
					  pp->io_bus_addr, pp->io_size);

	return ret;
}

Note the the value of ‘write’ is false and ‘size’ is 4, as dw_pcie_access_other_conf is called via pci_bus_read_config_dword.
Also note:

./drivers/pci/controller/dwc/pcie-designware.h:
#define PCIE_ATU_TYPE_CFG0		0x4

and bus->number = 1 and devfn = 0.
The result was, for the first kernel:

[   13.788446] dw_pcie_access_other_conf 0 0 0 0
[   13.792185] dw_pcie_access_other_conf 1 4 D100000 1000000 80000 ffff800013300000
[   13.796741] dw_pcie_access_other_conf 2 6121b21 2 D200000 d200000 100000

[   26.610043] dw_pcie_access_other_conf 0 0 0 0
[   26.613783] dw_pcie_access_other_conf 1 4 E100000 1000000 80000 ffff800013500000
[   26.618342] dw_pcie_access_other_conf 2 10831969 2 E200000 e200000 100000

[   46.934350] dw_pcie_access_other_conf 0 0 0 0
[   46.938093] dw_pcie_access_other_conf 1 4 C100000 1000000 80000 ffff800014880000
[   46.942660] dw_pcie_access_other_conf 2 3e168c 2 C200000 c200000 100000

and for the second, (kexec’d), kernel:

[   13.583168] dw_pcie_access_other_conf 0 0 0 0
[   13.586907] dw_pcie_access_other_conf 1 4 D100000 1000000 80000 ffff800013300000
[   13.591458] dw_pcie_access_other_conf 2 ffffffff 2 D200000 d200000 100000

[   19.668798] dw_pcie_access_other_conf 0 0 0 0
[   19.672537] dw_pcie_access_other_conf 1 4 E100000 1000000 80000 ffff800013500000
[   19.677091] dw_pcie_access_other_conf 2 ffffffff 2 E200000 e200000 100000

[   30.819566] dw_pcie_access_other_conf 0 0 0 0
[   30.823305] dw_pcie_access_other_conf 1 4 C100000 1000000 80000 ffff800013e80000
[   30.827849] dw_pcie_access_other_conf 2 3e168c 2 C200000 c200000 100000

with
root@linaro-alip:~# lspci -vvvxxxx
producing the following lines showing the three PCIe endpoints and the first 16 bytes of their corresponding configuration spaces:

...
0000:01:00.0 Network controller: Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter (rev 32)
...
00: 8c 16 3e 00 06 04 10 00 32 00 80 02 00 00 00 00
...
0001:01:00.0 SATA controller: ASMedia Technology Inc. ASM1062 Serial ATA Controller (rev 01) (prog-if 01 [AHCI 1.0])
...
00: 21 1b 12 06 07 04 10 00 01 01 06 01 00 00 00 00
...
0002:01:00.0 Ethernet controller: Qualcomm Atheros AR8151 v2.0 Gigabit Ethernet (rev c0)
...
00: 69 19 83 10 07 00 10 00 c0 00 00 02 00 00 00 00
...

The virtual addresses, va_cfg_base, vary from boot instance to boot instance.
The device tree ‘reg’ entries for the configuration address spaces of the pcie nodes in ./arch/arm64/boot/dts/qcom/msm8998.dtsi are:

...
reg = <0x0c100000 0x100000>;
...
reg = <0x0d100000 0x100000>;
...
reg = <0x0e100000 0x100000>;

In ./drivers/pci/controller/dwc/pcie-designware-host.c:dw_pcie_host_init, cfg0_size is set by:

	cfg_res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "config");
	if (cfg_res) {
		pp->cfg0_size = resource_size(cfg_res) >> 1;

and is then used to set va_cfg0_base:

	if (!pp->va_cfg0_base) {
		pp->va_cfg0_base = devm_pci_remap_cfgspace(dev,
					pp->cfg0_base, pp->cfg0_size);

So everything looks ok except for the the value of ffffffff being read from the first four bytes of the configuration space of the ASM1062 and AR8151 in the second kernel.

One difference between the QCA6174 and the other two endpoints is that the configuration space of the QCA6174 is read after ath10k_pci_init, as the probe is deferred, but the configuration spaces of the other endpoints are read before ahci_pci_driver_init and atl1c_driver_init.
Another difference is that the 4k of bridge memory for the QCA6174 is below the endpoint memory but above the endpoint memory for the ASM1062 and AR8151.
And yet another difference is that ‘I/O behind bridge’ reported by lspci is disabled for the QCA6174 but not for the other two endpoints.

Even a fundamental warm reset fails to make any difference.
I tried including the following shutdown method in qcom_pcie_driver:

static void qcom_pcie_shutdown(struct platform_device *pdev)
{
	struct qcom_pcie *pcie = platform_get_drvdata(pdev);
	struct dw_pcie *pci = pcie->pci;
	struct pcie_port *pp = &pci->pp;
	int ret;
        ret = qcom_pcie_host_init(pp);
}

I put pr_debugs in qcom_pcie_host_init, but, although it reported as expected and returned 0, it made no difference.
As I understand it, this gives a fundamental reset to the endpoint and sets up the root complex port via dw_pcie_setup_rc.
I have also tried using ahci_remove_one and atl1c_remove instead of ahci_shutdown_one and atl1c_shutdown as shutdown methods but, again, no difference.

Is it more likely that the ATU is misbehaving? If so, what can I do about it?
Or is the firmware interfering? As far as I can see, LK does not touch PCI but perhaps xbl.elf, tz.mbn, or rpm.mbn does?
I also do not understand why PCIE_ATU_TYPE_IO is being programmed? Why not PCIE_ATU_TYPE_MEM as well?

I think I have answered my last question. It is because the number of view ports is 2, but the number of PCIe address spaces is 4, the three PCI address spaces, Memory, I/O and Configuration plus Message space.

Currently, only three indices are defined in ./drivers/pci/controller/dwc/pcie-designware.h. Does that imply that that is the greatest number of view ports any implementation has?

./drivers/pci/controller/dwc/pcie-designware-host.c:dw_pcie_setup_rc sets PCIE_ATU_REGION_INDEX0 to PCIE_ATU_TYPE_MEM and, only if pci->num_viewport > 2, sets PCIE_ATU_REGION_INDEX2 to PCIE_ATU_TYPE_IO. So, ./drivers/pci/controller/dwc/pcie-designware-host.c:dw_pcie_access_other_conf sets PCIE_ATU_REGION_INDEX1 to PCIE_ATU_TYPE_CFG0 temporarily and reinstates PCIE_ATU_REGION_INDEX1 to PCIE_ATU_TYPE_IO if pci->num_viewport <= 2.

Could the following commit be a clue? The root port that is working is the only one for which the ‘I/O behind bridge’ is disabled.

commit 68a0bfec72cb4f117198ae31df114dad4c5e405d
Author: Dong Bo <dongbo4@huawei.com>
Date:   Mon Jul 4 21:44:43 2016 +0530

    PCI: designware: Exchange viewport of `MEMORYs' and `CFGs/IOs'
    
    When we have only two view ports in a DesignWare PCIe platform, iatu0
    is used for both CFG and IO accesses.  When CFGs are sent to peripherals
    (e.g., lspci), iatu0 frequently switches between CFG and IO.
    
    For such scenarios, a MEMORY might be sent as an IOs by mistake.
    Considering the following configurations:
    
      MEMORY  ->   BASE_ADDR: 0xb4100000, LIMIT: 0xb4100FFF, TYPE=mem
      CFG     ->   BASE_ADDR: 0xb4000000, LIMIT: 0xb4000FFF, TYPE=cfg
      IO      ->   BASE_ADDR: 0xFFFFFFFF, LIMIT: 0xFFFFFFFE, TYPE=io
    
    Suppose PCIe has just completed a CFG access.  To switch back to IO, it
    sets the BASE_ADDR to 0xFFFFFFFF, LIMIT 0xFFFFFFFE and TYPE to IO.  When
    another CFG comes, the BASE_ADDR is set to 0xb4000000 to switch to CFG.  At
    this moment, a MEMORY access shows up, since it matches with iatu0 (due to
    0xb4000000 <= MEMORY BASE_ADDR <= MEMORY LIMIT <= 0xFFFFFFF), it is treated
    as an IO access by mistake, then sent to perpheral.
    
    This patch fixes the problem by exchanging the assignments of `MEMORYs' and
    `CFGs/IOs', which assigning MEMORYs to iatu0, CFGs and IOs to iatu1.
    
    We can still have issues with IO transfer, however memory transfer is used
    predominantly therefore we are just minimizing the risk of failure.
    Actually, we can not do much when we have only two viewports.  We can
    either not allow the less frequent IO transfers at all, or can live with a
    remote possibility of getting it corrupted.

I found the following commit:

commit fe48cb8538421fbd16ecf8bf95829faf8d8c001e
Author: Pratyush Anand <pratyush.anand@gmail.com>
Date:   Mon Jul 4 21:44:42 2016 +0530

    PCI: designware: Keep viewport fixed for IO transaction if num_viewport > 2
    
    Most of the platforms have 3 or more viewports.  For such platforms, We do
    not need to share viewports between IO and CFG.  Assign viewport 2 to IO
    transactions in such cases.

Enboldened, I made the following modifications to the device tree:

diff --git a/arch/arm64/boot/dts/qcom/msm8996.dtsi b/arch/arm64/boot/dts/qcom/msm8996.dtsi
index 2ba0f2743d01..9d4311539b20 100644
--- a/arch/arm64/boot/dts/qcom/msm8996.dtsi
+++ b/arch/arm64/boot/dts/qcom/msm8996.dtsi
@@ -788,6 +788,7 @@ pcie0: pcie@600000 {
                                compatible = "qcom,pcie-msm8996", "snps,dw-pcie";
                                status = "disabled";
                                power-domains = <&gcc PCIE0_GDSC>;
+                               num-viewport = <3>;
                                bus-range = <0x00 0xff>;
                                num-lanes = <1>;
 
@@ -837,6 +838,7 @@ pcie0: pcie@600000 {
                        pcie1: pcie@608000 {
                                compatible = "qcom,pcie-msm8996", "snps,dw-pcie";
                                power-domains = <&gcc PCIE1_GDSC>;
+                               num-viewport = <3>;
                                bus-range = <0x00 0xff>;
                                num-lanes = <1>;
 
@@ -888,6 +890,7 @@ pcie1: pcie@608000 {
                        pcie2: pcie@610000 {
                                compatible = "qcom,pcie-msm8996", "snps,dw-pcie";
                                power-domains = <&gcc PCIE2_GDSC>;
+                               num-viewport = <3>;
                                bus-range = <0x00 0xff>;
                                num-lanes = <1>;
                                status = "disabled";

Nothing crashed or complained but, unfortunately, it made no difference to the issue at hand.
However, perhaps it might be worth including the above patch.