Yocto vanilla kernel sometimes boots, sometimes not

Hi!

We are using Yocto (warrior) with the poky kernel, which used to be 5.0.7, together with u-boot and a fitImage. This has worked fine so far, but when I updated the kernel to 5.0.x, x > 7, the boot fails. Sadly, this happens right when the kernel takes over, so the last thing I see is

“Starting kernel …”

Since this is not quite enough for me to understand where the kernel hangs, I need more ways to debug this. Furthermore, right now, on some builds, even the 5.0.7 kernel hangs. I am pretty sure it has nothing to do with the changes in the kernel, since the kernel breaks before even the first printk happens.

The funny thing is that when I compare the binaries of a working and a non-working kernel, they hardly differ. The places where they differ seem to be places where the build time is mentioned…

Does anyone have any hints on how to debug this?

I guess I am using the wrong addresses for loading? But then, why does it work for one build and fail for another? It seems a bit random…

Regards,

Matthias

Maybe it’s due to the kernel size? There are recent changes in LK [1] to fix some memory size issues, which should be present in the latest snapshot builds [2] (emmc_appsboot.mbn, to be flashed to the aboot partition). Not sure it will work in your case since AFAIU you’re running LK->u-boot->Linux, but maybe u-boot requires the same change.

[1] working/qualcomm/lk.git
[2] http://snapshots.linaro.org/96boards/dragonboard410c/linaro/rescue/latest/

Hi @Loic,

thanks, that is at least something to look into. The sizes of the working and non-working kernel / fitImage do not differ. And yes, we are chainloading LK → u-boot → Linux. I will look into it, but more ideas are welcome ;)

Regards, Matthias

Hi!

So I could finally solve the issue, as far as I can tell. linux/Documentation/arm64/booting.txt says that the fdt must be 8-byte aligned. fitImage only aligns the fdt to 4 bytes. This means it is more or less random whether a given kernel build boots or not. I could fix this by setting fdt_high to an 8-byte aligned address in u-boot. I do not know if lk is also affected, but the fdt address there might be fixed. Maybe someone stumbles upon this and it helps…
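To make the alignment rule concrete, here is a minimal Python sketch of the check, not taken from u-boot or the kernel. The helper names and the first example address are made up for illustration; only the 8-byte requirement comes from booting.txt.

```python
# Minimal sketch: check whether an fdt address meets the arm64 requirement
# of 8-byte alignment, and round it up if it does not.
# Helper names and the sample address 0x81000004 are hypothetical.

FDT_ALIGN = 8  # alignment required for the dtb by arm64 booting.txt


def is_aligned(addr: int, align: int = FDT_ALIGN) -> bool:
    """Return True if addr is a multiple of align."""
    return addr % align == 0


def align_up(addr: int, align: int = FDT_ALIGN) -> int:
    """Round addr up to the next multiple of align (align must be a power of two)."""
    return (addr + align - 1) & ~(align - 1)


if __name__ == "__main__":
    # A 4-byte aligned address is only 8-byte aligned about half of the time,
    # which would match the "random" boot behaviour between builds.
    for addr in (0x81000004, 0x83E00000):
        print(hex(addr), is_aligned(addr), hex(align_up(addr)))
```

Any address passed as fdt_high should satisfy is_aligned(); whether a given value is also a sensible place in RAM for the board is a separate question this sketch does not answer.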

Regards,

Matthias

Interesting, I wonder why you observe this with the vanilla kernel only… Which version of u-boot are you running? How do you generate your fitImage (parameters)? Is fdt_high set by default?

Maybe it will also show up with the Linaro kernel if you change the options such that the kernel size changes by 4 bytes modulo 8…? Setting fdt_high solves some of the problems, but not all, I must say. Initially fdt_high was set to 0xffffffffffffffff. u-boot is 2019.01, but 2019.07 does not seem to make a difference. The fitImage gets generated with the Yocto defaults. I attach the its file.

/dts-v1/;

/ {
        description = "U-Boot fitImage for yocto distro/5.0.19+gitAUTOINC+7f6e97c357_55dd15336b/lyd-db410c";
        #address-cells = <1>;

        images {
                kernel@1 {
                        description = "Linux kernel";
                        data = /incbin/("linux.bin");
                        type = "kernel";
                        arch = "arm64";
                        os = "linux";
                        compression = "gzip";
                        load = <0x80080000>;
                        entry = <0x80080000>;
                        hash@1 {
                                algo = "sha1";
                        };
                };
                fdt@qcom_apq8016-sbc.dtb {
                        description = "Flattened Device Tree blob";
                        data = /incbin/("arch/arm64/boot/dts/qcom/apq8016-sbc.dtb");
                        type = "flat_dt";
                        arch = "arm64";
                        compression = "none";
                        
                        hash@1 {
                                algo = "sha1";
                        };
                };
        };

        configurations {
                default = "conf@qcom_apq8016-sbc.dtb";
                conf@qcom_apq8016-sbc.dtb {
                        description = "1 Linux kernel, FDT blob";
                        kernel = "kernel@1";
                        fdt = "fdt@qcom_apq8016-sbc.dtb";
                        hash@1 {
                                algo = "sha1";
                        };
                };
        };
};

Actually, it does not seem to be a kernel issue but a bootloader issue. As you noticed, Linux expects an 8-byte aligned fdt. For our OpenEmbedded and Debian releases, we rely only on the lk (little-kernel) bootloader, which correctly loads the DTB to an 8-byte aligned address (0x83e00000). In your case, there is u-boot in between, which tries to start the kernel with the fdt in place (since fdt_high=0xffffff…). In a FIT image the fdt is not guaranteed to be 8-byte aligned (only 4-byte aligned, AFAIU), which causes the issue on Linux boot.

One solution is to disable the default in-place DTB, like you did, by setting fdt_high to a valid address. Another is to remove the fdt_high variable and set bootm_size instead. For now, can you just try the following simple u-boot patch and let me know: loic.poulain/u-boot.git
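To see whether a particular fitImage would hit this, one could look for the embedded dtb inside the image and check where it would end up in memory when used in place. The following Python sketch does that under several assumptions: the embedded dtb is found by searching for the FDT magic 0xd00dfeed (which can also match by chance inside the compressed kernel data), the function names are hypothetical, and the load address of the fitImage is whatever your boot script uses, not something taken from this thread.

```python
# Rough sketch: locate candidate embedded dtbs in a fitImage by their FDT
# magic and check whether load_addr + offset is 8-byte aligned.
# Function names and the synthetic data below are illustrative assumptions.

FDT_MAGIC = b"\xd0\x0d\xfe\xed"  # big-endian FDT header magic


def find_fdt_offsets(blob: bytes) -> list:
    """Return offsets of FDT magic occurrences, skipping the outer FIT header at 0."""
    offsets = []
    pos = blob.find(FDT_MAGIC, 1)
    while pos != -1:
        offsets.append(pos)
        pos = blob.find(FDT_MAGIC, pos + 1)
    return offsets


def in_place_fdt_aligned(load_addr: int, offset: int, align: int = 8) -> bool:
    """True if a dtb at `offset` inside an image loaded at `load_addr` is aligned."""
    return (load_addr + offset) % align == 0


if __name__ == "__main__":
    # Synthetic stand-in for a fitImage: outer header, padding, embedded dtb.
    fake_fit = FDT_MAGIC + b"\x00" * 96 + FDT_MAGIC + b"\x00" * 16
    load_addr = 0x90000000  # hypothetical fitImage load address
    for off in find_fdt_offsets(fake_fit):
        print(hex(off), in_place_fdt_aligned(load_addr, off))
```

For a real image, read the file with open(path, "rb").read() and pass the result to find_fdt_offsets(). Hits inside the gzip-compressed kernel are false positives and need to be sorted out by hand.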

That also means there is a bug here, and any arm64 board running u-boot + Linux can be affected by this issue…