Stuck in Initramfs

Thank you @anon91830841, that worked.

Android boot fails with the following logs. There are some strange number sequences in the logs. After that, it repeats constantly with:

[ 18.641896] msm_dba_get_probed_device: Device not found (adv7533, 0)
[ 18.647202] msm_dba_register_client: Device not found (adv7533, 0)
[ 18.653378] mdss_dba_utils_init: ds not configured

then reboots and starts again…
This makes sense as I have no HDMI on my custom board.
I guess I have to modify the config and rebuild the Android kernel… but I would rather use Linux directly.

In my custom Linux, I had modified the dtbs file by removing everything related to adv in the dtsi file. My kernel boots on a custom board with a SOM that has no HDMI, but does not boot on my custom board. Maybe there is something more I should do about that?

In buildroot, I have this result with fdisk -l:

# fdisk -l
Found valid GPT with protective MBR; using GPT

Disk /dev/mmcblk0: 7634944 sectors, 3728M
Logical sector size: 512
Disk identifier (GUID): 98101b32-bbe2-4bf2-a06e-2bb33d000c20
Partition table holds up to 12 entries
First usable sector is 34, last usable sector is 7634910

Number  Start (sector)    End (sector)  Size Name
     1          131072          131075  2048 cdt
     2          262144          263167  512K sbl1
     3          263168          264191  512K rpm
     4          264192          266239 1024K tz
     5          266240          267263  512K hyp
     6          267264          267295 16384 sec
     7          267296          269343 1024K aboot
     8          269344          400415 64.0M boot
     9          400416          402463 1024K devinfo
    10          402464         7634910 3531M rootfs
Disk /dev/mmcblk0boot1: 2 MB, 2097152 bytes, 4096 sectors
64 cylinders, 4 heads, 16 sectors/track
Units: cylinders of 64 * 512 = 32768 bytes

Disk /dev/mmcblk0boot1 doesn't contain a valid partition table
Disk /dev/mmcblk0boot0: 2 MB, 2097152 bytes, 4096 sectors
64 cylinders, 4 heads, 16 sectors/track
Units: cylinders of 64 * 512 = 32768 bytes

Disk /dev/mmcblk0boot0 doesn't contain a valid partition table

Not sure if this "Disk /dev/mmcblk0boot0 doesn’t contain a valid partition table" error could be a hint for me?
Otherwise my last hope is to find a hardware mistake now…

From Buildroot busybox, I can manually mount and access the rootfs with # mount -t ext4 /dev/mmcblk0p10 /root
Any idea how I can manually switch the root to this rootfs instead of the busybox?

Make the ramdisk invalid (an empty file should work) and the kernel will start honouring the root= argument again. Without a ramdisk you must specify the rootfs explicitly (/dev/mmcblk0p10) rather than by name (/dev/disk/by-partlabel/boot).
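
As a rough sketch of the above: it assumes the boot image is built with the skales mkbootimg tool, and the file names (empty-ramdisk.img, Image.gz+dtb, boot.img), the page size, the base address and the console setting are all assumptions rather than values from this thread. The mkbootimg command is only printed, not executed, so it can be checked before use.

```shell
# Create a zero-length file to pass as the (deliberately invalid) ramdisk.
: > empty-ramdisk.img

# Kernel command line naming the rootfs by device node, not by partition label.
CMDLINE="root=/dev/mmcblk0p10 rw rootwait console=ttyMSM0,115200n8"

# Assumed mkbootimg invocation; adjust paths and options to your own build.
echo mkbootimg --kernel Image.gz+dtb --ramdisk empty-ramdisk.img \
     --output boot.img --pagesize 2048 --base 0x80000000 \
     --cmdline "\"$CMDLINE\""
```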

Thank you Daniel,
Without a ramdisk the error I get is more repeatable (I still have this weird fault: /lib/systemd/system-generators/systemd-gpt-auto-generator terminated by signal SEGV.)
Logs look like this and I still cannot get a complete, correct boot.
Will have a deeper look at my hardware today.

I tried a new emmc with 8GB (MTFC8GAKAJCN-1M-WT) but the problem is still there.
One difference between my board and dragonboard is the following logs:
On my board: mmcblk0boot0: mmc0:0001 Q2J55L partition 1 2.00 MiB
On dragonboard, mmcblk0boot0 and boot1 are 4 MiB.

Does that matter? I am not sure about the purpose of those mmcblk0boot0/1 partitions…
Thank you,

The boot partitions are separate hardware partitions within the eMMC. They need a special protocol to access and are intended to make it easier to separate the bootloader from the OS.

I don’t think it matters much since I don’t think the DB410C boot ROM uses them. However it does mean that you will still need a custom partition table because your main eMMC partition will be 4MB “too big”.

Not sure what the best way to resize them is… Looks like there might be support in u-boot to allow it but I’ve never tried doing any changes to the boot partitions myself.

Thank you for your answer.
This is interesting; as I see it, it could be the source of all my troubles.
I will have a deeper look in the next few days, thank you again!

For completeness, here are the exact differences between my board and dragonboard:
dragonboard:
mmcblk0boot0: partition 1 4.00 MiB
mmcblk0boot1: partition 2 4.00 MiB
mmcblk0rpmb: partition 3 4.00 MiB
my custom board:
mmcblk0boot0: partition 1 2.00 MiB
mmcblk0boot1: partition 2 2.00 MiB
mmcblk0rpmb: partition 3 512 KiB

So my guess is that this is the reason there are troubles during the boot.
Here are the different options I consider to solve the problem:

  • Resize the boot partitions in the emmc
    I am not even sure my specific memory allows that
    or
  • Replace the emmc with another chip that has same footprint and boot partitions sizes
    or
  • Customize the partition tables (gpt files) so that the shift in size is taken into account.

I intend to try the last option, which seems the easiest. But I am not sure whether those boot partition sizes are taken into account in the GPT or in the rawprogram.xml file… Otherwise I could take a deeper look at u-boot to resize the partitions, but I have no experience with it (yet).
Thank you for your help,

Hello @danielt
Could you confirm that I need to change my partition table because of this difference in the emmc hardware boot partitions sizes?
I do not see how I could solve the “4MB too big” problem. As far as I know partitions are flashed at sector 0 of main emmc partition anyway and the last partition is resized to fit the remaining space.

Being too big is not a huge problem in terms of accessing partitions… but the GPT will still be technically invalid because the duplicate partition table at the end of the disk won’t actually be at the end of the disk.
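
One way to sanity-check this is to compare the "last usable sector" that fdisk reports with what the disk size implies. The sketch below assumes the conventional GPT layout, with the backup header in the very last sector and a 32-sector backup entry array just before it; the function name is made up for illustration.

```shell
# Expected last usable LBA for a GPT, assuming the conventional layout:
# the backup header occupies the very last sector and the backup entry
# array takes the 32 sectors before it (128 entries x 128 bytes / 512).
expected_last_usable() {
    total_sectors=$1
    # last LBA is total-1; subtract 1 header sector and 32 entry sectors
    echo $(( total_sectors - 1 - 1 - 32 ))
}

# Total sector count taken from the fdisk -l output earlier in the thread.
expected_last_usable 7634944
```

This prints 7634910, which matches the "last usable sector" in the fdisk -l output above, so by this measure at least the table seen from buildroot is consistent with the disk size the kernel reports.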

We’ve long assumed that the reason for the problems you observe is that your partition table is corrupt (taking as evidence that the initramfs can’t access the partitions by name). However we’ve never really known how it is corrupt, since the study you did with buildroot revealed no problems. We can also guess from your July logs that when you work around the first problem a second problem emerges, namely a systemd component that parses the GPT starts crashing. Thus the theory continues that somehow your partition table isn’t right.

So I can’t, in all honesty, be sure if you need to change your partition table because we never really understood why it didn’t work for you in the first place.

I wonder if going back to work in buildroot might be useful. In particular experimenting with sfdisk to create a new partition table.

Basically if you run the following then what you get is a script that sfdisk can read to restore the partition table:

sfdisk --dump /dev/mmcblk0 > parts

The partition table is restored using the following:

sfdisk /dev/mmcblk0 < parts

I would recommend removing the first-lba: and last-lba: stanzas from parts before restoring although I confess this is absolute paranoia on my side. When it is restored then the partitions themselves will be the same as before but the partition table will be rebuilt from scratch. Nothing “weird” should survive this.
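
A minimal sketch of that clean-up step, assuming the dump was saved as parts as above and that the two header lines start with first-lba: and last-lba: as they do in sfdisk dumps of GPT disks (the function name and parts.clean are made up for illustration):

```shell
# Drop the first-lba:/last-lba: header lines from an sfdisk dump so that
# sfdisk recomputes the usable range for the actual disk size.
strip_lba_bounds() {
    grep -v -E '^(first|last)-lba:' "$1"
}

# Usage (destructive, double-check the device name before running):
#   strip_lba_bounds parts > parts.clean
#   sfdisk /dev/mmcblk0 < parts.clean
```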

Hello Daniel,

Thank you for your message.
Any idea how to add sfdisk in buildroot?
I cannot find it in the config… All I have that seems close is fdisk.
I am building buildroot from https://github.com/buildroot/buildroot

Thank you,

I tried git grep sfdisk (on the buildroot tree) and it looks like you
need to go into the Target packages → System tools → util-linux and
enable both util-linux and the basic set. You can check you have it
right by looking for BR2_PACKAGE_UTIL_LINUX_BINARIES in the .config
file.

PS I didn’t test…
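
A quick way to do that check from the top of the buildroot tree might look like this. It is only a sketch: it assumes the option name above is right and that .config is in the current directory, and the function name is made up.

```shell
# Report whether a Buildroot .config enables the util-linux "basic set"
# of binaries (which, per the git grep above, should provide sfdisk).
has_util_linux_binaries() {
    grep -q '^BR2_PACKAGE_UTIL_LINUX_BINARIES=y' "${1:-.config}"
}

if has_util_linux_binaries .config 2>/dev/null; then
    echo "util-linux basic set enabled"
else
    echo "util-linux basic set NOT enabled (or .config not found)"
fi
```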

Edit: I could successfully use sfdisk. After that I reflashed the rootfs, but unfortunately it does not seem to have solved the problem.

I have arrived at a point where I can almost certainly exclude a software issue. Modifying the boot partitions on the eMMC didn’t help.
So I plan to check the hardware in depth. As far as I know, here are my options:

  • manual check (measurements on the board)
  • checks using buildroot/initramfs
  • JTAG
  • third party verification

I did not find anything wrong using buildroot, but I cannot guarantee I checked everything thoroughly. Does anyone have any hints on how to run thorough tests to find hardware errors with the apq8016?
From other threads, JTAG debug seems to be quite a pain and not fully supported (see JTAG support: how? - #7 by ljking). Do you think it is worth exploring for my problems?

Also, do you know any third party that can help regarding design/layout verification?
@ljking suggested valydate.com in other threads but their website seems down.
Anyway, any help/suggestion regarding hardware bug finding is very welcome.
Thank you,

Hi @Vix

I also saw that the Valydate website was down. A little research shows why (see "Valydate Archives" on Semiconductor Engineering): they are now part of Mentor Graphics.

The 410c and 820c boards have been through all of the checks that Valydate could possibly have done. They had access to the IBIS and other models for checking everything that can be checked at PCB layout time. You can get the IBIS models from Arrow and run them on your own CAD system.

I had concerns when you designed and built your own board: critical to correct operation of the chipset are the power delivery and signal integrity. I hope I pointed out all of the things you have to check when I wrote the design guidelines.

Does your board work if you solder on an 8GB eMMC part? However, based on the failure to boot from SD card, I would guess that the problem is in the Power Delivery Network (PDN) layout.

-Lawrence-
No longer a Qualcomm employee
searching for new employment.

Hello @ljking

Indeed, I tried with an 8GB eMMC and the board still fails at boot and drops into the initramfs.
So I guess I have to do a deeper design verification by reviewing the processor design guideline document.

Hi Victor @Vix

The other test you could do is to rework a 410c board with your 4GB part. The very first batch of prototype 410c boards we built used a 4GB part, but between the SW team and the marketing team we decided that the board really needed an 8GB part to be a general purpose development board. The other deciding factor for 8GB at the time was the price had dropped so that newer generation 8GB parts were less expensive than the 4GB part.

In your case since the 8GB part doesn’t work on your board I would guess the problem is a layout signal integrity problem which is almost impossible to find with an oscilloscope or software. You need to hunt for it in simulation. Look really hard at impedance and cross-talk in the LPDDR signals. Also look at power distribution impedances. Best of luck finding the issue.

Since you will likely need to rework the PCB, what I would strongly recommend is you get the 410c design files from Arrow, leave the core exactly as-is, then change just the peripheral layout for the items you need.

Another option is to work with one of the Qualcomm partners (Intrinsyc, Inforce, eInfoChips etc.) to design a custom board for you. All of them have done custom boards and made their designs work so I am confident they have the necessary skills, and simulation capabilities. Of course they will likely charge you for their efforts.

Depending on your estimated production volumes buying a System on Module (SOM) from one of the partners might be a reasonable solution. The partners can custom build SOMs with 4GB, but a SOM is almost always more expensive than chips on your own board and not a viable solution if your production volumes are high.

-Lawrence-
No longer a Qualcomm employee
Looking for employment.

Hello @ljking,

Thank you for your message,
The thing is that we already built a custom board with a SoM which was working fine.
Now we tried to build our own board from scratch, based on the 410c design.

Unfortunately we had to use two separate chips for lpddr3 and emmc so I cannot rework the 410c with our emmc chip.

The frustrating part is that I can boot with buildroot and run several stress tests on the memories, which all passed. The crash when booting the full kernel is probably coming from a transient at boot…

Hi Victor @Vix

The stress tests do run a lot of traffic on the LPDDR, but the traffic patterns (and power usage) are very different than a bunch of apps running on multiple cores. The fact that you can run some tests suggests that you are very close, but not quite good enough. Maybe cross talk between a dynamic power rail and a LPDDR signal? It only takes one single data bit or address line corrupted on one memory transaction to eventually cause a system failure. When you are running stress tests most of the code stays in the cache, and doesn’t need to be re-fetched.

The 410 chipset is used in many cell phone designs, and quite a few of the cell phone OEMs have used separate LPDDR and eMMC chips, so this tells you that it can be done successfully. But of course nobody says it is easy.

The last task the Qualcomm internal team completed was verification of a 410c layout variant that used separate LPDDR and eMMC chips. The design is complete, and verified on physical hardware, however Qualcomm disbanded the team before the design was released. The completed design is in the Qualcomm system, and Arrow or one of the partners may be able to get the design files for you. The downside to the internal design is it required 2 additional layers to fully meet the signal integrity and power delivery requirements.

-Lawrence-
No longer a Qualcomm employee
Searching for employment.