Debian boot hangs after OP-TEE flash and system install


#1

Hi,

I followed the instructions here
https://github.com/96boards/documentation/wiki/HiKeyGettingStarted
and here
https://github.com/OP-TEE/optee_os/blob/master/README.md
to install OP-TEE on the HiKey board (LeMaker, 2GB/8GB) and experiment with trusted service development.

Everything works until I reboot after the step of pushing the OP-TEE client binaries and the kernel with OP-TEE support. Up to that point, the installed system and the xtest suite work fine. After the reboot, the boot process hangs, displaying

ACPI: System description tables not found
ACPI: Failed to init ACPI tables
zswap: default zpool zbud not available 
zswap: pool creation failed

It hangs afterwards and I cannot access the OS anymore.

Stupid/funny thing is, it booted one more time in between, and I found the exact same messages in the kernel log.

Did I mess up the filesystem, or do the OP-TEE instructions list a wrong filesystem or ptable?
I found that the Makefile does not differentiate between 8G and 4G boards for the system image etc.?
Or do I need to disable ACPI during boot?

Many thanks,
Hanno


#2

Hi Hanno,

If you’re trying to test OP-TEE on Debian, you should only have to follow the steps at https://github.com/OP-TEE/optee_os/blob/master/README.md. Mixing it with the instructions at https://github.com/96boards/documentation/wiki/HiKeyGettingStarted might cause confusion.

What are the exact instructions you followed and steps you performed? How exactly are you pushing the OP-TEE client binaries and kernel? Are you pushing anything else as well? If xtest works fine then you shouldn’t have to make changes to the client binaries and kernel.

Are you trying to rebuild from source? If so, for https://github.com/OP-TEE/optee_os/blob/master/README.md, please make sure you run steps 4.1, 5.1, 5.2 (with hikey_debian.xml), 5.2.3, 5.5.2, 6.

For 8G boards, you also need to make modifications as recommended here (https://github.com/OP-TEE/build/issues/64) before running the steps above.

HTH


#3

Hi,

thanks for the quick reply. I tried the steps mentioned (recovery, 8G ptable) but still get the same problem.

In detail and in this sequence (no mixing of instructions), what I did is (exactly as per the instructions, I do not flash anything in addition):

What happens is:

  • after flashing, I get a working system, can set up the wireless network, and can send and install (via dpkg) the system
  • dmesg shows the aforementioned ACPI errors but the system boots through them, apparently
  • the installation shows a few blockdev ioctl errors at the end, but I seem to have read that this is just a mishandling of return values?
  • after the installation of the system, step 6 (tee-supplicant and xtest) usually works
  • after rebooting/shutting down the board, it does not boot up and shows the aforementioned errors (ACPI etc.)

I am super grateful for any help. As described, I am only setting up the board and have not even gotten around to playing with the TEE stuff etc.
Is there any possibility to fully reset the board? Seeing as the first system installation as per https://github.com/96boards/documentation/wiki/HiKeyGettingStarted worked, I get the feeling that there is maybe something conflicting in the two setups I followed, and maybe a fresh start would help?

Thanks a lot,
Hanno


#4

Hi Hanno,

Sorry for the delay in reply. There were some issues that were fixed recently [1] for the OP-TEE HiKey Debian build, so can you please try again from scratch and see if it works now?

Thanks!

[1] https://github.com/OP-TEE/optee_os/issues/832


#5

Hi,

many thanks for the reply!

It worked until I did a repo sync; now I get an error while compiling:
undefined reference to __getrlimit@glibc_private

This looks like I have a wrong version of glibc, right?
And while this might be a stupid question, I could not find any hint as to which version I actually need.

Also, is there a console-only version of Debian, instead of the alip one, that works with OP-TEE (like the _developer one)?

Again, thanks!
Hanno


#6

This has been tested recently without issues so it seems like your build might be corrupted. Can you please try starting afresh?

For console-only, you should be able to just change SYSTEM_IMG_URL to point to a developer image instead. E.g.

SYSTEM_IMG_URL=https://builds.96boards.org/releases/hikey/linaro/debian/latest/hikey-jessie_developer_20151130-387-4g.emmc.img.gz
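
Since the file name above carries a `4g` tag while the board discussed in this thread is the 8GB variant, it may be worth checking which eMMC size an image URL targets before flashing. A minimal sketch in Python, assuming the `-<N>g.emmc.img` pattern seen in this one URL is a general naming convention (an assumption, not something confirmed in this thread); `emmc_size_tag` is a hypothetical helper name:

```python
import re


def emmc_size_tag(image_url):
    """Return the size tag (e.g. '4g') embedded in an image file name, or None.

    Assumes the file name follows the '-<N>g.emmc.img' pattern seen in the
    URL quoted in this post; other naming schemes simply yield None.
    """
    name = image_url.rsplit("/", 1)[-1]  # keep only the file name
    match = re.search(r"-(\d+g)\.emmc\.img", name)
    return match.group(1) if match else None


url = ("https://builds.96boards.org/releases/hikey/linaro/debian/latest/"
       "hikey-jessie_developer_20151130-387-4g.emmc.img.gz")
print(emmc_size_tag(url))
```

If the tag does not match the board, the 8G ptable changes mentioned earlier in the thread are probably the first thing to double-check.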

HTH


#7

Hi,

many thanks for the answer.

Exchanging the alip image for the developer image works and also the build works.
However, the resulting/booted kernel does not seem to have the TEE stuff included?
All the indications mentioned here https://github.com/OP-TEE/optee_os/issues/835 seem to point to this …

Can that be, or did I forget a step in between?

Again, thanks a lot for your help, best regards,
Hanno


#8

There was no issue in https://github.com/OP-TEE/optee_os/issues/835. The user tried to run the supplicant but got an error because it was already running.

> the resulting/booted kernel does not seem to have the TEE stuff included?
Based on what observations? Do you see the following files? If yes, then it should be ok.


/bin/xtest
/bin/tee-supplicant
/lib/optee_armtz/*.ta
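
The check for the files listed above can be scripted. This is only a sketch, assuming Python is available on the board; `missing_optee_files` and its `root` parameter are hypothetical names, with `root` added so the same check can be pointed at a mounted rootfs instead of the running system:

```python
import glob
import os

# Paths taken from the list above, written relative to the filesystem root.
EXPECTED = ["bin/xtest", "bin/tee-supplicant", "lib/optee_armtz/*.ta"]


def missing_optee_files(root="/"):
    """Return the expected patterns that match no file under `root`."""
    missing = []
    for pattern in EXPECTED:
        # glob handles both the plain paths and the *.ta wildcard
        if not glob.glob(os.path.join(root, pattern)):
            missing.append("/" + pattern)
    return missing


if __name__ == "__main__":
    gone = missing_optee_files()
    print("all present" if not gone else "missing: " + ", ".join(gone))
```

An empty result only shows that the userspace pieces are installed; it says nothing about whether the running kernel was built with OP-TEE support.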


#9

Hi,

sadly, the board still fails to boot normally (as described in my first two posts) after the installation of the .deb files following make send.
I have also given the board and the instructions (https://github.com/OP-TEE/optee_os/blob/master/README.md) to a colleague in order to find out whether the problem lies with my system or between my ears.
However, he arrives at the same, non-booting state.

Specifically, we followed the exact steps of the instructions, within a new VM, and exchanged the ptable for the 8G version, as suggested in one of the earlier posts. All sources are unchanged, exactly as they come out of the repository. The process according to the instructions works up to the point where, after make send, the optee and the linux .deb are installed.
While the optee .deb installs successfully, the linux .deb installation results in multiple blockdev ioctl errors at the end of the installation.

As this is the part where all boot-related stuff is written/installed, it leads to the thought that boot functionality is corrupted at this point.

Can you think of any point where, even when exactly following the instructions, a step could result in this behavior?
This is extremely confusing and frustrating, since we are following the instructions, but apparently no one else has this problem!?

I would be super grateful for any help; as it is right now, the board is completely unusable! :frowning:


#10

Hi Hanno,

Sorry for the late reply. Were you able to make any progress on this? If not, I’ll try to find some time to test this again. In the meantime, is Debian an absolute requirement for you, or are you just wanting to play with the TEE stuff? If the latter, you can maybe try the simple initramfs build instead, which has been around a lot longer and is more stable, i.e. follow steps 4.1, 5.1, 5.2 (with hikey.xml), 5.2.3, 5.5.1 and 6; or the OE build as instructed in https://github.com/Linaro/documentation/blob/master/Reference-Platform/CECommon/OE.md.

Just fyi, the error you previously reported above (also below) was due to upgrading from GCC4.9 to GCC5 without first deleting the ‘toolchain’ directory. We’ve since added a note to the README to warn users regarding this.

undefined reference to __getrlimit@glibc_private
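
For anyone scripting their rebuilds, removing the stale directory before switching GCC versions could look roughly like this. It is a sketch only: `remove_stale_toolchain` is a hypothetical helper, and where the ‘toolchain’ directory sits relative to your checkout is an assumption you need to verify yourself:

```python
import shutil
from pathlib import Path


def remove_stale_toolchain(build_root, dirname="toolchain"):
    """Delete <build_root>/<dirname> if it exists; return True if it was removed.

    `build_root` is whatever directory holds the downloaded toolchain in your
    checkout (an assumption here); `dirname` defaults to the name used above.
    """
    target = Path(build_root) / dirname
    if target.is_dir():
        shutil.rmtree(target)  # removes the directory and everything below it
        return True
    return False
```

After it returns True, the toolchain has to be fetched again before the next build.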


#11

Hi,

thanks for the reply and sorry for my late reply.

The booting issue persisted and we could never find out why.
However, this was with version 16.03; with version 16.06, everything works fine.

So, it is resolved, I guess :slight_smile: Thanks for your help!

Hanno


#12

Just to record the fix here (it was already noted above in http://www.96boards.org/forums/topic/debian-boot-hangs-after-op-tee-flash-and-system-install/#post-15186, but not in enough detail, which is probably why you missed it): the problem was due to a missing patch [1] in the linux repo. This patch wasn’t in the rpb 16.03 branch but is in 16.06, which is why 16.06 worked fine.

[1] https://github.com/torvalds/linux/commit/67dfa1751ce71e629aad7c438e1678ad41054677