Custom kernel build doesn't load modules when booting

Hi All,

Brief background

I have built a custom carrier board for an Intrinsyc open-q 410 board. I also have an Intrinsyc devkit board available to test the open-q 410 boards. The carrier board is designed largely based on the devkit board, but leaves several hardware features (such as USB hub, HDMI output) out. Some of the BLSP pins will also need to be configured differently.

Initially I loaded Linaro onto the open-q 410 board using the instructions for the fastboot method in the 96boards documentation (I can’t add the link here because, as a new user, I am restricted to two links per post).
The board boots and functions as expected on the devkit. However, when booted on the custom carrier board, it doesn’t boot “completely” (“completely” meaning that some modules such as Wifi and Bluetooth do not function, although a custom systemd script which toggles a GPIO high and low does run as expected).

This is the ultimate issue I am trying to solve (so if you have ideas feel free to comment), but the scope of this post is intended for a more specific issue I am trying to fix while solving the main problem.

Current issue I’m trying to solve

A suspicion I had for the 410 board not booting “completely” on my custom carrier board was that the device tree for the default linaro kernel loads modules and sets up BLSP pins differently from how they are used on my custom carrier.

I have linked boot logs from the boot process on the devkit and from srs (the custom carrier board).

My approach is to build the kernel from source and take out modules which I am not using, starting with the display output.

I built the kernel image using the instructions provided by the 96boards documentation.

Details worth noting from how I followed the documentation:

  • I checked out tag debian-qcom-dragonboard410c-19.01, then branched off of that

  • I made the following changes to files, before building the kernel and modules

    File: arch/arm64/boot/dts/qcom/apq8016-sbc.dtsi

    From

    adv_bridge: bridge@39 {
    	status = "okay";
    	...
    }
    

    To

    adv_bridge: bridge@39 {
    	status = "disabled";
    	...
    }
    

    From

    mdss@1a0000 {
    	status = "okay";
    	...
    }
    

    To

    mdss@1a0000 {
    	status = "disabled";
    	...
    }
    
  • I followed the rest of the instructions as in the documentation (including copying over the modules inside db410c-modules created from the modules_install command using a USB after booting the board on the devkit).
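
One way to confirm that device tree edits like the ones above actually reached the running kernel is to read the node's status property from /proc/device-tree on the board. This is only a minimal sketch, assuming a POSIX shell: dt_status is a hypothetical helper name, and the node path in the usage comment is an assumption that should be checked against the tree on the board. A node without a status property is treated as "okay", which is the devicetree default.

```shell
# dt_status: hypothetical helper, prints the status of a device tree node.
# $1 is the node directory, e.g. somewhere under /proc/device-tree.
dt_status() {
    node_dir=$1
    if [ ! -d "$node_dir" ]; then
        echo "no such node: $node_dir"
        return 1
    fi
    if [ -f "$node_dir/status" ]; then
        # Property files are NUL-terminated strings, so strip the NUL.
        tr -d '\0' < "$node_dir/status"
        echo
    else
        # A missing status property means the node is enabled.
        echo "okay"
    fi
}

# Usage on the board (the path is an assumption, check it with
# `find /proc/device-tree -name 'mdss*'` first):
#   dt_status /proc/device-tree/soc/mdss@1a0000
```

If this prints "okay" after the modified kernel has been flashed, the board is probably still booting the old dtb.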

The issue is that no modules are now loaded when booting the board on the devkit (i.e. lsmod returns nothing).

FYI:
uname -r returns 4.14.96.01198-gc6a40c56e419 and I have copied the modules created from the modules_install command to /lib/modules/4.14.96.01198-gc6a40c56e419
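
A quick sanity check for this step is to confirm that the modules directory for the running release exists and contains a modules.dep index, since modprobe relies on that file to find modules. The sketch below assumes a POSIX shell; check_modules_dir is a hypothetical helper, and the root argument exists only so it can be pointed at a staging directory as well as the live system.

```shell
# check_modules_dir: hypothetical helper.
# $1 = filesystem root ("" for the live system), $2 = kernel release string.
check_modules_dir() {
    root=$1
    release=$2
    dir="$root/lib/modules/$release"
    if [ ! -d "$dir" ]; then
        echo "missing directory: $dir"
        return 1
    fi
    if [ ! -f "$dir/modules.dep" ]; then
        # modules.dep is generated by depmod.
        echo "missing index: $dir/modules.dep (try running depmod -a)"
        return 1
    fi
    echo "ok: $dir"
}

# Usage on the board:
#   check_modules_dir "" "$(uname -r)"
```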

Log files

Files can be accessed here

  • boot-devkit-4_14_96-no_mod.log (successfully booting on the dev kit)
  • boot-devkit-4_14_96-disable_display-clean.log (this boot doesn’t load any modules anymore)

And another log, FYI, of the custom carrier board (srs) not booting “completely” when running the unmodified Linaro kernel from the documentation

  • boot-srs-4_14_96-no_mod.log

Thanks in advance for help regarding getting the modules to load with a custom built kernel, and/or with insights into why the boot process doesn’t load the wifi module when booting the un-modded kernel on my carrier board.

make -j4 Image.gz dtbs KERNELRELEASE=4.14.0-qcomlt-arm64
make -j4 modules KERNELRELEASE=4.14.0-qcomlt-arm64
make modules_install KERNELRELEASE=4.14.0-qcomlt-arm64 INSTALL_MOD_STRIP=1 INSTALL_MOD_PATH=

Did you give the same KERNELRELEASE=4.14.96.01198-gc6a40c56e419 for the above 3 commands?

regards,
vinaysimha

Hi Vinaysimha,
Thanks for the tip. Originally I had not specified the KERNELRELEASE in any of the above commands (this wasn’t in the instructions). I have now repeated all the instructions specifying KERNELRELEASE=4.14.96.01198-gc6a40c56e419 for the three make commands. After copying the new modules folder to /lib/modules, the modules still do not load (lsmod returns nothing). I have uploaded the new boot log in the same folder (boot-devkit-defined_kernel_release.log). A line worth noting may be this:

[    2.268971] x_tables: version magic '4.14.96.01198-gc6a40c56e419-01198-gc6a40c56e419 SMP preempt mod_unload aarch64' should be '4.14.96.01198-gc6a40c56e419 SMP preempt mod_unload aarch64'

Lines similar to this now show up frequently in the boot log.
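
The kind of mismatch reported in that log line can also be checked without rebooting, by comparing the release at the start of a module's vermagic string with the kernel release. A minimal sketch, assuming a POSIX shell with awk available; vermagic_release and compare_release are hypothetical helper names, while modinfo -F vermagic and uname -r are the standard tools for reading the two values.

```shell
# vermagic_release: print the release, i.e. the first field of a vermagic string.
vermagic_release() {
    printf '%s\n' "$1" | awk '{print $1}'
}

# compare_release: $1 = module vermagic string, $2 = kernel release.
# Prints "match" and returns 0, or prints the two values and returns 1.
compare_release() {
    module_release=$(vermagic_release "$1")
    if [ "$module_release" = "$2" ]; then
        echo "match"
    else
        echo "mismatch: module=$module_release kernel=$2"
        return 1
    fi
}

# Usage on the board:
#   compare_release "$(modinfo -F vermagic x_tables)" "$(uname -r)"
```

The same comparison can be run against a .ko file in the build output before copying anything to the board, since modinfo also accepts a file path.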

So your kernel is not aligned with your modules: the kernel has been built with 4.14.96.01198-gc6a40c56e419 version magic and the modules with 4.14.96.01198-gc6a40c56e419-01198-gc6a40c56e419 (which seems wrong).

You can find below the script I use to build kernel/modules:

#!/bin/sh
# Usage: ./build-kernel.sh [kernel-src] [release]
set -e

KERNEL_SRC=${1:-./}
cd "${KERNEL_SRC}"

CONFIG="distro.config"
# The same release string is passed to every make invocation below,
# so that the kernel and the modules get the same version magic.
RELEASE=${2:-"4.14.0-qcomlt-arm64"}

export ARCH=arm64
export CROSS_COMPILE=${CROSS_COMPILE:-aarch64-linux-gnu-}

make defconfig "${CONFIG}"
make -j4 Image.gz vmlinux dtbs modules KERNELRELEASE="${RELEASE}"
make modules_install KERNELRELEASE="${RELEASE}" INSTALL_MOD_STRIP=1 INSTALL_MOD_PATH=./modules

Hi Loic,
Thanks for this tip. I ran your script, then followed the rest of the default instructions (using abootimg and fastboot to copy the image to the device).
The board now seems to load some modules properly, however the wifi doesn’t seem to fully initialise. I have uploaded the new boot log under boot-devkit-disable_display-loic_script.log. Comparing this to boot-devkit-4_14_96-no_mod.log, you will notice towards the end that both logs read IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready. However, in the no_mod log, this is followed by IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready.

Do you/does anyone have any further insights into why this could be the case?
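
For anyone repeating the log comparison described above across several boot logs, the check can be scripted. This is a rough sketch, assuming a POSIX shell and grep; wlan_link_state is a hypothetical helper that only looks for the two kernel messages quoted in this post.

```shell
# wlan_link_state: hypothetical helper, classifies a boot log by the
# wlan0 ADDRCONF messages it contains. $1 = path to the log file.
wlan_link_state() {
    log=$1
    # -F matches the strings literally, -q suppresses output.
    if grep -qF 'ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready' "$log"; then
        echo "ready"
    elif grep -qF 'ADDRCONF(NETDEV_UP): wlan0: link is not ready' "$log"; then
        echo "not ready"
    else
        echo "no wlan0 events"
    fi
}

# Usage:
#   wlan_link_state boot-devkit-4_14_96-no_mod.log
#   wlan_link_state boot-devkit-disable_display-loic_script.log
```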

Hi all,
Thanks for the previous feedback. The issues I raised in this thread are now solved.
I thought I’d just note what the issue was for others who may be in a similar situation.

Firstly, regarding ‘all’ modules not loading, @vinaysimha and @Loic were correct in that my kernel release versions weren’t aligning correctly. Specifying a release in each of these commands solved that issue.

The board not booting “completely” on my custom carrier board was related to a combination of the mdss / display module and NetworkManager. Initially, when I booted the board on the dev kit (with the display on) and used the GUI (X?) to connect to the wifi network, it worked and connected to the network on every boot (on the dev kit). But when I disabled mdss in the device tree, the wifi no longer connected, even when booting on the dev kit. This was resolved by connecting to the wifi network again using the NetworkManager CLI (nmcli). This seems to indicate that the way the GUI stored the network required the GUI to be functioning for the network to be recognised. I’m not exactly sure why this is, so if anyone knows, feel free to comment below (to give better insights into what is actually happening).

The UI WiFi helper is normally just another client of NetworkManager, so it should create a connection file (in /etc/NetworkManager/system-connections/), and nmcli connection should return the list of registered connections. Not sure what happened in your case.
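
One thing that may be worth checking in those connection files (this is a guess, not something confirmed in this thread) is whether the profile created from the GUI is restricted to a user or keeps its wifi password with a user secret agent. In NetworkManager keyfiles that shows up as a permissions=user:... entry or psk-flags=1, and such a profile may not activate when no graphical session is running. A minimal sketch, assuming a POSIX shell; inspect_connection is a hypothetical helper.

```shell
# inspect_connection: hypothetical helper, reports keyfile settings that
# tie a NetworkManager connection to a logged-in user. $1 = keyfile path.
inspect_connection() {
    file=$1
    # permissions=user:<name>:; limits the profile to that user.
    if grep -q '^permissions=user:' "$file"; then
        echo "restricted to a user"
    fi
    # psk-flags=1 means the secret is agent-owned, not stored in the file.
    if grep -q '^psk-flags=1' "$file"; then
        echo "secret held by a user agent"
    fi
}

# Usage on the board (reading these files normally requires root):
#   for f in /etc/NetworkManager/system-connections/*; do
#       echo "$f"; inspect_connection "$f"
#   done
```

No output for a file means neither setting was found, in which case the explanation is probably elsewhere.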