{TOOL} Bionic-Builder -->> Automated Kernel & Ubuntu Builder for Hikey970

Are OpenCL GPU drivers supported?

I have included the user-space driver.
The binary library is /usr/lib/aarch64-linux-gnu/libmali.so

I will work on installing OpenCL and report back.

YES: OpenCL platforms are available. See the post below.

The included kernel has the Mali driver, and the user-space binary from Lebain with Tensorflow is included in the image.

To enable OpenCL and see the platforms, do the following.

sudo apt-get install mesa-utils
sudo apt-get install 'ocl-icd-*' opencl-headers

Then run 
---------------------------------------------------------------------------------------------
dpkg -s libglu1-mesa

You should see the following.
Package: libglu1-mesa
Status: install ok installed
Priority: optional
Section: libs
Installed-Size: 411
Maintainer: Ubuntu Developers <ubuntu-devel-discuss@lists.ubuntu.com>
Architecture: arm64
Multi-Arch: same
Source: libglu
Version: 9.0.0-2.1build1
Replaces: libglu1
Provides: libglu1
Depends: libc6 (>= 2.17), libgcc1 (>= 1:3.0), libgl1-mesa-glx | libgl1, libstdc++6 (>= 4.1.1)
Conflicts: libglu1, mesag3 (<< 5.0.0-1), xlibmesa3
Description: Mesa OpenGL utility library (GLU)

Finish things up by running.
----------------------------------------------------------------------------------------------------------
sudo mkdir -p /etc/OpenCL/vendors/
echo "libmali.so" | sudo tee /etc/OpenCL/vendors/mali.icd
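
If clinfo later reports zero platforms, one thing worth checking is whether the library named in the .icd file can actually be found. The following is only a rough sketch: check_icd_dir is a made-up helper name (it is not part of ocl-icd), and it only looks in the library directories you pass to it rather than following the real dynamic-linker search path.

```shell
#!/bin/sh
# Sketch: for every .icd file in a vendors directory, report whether the
# library named on its first line exists. check_icd_dir is a hypothetical
# helper name, not a tool shipped with ocl-icd.
check_icd_dir() {
    vendors_dir=$1
    shift                       # remaining arguments: library directories
    status=0
    for icd in "$vendors_dir"/*.icd; do
        if [ ! -e "$icd" ]; then
            continue            # no .icd files matched the pattern
        fi
        lib=$(head -n 1 "$icd") # first line holds the library name or path
        found=no
        case $lib in
            /*)
                # Absolute path: check it directly.
                if [ -e "$lib" ]; then found=yes; fi
                ;;
            *)
                # Bare name: look in each directory given by the caller.
                for dir in "$@"; do
                    if [ -e "$dir/$lib" ]; then found=yes; fi
                done
                ;;
        esac
        echo "$(basename "$icd"): $lib found=$found"
        if [ "$found" = no ]; then status=1; fi
    done
    return $status
}
```

On the board this could be called as: check_icd_dir /etc/OpenCL/vendors /usr/lib/aarch64-linux-gnu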

Then you can run clinfo and see the Mali platforms.

sudo clinfo

AVAILABLE PLATFORMS
-----------------------------------------------------------------------------------------------------------
root@hikey970:/etc/OpenCL/vendors# cat mali.icd
libmali.so
root@hikey970:/etc/OpenCL/vendors# sudo clinfo
Number of platforms                               1
  Platform Name                                   ARM Platform
  Platform Vendor                                 ARM
  Platform Version                                OpenCL 2.0 v1.r10p0-01rel0.e990c3e3ae25bde6c6a1b96097209d52
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp16 cl_khr_icd cl_khr_egl_image cl_khr_image2d_from_buffer cl_khr_depth_images cl_arm_core_id cl_arm_printf cl_arm_thread_limit_hint cl_arm_non_uniform_work_group_size cl_arm_import_memory cl_arm_shared_virtual_memory
  Platform Extensions function suffix             ARM

  Platform Name                                   ARM Platform
Number of devices                                 1
  Device Name                                     Mali-G72
  Device Vendor                                   ARM
  Device Vendor ID                                0x62210001
  Device Version                                  OpenCL 2.0 v1.r10p0-01rel0.e990c3e3ae25bde6c6a1b96097209d52
  Driver Version                                  2.0
  Device OpenCL C Version                         OpenCL C 2.0 v1.r10p0-01rel0.e990c3e3ae25bde6c6a1b96097209d52
  Device Type                                     GPU
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               12
  Max clock frequency                             767MHz
  Device Partition                                (core)
    Max number of sub-devices                     0
    Supported partition types                     None
  Max work item dimensions                        3
  Max work item sizes                             384x384x384
  Max work group size                             384
  Preferred work group size multiple              4
  Preferred / native vector sizes
    char                                                16 / 4
    short                                                8 / 2
    int                                                  4 / 1
    long                                                 2 / 1
    half                                                 8 / 2        (cl_khr_fp16)
    float                                                4 / 1
    double                                               0 / 0        (n/a)
  Half-precision Floating-point support           (cl_khr_fp16)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Single-precision Floating-point support         (core)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (n/a)
  Address bits                                    64, Little-Endian
  Global memory size                              4294967296 (4GiB)
  Error Correction support                        No
  Max memory allocation                           1073741824 (1024MiB)
  Unified memory for Host and Device              Yes
  Shared Virtual Memory (SVM) capabilities        (core)
    Coarse-grained buffer sharing                 Yes
    Fine-grained buffer sharing                   Yes
    Fine-grained system sharing                   No
    Atomics                                       Yes
  Shared Virtual Memory (SVM) capabilities (ARM)
    Coarse-grained buffer sharing                 Yes
    Fine-grained buffer sharing                   Yes
    Fine-grained system sharing                   No
    Atomics                                       Yes
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       1024 bits (128 bytes)
  Preferred alignment for atomics
    SVM                                           0 bytes
    Global                                        0 bytes
    Local                                         0 bytes
  Max size for global variable                    65536 (64KiB)
  Preferred total size of global vars             0
  Global Memory cache type                        Read/Write
  Global Memory cache size                        524288 (512KiB)
  Global Memory cache line size                   64 bytes
  Image support                                   Yes
    Max number of samplers per kernel             16
    Max size for 1D images from buffer            65536 pixels
    Max 1D or 2D image array size                 2048 images
    Base address alignment for 2D image buffers   32 bytes
    Pitch alignment for 2D image buffers          64 pixels
    Max 2D image size                             65536x65536 pixels
    Max 3D image size                             65536x65536x65536 pixels
    Max number of read image args                 128
    Max number of write image args                64
    Max number of read/write image args           64
  Max number of pipe args                         16
  Max active pipe reservations                    1
  Max pipe packet size                            1024
  Local memory type                               Global
  Local memory size                               32768 (32KiB)
  Max number of constant args                     8
  Max constant buffer size                        65536 (64KiB)
  Max size of kernel argument                     1024
  Queue properties (on host)
    Out-of-order execution                        Yes
    Profiling                                     Yes
  Queue properties (on device)
    Out-of-order execution                        Yes
    Profiling                                     Yes
    Preferred size                                2097152 (2MiB)
    Max size                                      16777216 (16MiB)
  Max queues on device                            1
  Max events on device                            1024
  Prefer user sync for interop                    No
  Profiling timer resolution                      1000ns
  Execution capabilities
    Run OpenCL kernels                            Yes
    Run native kernels                            No
  printf() buffer size                            1048576 (1024KiB)
  Built-in kernels
  Device Extensions                               cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp16 cl_khr_icd cl_khr_egl_image cl_khr_image2d_from_buffer cl_khr_depth_images cl_arm_core_id cl_arm_printf cl_arm_thread_limit_hint cl_arm_non_uniform_work_group_size cl_arm_import_memory cl_arm_shared_virtual_memory

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  ARM Platform
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   Success [ARM]
  clCreateContext(NULL, ...) [default]            Success [ARM]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  Success (1)
    Platform Name                                 ARM Platform
    Device Name                                   Mali-G72
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  Success (1)
    Platform Name                                 ARM Platform
    Device Name                                   Mali-G72
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (1)
    Platform Name                                 ARM Platform
    Device Name                                   Mali-G72

ICD loader properties
  ICD loader Name                                 OpenCL ICD Loader
  ICD loader Vendor                               OCL Icd free software
  ICD loader Version                              2.2.11
  ICD loader Profile                              OpenCL 2.1

Never mind my bad …

Thank you for letting me know. I have not seen that happen.
If you are going to flash the ubuntu_bionic.hikey970.V-2.0.sparse.img there is nothing to worry about.

The image has the kernel installed prior to creating the sparse image. So everything that needs to be in the rootfs / system image is there.

The Kernel package is zipped to that file.
That kernel package is made in case you are just building the kernel by itself.
That way you can copy the gz to the hikey970 and install the kernel.


NATIVE KERNEL COMPILING

I will also mention that after you have flashed the image and run /etc/init.sh,
everything needed to compile the kernel natively on the hikey970 is installed.

So on the hikey970 you can clone the kernel tree and build it on the board.
git clone https://github.com/Bigcountry907/linux.git -b hikey970-v4.9-Debain-Working

Then, from the top of the cloned tree, run
export ARCH=arm64
make ARCH=arm64 hikey970_defconfig
make ARCH=arm64 -j8
make ARCH=arm64 modules_install
make ARCH=arm64 install

After that the kernel is installed on the board.
But since grub uses different names for Image.gz and the DTB, please copy Image.gz and the DTB to the /boot/ directory. Run these commands from the top of the kernel tree.

The DTB has to be named kirin970-hikey970.dtb.
It is located in arch/arm64/boot/dts/hisilicon/ in the kernel tree.
cp -f arch/arm64/boot/dts/hisilicon/kirin970-hikey970.dtb /boot/

The kernel Image.gz has to be named Image-hikey970-v4.9.gz.
It is located in arch/arm64/boot/ in the kernel tree.
cp -f arch/arm64/boot/Image.gz /boot/
mv /boot/Image.gz /boot/Image-hikey970-v4.9.gz
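
The copy-and-rename steps above could be wrapped in a small function so a missing build output is caught before anything in /boot/ is touched. This is just a sketch under assumptions: install_boot_files is a hypothetical name, and the first argument is wherever you cloned the kernel tree. The file names are the ones given in this post.

```shell
#!/bin/sh
# Sketch: copy the built kernel image and DTB into a boot directory under
# the names grub expects on this image. install_boot_files is a
# hypothetical helper name.
install_boot_files() {
    tree=$1             # top of the kernel source tree
    boot=${2:-/boot}    # destination directory, /boot by default
    image=$tree/arch/arm64/boot/Image.gz
    dtb=$tree/arch/arm64/boot/dts/hisilicon/kirin970-hikey970.dtb

    # Refuse to copy anything unless both build outputs exist.
    for f in "$image" "$dtb"; do
        if [ ! -f "$f" ]; then
            echo "missing build output: $f" >&2
            return 1
        fi
    done

    cp -f "$image" "$boot/Image-hikey970-v4.9.gz" || return 1
    cp -f "$dtb" "$boot/kirin970-hikey970.dtb" || return 1
}
```

Run as root from anywhere, for example: install_boot_files ~/linux /boot (the ~/linux path is only an example location for the clone).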

The directions above apply only when building the kernel on the hikey970 itself.

If anyone has any questions at all feel free to ask.
I will do my best to help.

Thanks… as I said I goofed. I forgot to do ‘sudo -s’ before running ./BB.sh

I’m glad it worked out well for you.
If anyone has any recommendations or would like to see other functions added please let me know.

For anyone building only the kernel, to use with an image other than the rootfs created by the builder: please remember to copy 1920x1080.bin from the ~/Bionic-Builder/binaries/ folder to /lib/firmware/edid on the hikey970 board.

Otherwise the display may not work.
In the future I will build the firmware into the kernel.

Truly awesome tool. Everything went smoothly except for a few self-inflicted injuries. The only problem I have is that my Ethernet connection behaves erratically. Sometimes it comes up at startup and sometimes it doesn’t. If it doesn’t come up during startup, starting it manually does not work either. I have to keep restarting until it comes up. I spent a lot of time online and fiddling with the netplan config file ‘01-dhcp.yaml’, but so far no luck.
Lastly kudos to ‘BigCountry907’ for all the hard work to make this happen.

Thank you very much for your kind words.
I have not experienced the issue with the Ethernet, but I don’t use it a lot.

I can tell you that 01-dhcp.yaml is a very touchy file because of the YAML format.
The spacing has to be 100% correct or you will get errors.
This is a good page to reference for netplan configurations.
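
One spacing mistake that is easy to make and hard to see is a tab used for indentation, which YAML does not allow. As a small sketch, the function below lists any lines whose indentation contains a tab. find_tab_indent is a made-up name, and it only catches this one kind of error, not wrong nesting depth.

```shell
#!/bin/sh
# Sketch: print (with line numbers) every line whose leading whitespace
# contains a tab. find_tab_indent is a hypothetical helper name.
find_tab_indent() {
    tab=$(printf '\t')
    # Matches optional leading spaces followed by a tab. Like grep itself,
    # the function returns 0 when at least one offending line is found.
    grep -n "^ *$tab" "$1"
}
```

For example: find_tab_indent /etc/netplan/01-dhcp.yaml (assuming the file lives in the usual /etc/netplan directory). Running sudo netplan generate afterwards should let netplan itself report any remaining parse errors.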

When I get a chance I will take a look and see if I can reproduce the Ethernet problem.
The Ethernet controller is connected through one lane of the PCIe switch.
https://www.broadcom.com/products/pcie-switches-bridges/pcie-switches/pex8606

Are you running any cards in the Mini PCIe slot or the M.2 Key M slot?
I have seen reports that having a card in both slots has caused problems for at least one person.

As soon as I get an ATX power supply I will be testing some PCIe 1x to 16x risers in the mini and M.2 slots, at which time, if there are any problems with the switch, I should find out pretty quickly.

Thanks… and no, I am not running any cards in the Mini PCIe slot or the M.2 slot. Yes, I was very careful with the config file, as I read it could wreak havoc if not done properly. Meanwhile I will just keep looking for a solution online and trying things.

On the board I have, if I connect the Ethernet cable before booting, it works almost every time. If I boot first and then connect the cable, sometimes it works and sometimes it does not.
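
To tell a cable or jack problem apart from a configuration problem, it can help to ask the kernel whether it sees a physical link at all. The sketch below reads the carrier flag from sysfs. link_state is a hypothetical helper name, eth0 in the example is only an assumed interface name (check ip link for the real one), and the optional second argument exists only so the function can be pointed at a different directory.

```shell
#!/bin/sh
# Sketch: report the physical link state of a network interface using the
# carrier flag in sysfs. link_state is a hypothetical helper name.
link_state() {
    iface=$1
    sysfs=${2:-/sys/class/net}   # overridable sysfs root
    carrier=$sysfs/$iface/carrier

    if [ ! -e "$carrier" ]; then
        echo "no-such-interface"
        return 1
    fi
    # Reading carrier fails while the interface is administratively down.
    if ! value=$(cat "$carrier" 2>/dev/null); then
        echo "interface-down"
        return 0
    fi
    if [ "$value" = "1" ]; then
        echo "link-up"           # cable detected
    else
        echo "no-link"           # interface is up but no cable detected
    fi
}
```

For example: link_state eth0. If it prints no-link with the cable plugged in, the problem is more likely the cable or the jack than the netplan file.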

Okay, after spending a lot of time on this issue I am convinced that it is a hardware issue. My RJ45 jack seems to be the culprit. If I push the cable in hard (even when it is locked in the jack) before startup, it works fine. I tried several different cables with the same result. Anyway, thanks for all the help.

Sorry to hear about your bad luck.
I have a wireless router that acts the same way.

If you really need the Ethernet to work I would recommend adding a card to the Mini PCIe slot.

Other than that, all I can think of is to push the cable in tight to where it works and then hot glue it. At least with hot glue you can always peel the glue off and remove it easily. Maybe the place you purchased the board from will exchange it for you.

Thanks… I will give it a shot.

I thought TensorFlow was built into this image, but I can’t seem to find it. Has anyone else tried it?

OpenCL is included in this image.
You should be able to take the TensorFlow binaries from the LeMaker image and use them. Otherwise, find an arm64 build of TensorFlow or compile TensorFlow from source.

Thanks… how do I install headers for kernel 4.9.78-147560-g37a1403604e8? Or will the 4.15.0-50-generic build do?

The kernel source is located in the ~/Bionic-Builder/Kernel-SRC/linux/ directory.

Usually, after I have Ubuntu up and running on the hikey970, I clone the kernel source tree and build the kernel on the hikey970.

git clone https://github.com/Bigcountry907/linux.git -b hikey970-v4.9-Debain-Working

Then I run

make ARCH=arm64 mrproper
make ARCH=arm64 hikey970_defconfig
make ARCH=arm64 -j8
make ARCH=arm64 install
make ARCH=arm64 modules_install

I have not checked whether the kernel headers are installed by that process, but you can check. If they are not, you have to install headers that match the kernel version, 4.9.78.
Only the 4.9.78 portion of the version needs to match.
You cannot install headers for 4.15.0-50 if you are running the 4.9.78 kernel.
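
As a rough illustration of that matching rule, the sketch below strips everything after the first dash from a release string and compares what is left. base_version and headers_match are made-up helper names, and the comparison assumes the rule stated above, that only the leading x.y.z part matters.

```shell
#!/bin/sh
# Sketch: compare only the leading x.y.z part of two kernel release
# strings. base_version and headers_match are hypothetical helper names.

# Drop everything from the first "-" onward,
# e.g. 4.9.78-147560-g37a1403604e8 becomes 4.9.78.
base_version() {
    echo "${1%%-*}"
}

# Succeeds when the running kernel and the headers share a base version.
headers_match() {
    [ "$(base_version "$1")" = "$(base_version "$2")" ]
}
```

On the board, base_version "$(uname -r)" gives the version the headers need to match, and headers_match "$(uname -r)" 4.15.0-50-generic would fail on the 4.9.78 kernel.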

Here is a tutorial on installing the headers.

Oh wow… thanks for the detailed explanation. I appreciate it. Will give it a try…

Hello, BigCountry907, Is there a URL for ubuntu_bionic.hikey970.V-2.0.sparse.img?