Could you give me advice on fixing problems building TensorFlow?

tnishinaga · July 29, 2018, 8:24am

Hello.
I ran into several problems while building a machine learning environment (TensorFlow + Jupyter Notebook) in Docker on SynQuacer, so I would like to share them.
Could you give me some advice on fixing these problems?

Failed to install TensorFlow from pip
The “pip install” command is too slow
Failed to build TensorFlow because of the -mfpu option
The local_resources option does not work when building TF with bazel

Here is Dockerfile for reproduction:

https://gist.github.com/tnishinaga/2b09bc8fe48615bcc4fd88df567d2af6

Dockerfile

FROM linaro/base-arm64-ubuntu:xenial

COPY sources.list /etc/apt/sources.list


RUN apt-get update && apt-get install -y  --no-install-recommends \
    build-essential \
    curl \
    libfreetype6-dev \
    libhdf5-dev \

This file has been truncated.

sources.list

deb http://ports.ubuntu.com xenial main restricted universe multiverse
deb-src http://ports.ubuntu.com xenial main restricted universe multiverse

deb http://ports.ubuntu.com xenial-updates main restricted universe multiverse
deb-src http://ports.ubuntu.com xenial-updates main restricted universe multiverse

tensorflow_build_failure.log

root@cbe112747f7d:~# ./buildtf.sh $HOME/tensorflow/
WARNING: Running Bazel server needs to be killed, because the startup options are different.
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
You have bazel 0.15.0- (@non-git) installed.
Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See tools/bazel.rc for more details.
        --config=mkl            # Build with MKL support.
        --config=monolithic     # Config for mostly static monolithic build.
Configuration finished
Starting local Bazel server and connecting to it...
.........................................................

This file has been truncated.

And here are my SynQuacer specs:

RAM	4GB (I couldn’t find a compatible DIMM. Which one should I use?)
HDD	1TB
Host OS	Debian
Kernel	4.14.32.linaro.281-1
Container OS	linaro/base-arm64-ubuntu:xenial

1. Failed to install TensorFlow from pip:

I tried to install TensorFlow with pip, but it failed.

Here is the error log:

root@5e2a3172b85b:~# pip3 install tensorflow
Collecting tensorflow
  Could not find a version that satisfies the requirement tensorflow (from versions: )
No matching distribution found for tensorflow
You are using pip version 8.1.1, however version 18.0 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.

It seems there is no prebuilt aarch64 TensorFlow binary, so I need to build it from source.
However, I would prefer to install it from pip, because building from source is not easy.

2. The pip install command is too slow:

I ran the command below, and it took about 3 hours to finish.

# pip3 --no-cache-dir install Pillow ipykernel  jupyter gast grpcio absl-py protobuf tensorboard scipy

Here is the output of top while the pip install command was running.

The pip install command runs cc1plus on a single core (to build native extensions?).
I think this is the cause of the problem.

Similar problems are discussed on Stack Overflow, but I could not find a good answer.

I want to fix this to shorten the install time on low-power multicore machines (like SynQuacer).
However, I don’t know where to start. Could someone help me solve it?
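
One possible workaround, as a minimal sketch that has not been tested on SynQuacer: each native extension compiles on one core, but independent packages can be built as wheels concurrently with xargs -P and installed afterwards. The names build_wheels, PIP and WHEEL_DIR and the wheelhouse directory are assumptions for illustration. It assumes the wheel package is installed, it does not speed up a single large package, and several compilers running at once need more RAM.

```shell
#!/bin/sh
# Sketch: build one wheel per package, running several builds at once.
# build_wheels, PIP and WHEEL_DIR are hypothetical names used for illustration.
build_wheels() {
    jobs="$1"   # number of concurrent builds
    shift       # remaining arguments are package names
    printf '%s\n' "$@" |
        xargs -n 1 -P "$jobs" "${PIP:-pip3}" wheel --no-deps --wheel-dir "${WHEEL_DIR:-wheelhouse}"
}

# Example usage (not run here):
#   build_wheels 4 Pillow grpcio protobuf
#   pip3 install --find-links wheelhouse Pillow grpcio protobuf
```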

3. Failed to build TensorFlow because of the -mfpu option:

I couldn’t build TensorFlow because TensorFlow’s build system passes the -mfpu=neon option to gcc.

Here is the error log:

 /usr/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -B/usr/bin -B/usr/bin -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 '-D_FORTIFY_SOURCE=1' -DNDEBUG -ffunction-sections -fdata-sections '-std=c++0x' -MD -MF bazel-out/arm-opt/bin/tensorflow/contrib/lite/kernels/internal/_objs/neon_tensor_utils/tensorflow/contrib/lite/kernels/internal/reference/portable_tensor_utils.pic.d '-frandom-seed=bazel-out/arm-opt/bin/tensorflow/contrib/lite/kernels/internal/_objs/neon_tensor_utils/tensorflow/contrib/lite/kernels/internal/reference/portable_tensor_utils.pic.o' -fPIC -iquote . -iquote bazel-out/arm-opt/genfiles -iquote external/bazel_tools -iquote bazel-out/arm-opt/genfiles/external/bazel_tools -iquote external/arm_neon_2_x86_sse -iquote bazel-out/arm-opt/genfiles/external/arm_neon_2_x86_sse -iquote external/gemmlowp -iquote bazel-out/arm-opt/genfiles/external/gemmlowp -funsafe-math-optimizations -ftree-vectorize -fomit-frame-pointer -O3 '-mfpu=neon' -fno-canonical-system-headers -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' -c tensorflow/contrib/lite/kernels/internal/reference/portable_tensor_utils.cc -o bazel-out/arm-opt/bin/tensorflow/contrib/lite/kernels/internal/_objs/neon_tensor_utils/tensorflow/contrib/lite/kernels/internal/reference/portable_tensor_utils.pic.o)
gcc: error: unrecognized command line option '-mfpu=neon'

I created an ad hoc patch to work around this problem, but it is not a fundamental solution.

https://github.com/tnishinaga/tensorflow/commit/8a11597ee0cc5823e3dc57aa7682970e56b51517

I think the cause is that TensorFlow’s build system treats SynQuacer as an ARM32 environment.
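
A quick way to check this suspicion, as a sketch only: probe whether the compiler accepts a flag by compiling a trivial file. The names accepts_flag and CC are illustrative assumptions. On an aarch64 gcc, -mfpu=neon would be expected to be rejected, as in the error log above.

```shell
#!/bin/sh
# Sketch: succeed if the C compiler accepts the given flag, fail otherwise.
# accepts_flag and the CC override are hypothetical names used for illustration.
accepts_flag() {
    flag="$1"
    tmp="$(mktemp -d)" || return 2
    printf 'int main(void) { return 0; }\n' > "$tmp/probe.c"
    # Compile only (-c); discard compiler messages, keep the exit status.
    if "${CC:-gcc}" "$flag" -c "$tmp/probe.c" -o "$tmp/probe.o" >/dev/null 2>&1; then
        status=0
    else
        status=1
    fi
    rm -rf "$tmp"
    return "$status"
}

# Example usage (not run here):
#   uname -m            # architecture reported by the kernel
#   gcc -dumpmachine    # target triplet of the compiler
#   accepts_flag -mfpu=neon && echo accepted || echo rejected
```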

4. The local_resources option does not work when building TF with bazel:

I passed the local_resources option to bazel to limit memory usage, because my SynQuacer has 4GB of RAM. (I’m using bazel 0.50)

Here is the command to build TensorFlow:

# bazel build -c opt \
     --copt="-mcpu=cortex-a53+fp" \
     --verbose_failures tensorflow/tools/pip_package:build_pip_package \
     --local_resources 3072,24.0,1.0

However, the local_resources option does not seem to work.

virtual memory exhausted: Cannot allocate memory
Target //tensorflow/tools/pip_package:build_pip_package failed to build
INFO: Elapsed time: 24530.257s, Critical Path: 23750.57s
INFO: 180 processes: 180 local.
FAILED: Build did NOT complete successfully
root@949cda07f529:~/tensorflow-1.9.0-rc2#

I got the “Cannot allocate memory” error when swap was enabled.
When swap was disabled, the docker process crashed.

I changed the build command to avoid this problem, but it is not a good solution.

bazel build -c opt \
     --copt="-mcpu=cortex-a53+fp" \
     --verbose_failures tensorflow/tools/pip_package:build_pip_package \
     -j 3

I think bazel should adjust this itself.
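
Until then, the job count can be derived by hand instead of guessed. A rough sketch, assuming a per-job memory budget that you choose yourself: the 1024 MB in the usage example is a placeholder, not a measured value for TensorFlow, and pick_jobs is an illustrative name.

```shell
#!/bin/sh
# Sketch: choose a job count from total RAM, a per-job memory budget, and core count.
# pick_jobs TOTAL_MB PER_JOB_MB CORES prints min(CORES, TOTAL_MB / PER_JOB_MB), at least 1.
# pick_jobs is a hypothetical helper name used for illustration.
pick_jobs() {
    total_mb="$1"
    per_job_mb="$2"
    cores="$3"
    n=$(( total_mb / per_job_mb ))
    if [ "$n" -gt "$cores" ]; then n="$cores"; fi
    if [ "$n" -lt 1 ]; then n=1; fi
    printf '%s\n' "$n"
}

# Example usage (not run here): 3072 MB at 1024 MB per job on 24 cores gives -j 3.
#   bazel build -c opt ... -j "$(pick_jobs 3072 1024 24)"
```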

Loic · July 30, 2018, 12:55pm

tnishinaga:

3. Failed to build TensorFlow because of the -mfpu option:

I couldn’t build TensorFlow because TensorFlow’s build system passes the -mfpu=neon option to gcc.

Here is the error log:

 /usr/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -B/usr/bin -B/usr/bin -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 '-D_FORTIFY_SOURCE=1' -DNDEBUG -ffunction-sections -fdata-sections '-std=c++0x' -MD -MF bazel-out/arm-opt/bin/tensorflow/contrib/lite/kernels/internal/_objs/neon_tensor_utils/tensorflow/contrib/lite/kernels/internal/reference/portable_tensor_utils.pic.d '-frandom-seed=bazel-out/arm-opt/bin/tensorflow/contrib/lite/kernels/internal/_objs/neon_tensor_utils/tensorflow/contrib/lite/kernels/internal/reference/portable_tensor_utils.pic.o' -fPIC -iquote . -iquote bazel-out/arm-opt/genfiles -iquote external/bazel_tools -iquote bazel-out/arm-opt/genfiles/external/bazel_tools -iquote external/arm_neon_2_x86_sse -iquote bazel-out/arm-opt/genfiles/external/arm_neon_2_x86_sse -iquote external/gemmlowp -iquote bazel-out/arm-opt/genfiles/external/gemmlowp -funsafe-math-optimizations -ftree-vectorize -fomit-frame-pointer -O3 '-mfpu=neon' -fno-canonical-system-headers -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' -c tensorflow/contrib/lite/kernels/internal/reference/portable_tensor_utils.cc -o bazel-out/arm-opt/bin/tensorflow/contrib/lite/kernels/internal/_objs/neon_tensor_utils/tensorflow/contrib/lite/kernels/internal/reference/portable_tensor_utils.pic.o)
gcc: error: unrecognized command line option '-mfpu=neon'

I created an ad hoc patch to work around this problem, but it is not a fundamental solution.

NEON/ASIMD was optional in ARMv7 and is mandatory in ARMv8 [1]. AFAIK, this is why -mfpu is not a valid option with the aarch64 toolchain.

[1] http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.den0024a/CJHECGIH.html

danielt · July 30, 2018, 1:12pm

I recently built TensorFlow, Keras and Jupyter for Developerbox and experienced pretty much the same set of problems you did.

I was not as heavily impacted by bazel’s memory usage because I have 8GB installed in my Developerbox. Having said that, with the default bazel arguments even an 8GB system was still swapping heavily. I chose to use --ram_utilization_factor to limit the parallelism and avoid swapping.

tnishinaga · September 9, 2018, 10:43pm

I chose to use --ram_utilization_factor to limit the parallelism and avoid swapping.

I tried --ram_utilization_factor 50, but the result was the same as with --local_resources 3072,24.0,1.0.

danielt · September 10, 2018, 9:25am

According to the documentation, the RAM usage estimation is extremely
crude (i.e. known to be inaccurate), so both --ram_utilization_factor and
--local_resources are only tuning hints (they don’t hard limit the amount
of RAM used).

Given that you have less RAM than me, you should try a more aggressive
value for one or both of these options (-j3 is very aggressive on a
24 core system).

Maybe even try something extreme, such as setting the RAM utilization to
10 and leaving it running overnight with some logging to help detect thrashing.
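
For the logging part, a minimal sketch that samples /proc/meminfo: MemAvailable staying near zero while SwapFree keeps falling would suggest thrashing. The name mem_sample, the mem.log file and the 60 second interval are arbitrary choices for illustration.

```shell
#!/bin/sh
# Sketch: print "MemAvailable_kB SwapFree_kB" from a meminfo-style file.
# mem_sample is a hypothetical helper; it reads /proc/meminfo unless a file is given.
mem_sample() {
    awk '/^MemAvailable:/ { avail = $2 }
         /^SwapFree:/     { swap = $2 }
         END              { print avail, swap }' "${1:-/proc/meminfo}"
}

# Example usage (not run here): start the logger in the background before bazel.
#   while sleep 60; do echo "$(date +%s) $(mem_sample)"; done >> mem.log &
```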