Thursday, June 21, 2018

[TensorFlow] How to get CPU configuration flags (such as SSE4.1, SSE4.2, and AVX...) in a bash script for building TensorFlow from source

The AVX and SSE4.2 and others are offered by Intel CPU. (AVX and SSE4.2 are CPU infrastructures for faster matrix computations) Did you wonder what CPU configuration flags (such as SSE4.1, SSE4.2, and AVX...) you should use on your machine when building Tensorflow from source? If so, here is a quick solution for you.



1. Create a bash shell script file ( get_tf_build_cpu_opt.sh ) as below:
#!/usr/bin/env bash

# Detect platform
if [ "$(uname)" == "Darwin" ]; then
        # MacOS
        raw_cpu_flags=`sysctl -a | grep machdep.cpu.features | cut -d ":" -f 2 | tr '[:upper:]' '[:lower:]'`
elif [ "$(uname)" == "Linux" ]; then
        # GNU/Linux
        raw_cpu_flags=`grep flags -m1 /proc/cpuinfo | cut -d ":" -f 2 | tr '[:upper:]' '[:lower:]'`
else
        echo "Unknown plaform: $(uname)"
        exit -1
fi

COPT="--copt=-march=native"

for cpu_feature in $raw_cpu_flags
do
        case "$cpu_feature" in
                "sse4.1" | "sse4.2" | "ssse3" | "fma" | "cx16" | "popcnt" | "maes")
                    COPT+=" --copt=-m$cpu_feature"
                ;;
                "avx1.0")
                    COPT+=" --copt=-mavx"
                ;;
                *)
                        # noop
                ;;
        esac
done
echo $COPT

2. Execute it:
$ ./get_tf_build_cpu_opt.sh
==>  In my machine, I got these:
--copt=-march=native --copt=-mssse3 --copt=-mfma --copt=-mcx16 --copt=-mpopcnt

3. Now you can put these string in your bazel build command to build TensorFlow from source such as:
$ bazel build -c opt --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-mfpmath=both --copt=-msse4.2 --config=cuda -k //tensorflow/tools/pip_package:build_pip_package

If you wonder why we need to enable these CPU parameters, please check out this link:
https://fast-depth-coding.readthedocs.io/en/latest/tf-speed.html
https://stackoverflow.com/questions/41293077/how-to-compile-tensorflow-with-sse4-2-and-avx-instructions

For instance:
$ bazel build --config=mkl \
  -c opt \
  --copt=-march=native \
  --copt=-mssse3 \
  --copt=-mcx16 \
  --copt=-msse4.2 \
  --copt=-msse4.1 \
  --copt=-O3 \
  --cxxopt=-D_GLIBCXX_USE_CXX11_ABI=0 \
  //tensorflow/tools/pip_package:build_pip_package


No comments: