AVX, SSE4.2, and similar instruction set extensions are provided by modern x86 CPUs from Intel and AMD; they speed up vector and matrix computations. Have you wondered which CPU flags (SSE4.1, SSE4.2, AVX, ...) you should pass on your machine when building TensorFlow from source? If so, here is a quick solution for you.
1. Create a bash shell script file ( get_tf_build_cpu_opt.sh ) as below:
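#!/usr/bin/env bash
# Print the bazel --copt flags that match the CPU features of this machine.

# Detect the platform and read the raw CPU feature list (lowercased).
if [ "$(uname)" = "Darwin" ]; then
    # macOS
    raw_cpu_flags=$(sysctl -a | grep machdep.cpu.features | cut -d ":" -f 2 | tr '[:upper:]' '[:lower:]')
elif [ "$(uname)" = "Linux" ]; then
    # GNU/Linux
    raw_cpu_flags=$(grep -m1 flags /proc/cpuinfo | cut -d ":" -f 2 | tr '[:upper:]' '[:lower:]')
else
    echo "Unknown platform: $(uname)" >&2
    exit 1
fi

COPT="--copt=-march=native"
for cpu_feature in $raw_cpu_flags
do
    case "$cpu_feature" in
        # These names are the same as the compiler flag ("aes" maps to -maes).
        "ssse3" | "fma" | "cx16" | "popcnt" | "aes")
            COPT+=" --copt=-m$cpu_feature"
            ;;
        # macOS reports sse4.1/sse4.2/avx1.0; Linux reports sse4_1/sse4_2/avx.
        "sse4.1" | "sse4_1")
            COPT+=" --copt=-msse4.1"
            ;;
        "sse4.2" | "sse4_2")
            COPT+=" --copt=-msse4.2"
            ;;
        "avx1.0" | "avx")
            COPT+=" --copt=-mavx"
            ;;
        *)
            # Ignore features that have no flag mapped here.
            ;;
    esac
done
echo "$COPT"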
2. Execute it:
$ chmod +x get_tf_build_cpu_opt.sh
$ ./get_tf_build_cpu_opt.sh
On my machine, I got:
--copt=-march=native --copt=-mssse3 --copt=-mfma --copt=-mcx16 --copt=-mpopcnt
3. Now you can put these flags into your bazel build command to build TensorFlow from source, for example:
$ bazel build -c opt --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-mfpmath=both --copt=-msse4.2 --config=cuda -k //tensorflow/tools/pip_package:build_pip_package
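Rather than copying the flags by hand, the script output can be spliced into the command line. The sketch below only prints the resulting command so it can be reviewed first; `tf_build_cmd` is a hypothetical helper name, and the fixed flag string stands in for the real output of `./get_tf_build_cpu_opt.sh`.

```shell
# Hypothetical helper: print the bazel command for a given string of --copt flags.
# Nothing is built here; the command is only printed for review.
tf_build_cmd() {
    printf 'bazel build -c opt %s //tensorflow/tools/pip_package:build_pip_package\n' "$1"
}

# Example with a fixed flag string.
# In practice, pass the script output: tf_build_cmd "$(./get_tf_build_cpu_opt.sh)"
tf_build_cmd "--copt=-march=native --copt=-mfma"
```

Once the printed command looks right, paste it into a shell opened at the root of the TensorFlow source tree.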
If you wonder why these CPU flags need to be enabled, check out these links:
https://fast-depth-coding.readthedocs.io/en/latest/tf-speed.html
https://stackoverflow.com/questions/41293077/how-to-compile-tensorflow-with-sse4-2-and-avx-instructions
For instance, a build that also enables --config=mkl:
$ bazel build --config=mkl \
    -c opt \
    --copt=-march=native \
    --copt=-mssse3 \
    --copt=-mcx16 \
    --copt=-msse4.2 \
    --copt=-msse4.1 \
    --copt=-O3 \
    --cxxopt=-D_GLIBCXX_USE_CXX11_ABI=0 \
    //tensorflow/tools/pip_package:build_pip_package
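Since -march=native already tells the compiler to target the local CPU, it can be useful to check which instruction set flags it turns on by itself. A minimal sketch, assuming GCC is the compiler (Clang does not support this query): `gcc -march=native -Q --help=target` lists each target option with its state, and the hypothetical filter `enabled_isa_flags` below keeps only the enabled SSE/SSSE/AVX/FMA entries. The sample input lines are illustrative, not real output from any machine.

```shell
# Hypothetical filter: read `gcc -Q --help=target` output on stdin and
# print the SSE/SSSE/AVX/FMA options that are marked [enabled].
enabled_isa_flags() {
    awk '$2 == "[enabled]" && $1 ~ /^-m(sse|ssse|avx|fma)/ { print $1 }'
}

# Illustrative sample input in the same "option  [state]" layout:
printf '  -mavx  \t\t[enabled]\n  -mavx2  \t\t[disabled]\n  -mfma  \t\t[enabled]\n' | enabled_isa_flags

# On a real machine with GCC installed:
#   gcc -march=native -Q --help=target | enabled_isa_flags
```

Any flag this reports as enabled is already covered by --copt=-march=native, so listing it again in the bazel command is harmless but redundant.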