Monday, April 22, 2019

[TVM] Deploy TVM Module using C++ API

When the first time to deal with deploying TVM module using C++ API, I found this official web site: Deploy TVM Module using C++ API which only gives a fine example for deploying, but it doesn't explain how to generate the related files and modules.

So, after several times of trial and error, I figure it out to generate the required files for deploying.
Basically you can refer to the TVM tutorials about compiling models:

Sunday, April 14, 2019

[Experiment] Compare the inference performance of TensorFlow Lite and TVM

I compare the inference performance of both TensorFlow Lite and TVM on my laptop with the same MobileNet model and the same input size of 224*224.
They both assign two threads to do the inference task and see the average inferencing time it spent.
(P.S: Giving the same of 10 threads in these 2 cases )

Wednesday, April 10, 2019

[TensorFlow Lite] The performance experiment of TensorFlow Lite

Based on my previous post: [TensorFlow Lite] My first try with TensorFlow Lite ,
I am just curious about the performance of TensorFlow Lite on the different platforms so that I do a performance experiment for it.
There are a X86 server (no GPU enabled) and a Nvidia TX2 board ( which is ARM based ) to do the experiment. They both run TFLite’s official example using Inception_v3 model and get average inference time as the following table.

Hardware Information:
Nvidia TX2: 4 CPU cores and 1 GPU ( but does not use )
X86 server: 24 CPU cores and no GPU enabled

[XLA] Build Tensorflow XLA AOT Shared Library

Build Tensorflow XLA AOT Shared Library:

Add the following code into tensorflow/compiler/aot/BUILD in TensorFlow source:

    name = "",
    deps = [":embedded_protocol_buffers",
    linkopts=["-shared -Wl,--whole-archive" ],


Tuesday, March 26, 2019

[TensorFlow Lite] Build Tensorflow Lite C++ shared library

Build Tensorflow Lite shared library:

Modify the code: tensorflow/contrib/lite/BUILD in TensorFlow source:
    name = "",
    deps = [":framework",
    linkopts=["-shared -Wl,--whole-archive" ],


[TFCompile] Use XLA AOT Compiler to compile Resnet50 model and do inference

I inspired by this following article and try to do something different because it's approach using Keras has an issue for XLA AOT compiler.
Instead, I download the pre-trained Resnet50 model and optimize simply by the tool:transform_graph

bazel-bin/tensorflow/tools/graph_transforms/transform_graph \
--in_graph='/home/liudanny/git/tensorflow_danny/tensorflow/compiler/aot/myaot_resnet50/resnetv2_imagenet_frozen_graph.pb' \
--out_graph='/home/liudanny/workspace/pyutillib/my_resnet50/optimized_frozen_graph.pb' \
--inputs='input_tensor:0' \
--outputs='softmax_tensor:0' \

#I don't enable the follwoing options due to worse performance.
#fold_batch_norms      <== For XLA AOT, it is not good in the performance
#fold_old_batch_norms  <== For XLA AOT, it is not good in the performance
#quantize_weights' <== XLA AOT doesn't support

Wednesday, March 20, 2019

[AutoKeras] My first try with a simple example of AutoKeras

AutoKeras only supports Python 3.6 so that the running environment has to install Python 3.6. My operation system is Ubuntu 16.04 and it needs to add apt repository first.

Install Python 3.6 and AutoKeras ( Don't remove Python 3.5)
# Install pip3
apt-get install python3-pip
# Install Python 3.6
apt-get install software-properties-common
add-apt-repository ppa:jonathonf/python-3.6
apt-get update
apt-get install python3.6

update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.5 1
update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.6 2
update-alternatives --config python3

ln -s /usr/include/python3.5 /usr/include/python3.6m
pip3 install lws
pip3 install autokeras