Danny's tech notebook | 丹尼技術手札

Wednesday, March 27, 2019

[TensorFlow Lite] Build Tensorflow Lite C++ shared library

Build Tensorflow Lite shared library:

Modify the code: tensorflow/contrib/lite/BUILD in TensorFlow source:

cc_binary(
    name = "libtflite.so",
    deps = [":framework",
        "//tensorflow/contrib/lite/kernels:builtin_ops",
        "//tensorflow/contrib/lite/kernels:eigen_support",
        "//tensorflow/contrib/lite/kernels:gemm_support",
    ],
    linkopts=["-shared -Wl,--whole-archive" ],
    linkshared=1

)

[TFCompile] Use XLA AOT Compiler to compile Resnet50 model and do inference

I inspired by this following article and try to do something different because it's approach using Keras has an issue for XLA AOT compiler.
Kerasモデルをtfcompileでコンパイルする
Instead, I download the pre-trained Resnet50 model and optimize simply by the tool:transform_graph

Download:

wget http://download.tensorflow.org/models/official/20181001_resnet/savedmodels/resnet_v2_fp32_savedmodel_NHWC.tar.gz
or
wget http://download.tensorflow.org/models/official/20181001_resnet/savedmodels/resnet_v2_fp32_savedmodel_NHWC_jpg.tar.gz

Transform:

bazel-bin/tensorflow/tools/graph_transforms/transform_graph \
--in_graph='/home/liudanny/git/tensorflow_danny/tensorflow/compiler/aot/myaot_resnet50/resnetv2_imagenet_frozen_graph.pb' \
--out_graph='/home/liudanny/workspace/pyutillib/my_resnet50/optimized_frozen_graph.pb' \
--inputs='input_tensor:0' \
--outputs='softmax_tensor:0' \
--transforms='
  strip_unused_nodes
  fold_constants'

#I don't enable the follwoing options due to worse performance.
#fold_batch_norms      <== For XLA AOT, it is not good in the performance
#fold_old_batch_norms  <== For XLA AOT, it is not good in the performance
#quantize_weights' <== XLA AOT doesn't support

[AutoKeras] My first try with a simple example of AutoKeras

AutoKeras only supports Python 3.6 so that the running environment has to install Python 3.6. My operation system is Ubuntu 16.04 and it needs to add apt repository first.

Install Python 3.6 and AutoKeras ( Don't remove Python 3.5)

# Install pip3
apt-get install python3-pip
# Install Python 3.6
apt-get install software-properties-common
add-apt-repository ppa:jonathonf/python-3.6
apt-get update
apt-get install python3.6

update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.5 1
update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.6 2
update-alternatives --config python3

ln -s /usr/include/python3.5 /usr/include/python3.6m
pip3 install lws
pip3 install autokeras

[TensorFlow] Build TensorFlow v1.12 from the source on Ubuntu 16.04

My previous post: [TensorFlow] How to build your C++ program or application with TensorFlow library using CMake

It is for building TensorFlow from the source based on v1.10. Currently, I want to upgrade it to v1.12 and encounter some problems.

First, my version of ProtoBuf library on my system is v3.6.1 so that we should align its version in the TensorFlow.
Second, it seems that there are a few issues when building TensorFlow v1.12 that we need to deal with it case by case.

[TensorFlow Lite] My first try with TensorFlow Lite

I just take my first try with the example: label_image (tensorflow/contrib/lite/examples/label_image) in TensorFlow Lite and write down the commands that I used.
There are a bunch of information from the offical TensorFlow Lite guide:
https://www.tensorflow.org/lite/guide

1. convert the example of model to tflite format

[Tool] Convert TensorFlow graph to UFF format

The previous post: How to use TensorRT to do inference with TensorFlow model ? has introduced away to do the converting job for UFF format model. But, basically there are 2 ways to do that:

1. Convert TensorFlow's Session GraphDef directly on the fly to UFF format model
==> convert_uff_from_tensorflow()
2. Convert the frozen model file to UFF format model
==> convert_uff_from_frozen_model()

The following code is about the functions to convert TensorFlow graph to UFF format for running with TensorRT.

[OpenCV] Build OpenCV 3.4.4 on TX2

For the reason that I was curious about the performance of OpenCV on TX2 using GPU, I installed OpenCV 3.4.4 (this version and after will integrate with the inference engine) on my TX2 based on the following links.
https://jkjung-avt.github.io/opencv3-on-tx2/
https://www.learnopencv.com/install-opencv-3-4-4-on-ubuntu-16-04/

[Inspecting Graphs] Use TensorFlow's summarize_graph tool to find the input and output node names in the frozen model/graph

When trying to do inferencing using a frozen model from downloading or freezing by yourself, we may encounter a problem about what the input and output node names are in this model? If we cannot figure them out, it is impossible for you to do inferencing correctly. Here is an easy way to get the possible ones: using the tool: "summarize_graph"

bazel build tensorflow/tools/graph_transforms:summarize_graph
bazel-bin/tensorflow/tools/graph_transforms/summarize_graph --in_graph=/danny/tmp/faster_rcnn_resnet101_coco_2018_01_28/frozen_inference_graph.pb

Found 1 possible inputs: (name=image_tensor, type=uint8(4), shape=[?,?,?,3])
No variables spotted.
Found 4 possible outputs: (name=detection_boxes, op=Identity) (name=detection_scores, op=Identity) (name=num_detections, op=Identity) (name=detection_classes, op=Identity)
Found 48132698 (48.13M) const parameters, 0 (0) variable parameters, and 4163 control_edges
Op types used: 4688 Const, 885 StridedSlice, 559 Gather, 485 Mul, 472 Sub, 462 Minimum, 369 Maximum, 304 Reshape, 276 Split, 205 RealDiv, 204 Pack, 202 ConcatV2, 201 Cast, 188 Greater, 183 Where, 149 Shape, 145 Add, 109 BiasAdd, 107 Conv2D, 106 Slice, 100 Relu, 99 Unpack, 97 Squeeze, 94 ZerosLike, 91 NonMaxSuppressionV2, 55 Enter, 46 Identity, 45 Switch, 27 Range, 24 Merge, 22 TensorArrayV3, 17 ExpandDims, 15 NextIteration, 12 TensorArrayScatterV3, 12 TensorArrayReadV3, 10 TensorArrayWriteV3, 10 Exit, 10 Tile, 10 TensorArrayGatherV3, 10 TensorArraySizeV3, 6 Transpose, 6 Fill, 6 Assert, 5 Less, 5 LoopCond, 5 Equal, 4 Round, 4 Exp, 4 MaxPool, 3 Pad, 2 Softmax, 2 Size, 2 GreaterEqual, 2 TopKV2, 2 MatMul, 1 All, 1 CropAndResize, 1 ResizeBilinear, 1 Relu6, 1 Placeholder, 1 LogicalAnd, 1 Max, 1 Mean
To use with tensorflow/tools/benchmark:benchmark_model try these arguments:
bazel run tensorflow/tools/benchmark:benchmark_model -- --graph=/danny/tmp/faster_rcnn_resnet101_coco_2018_01_28/frozen_inference_graph.pb --show_flops --input_layer=image_tensor --input_layer_type=uint8 --input_layer_shape=-1,-1,-1,3 --output_layer=detection_boxes,detection_scores,num_detections,detection_classes

For more information, please refer to this:
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/graph_transforms#inspecting-graphs

Danny's tech notebook | 丹尼技術手札

Wednesday, March 27, 2019

[TensorFlow Lite] Build Tensorflow Lite C++ shared library

[TFCompile] Use XLA AOT Compiler to compile Resnet50 model and do inference

Thursday, March 21, 2019

[AutoKeras] My first try with a simple example of AutoKeras

Friday, March 15, 2019

[TensorFlow] Build TensorFlow v1.12 from the source on Ubuntu 16.04

Monday, March 11, 2019

[TensorFlow Lite] My first try with TensorFlow Lite

Wednesday, March 6, 2019

[Tool] Convert TensorFlow graph to UFF format

Tuesday, March 5, 2019

[OpenCV] Build OpenCV 3.4.4 on TX2

Monday, February 25, 2019

[Inspecting Graphs] Use TensorFlow's summarize_graph tool to find the input and output node names in the frozen model/graph