Danny's tech notebook | 丹尼技術手札: 2019

Thursday, October 24, 2019

[Spring Boot] My first Spring Boot Project for demo

Recently, the term "Microservices" is very hot in the cloud service/architecture and it gets my attention. So, I want to learn more about what it is. Furthermore, I also notice Spring Cloud is a very popular microservice framework based on Spring Boot and it provides tools for developers to quickly build some of the common patterns in distributed systems (e.g. configuration management, service discovery, circuit breakers, intelligent routing, micro-proxy, control bus, one-time tokens, global locks, leadership election, distributed sessions, cluster state)

[Dockerfile] Some of the little skills used in Dockerfile

I collect some of the little skills used in my Dockerfiles and also keep in the record for my reference.

[IJava] A Jupyter kernel for executing Java code

As we know that Jupyter’s is a Python-based solution, and it provides a lot of convenience for interactivity with Python code. But, I just wonder if Jupyter can use Java as its kernel, and it can. I find that it allows to integrate additional "kernels" with Java capabilities and someone has done it -IJava: https://github.com/SpencerPark/IJava

[Kubernetes] The simple introduction of VPA

VPA stands for Vertical Pod Autoscaling, which frees you from having to think about what values to specify for a container's CPU and memory requests. The autoscaler can recommend values for CPU and memory requests, or it can automatically update values for CPU and memory requests.

Before using VPA, we need to install Metrics Server first as follows:

[Kubernetes] Call Kubernetes APIs

If you want to use Bash Shell to call Kubernetes APIs, Here is the way to get the token and use it as follows:

Prepare variables

#APISERVER="https://10.10.0.100:6443"
$ APISERVER=$(kubectl config view | grep server | cut -f 2- -d ":" | tr -d " ")
$ TOKEN=$(kubectl describe secret $(kubectl get secrets | grep default | cut -f1 -d ' ') | \
    grep -E '^token' | cut -f2 -d':' | tr -d '\t')

Call APIs

$ curl $APISERVER/api --header "Authorization: Bearer ${TOKEN//[[:space:]]/}" --insecure
{
  "kind": "APIVersions",
  "versions": [
    "v1"
  ],
  "serverAddressByClientCIDRs": [
    {
      "clientCIDR": "0.0.0.0/0",
      "serverAddress": "192.168.0.100:6443"
    }
  ]
}

[Docker] Troubleshooting to docker private registry

I create a private docker registry as follows and sometimes it cannot reply the http request.

$ sudo docker run -p 7443:7443 --restart=always \
  -v /raid/registry2:/var/lib/registry \
  -e REGISTRY_HTTP_ADDR=0.0.0.0:7443 \
  --name registry registry:2

$ curl -v http://192.168.10.10:7443/v2/_catalog
*   Trying 192.168.10.10...
* TCP_NODELAY set
* Connected to 192.168.10.10 (192.168.10.10) port 7443 (#0)
> GET /v2/_catalog HTTP/1.1
> Host: 192.168.0.109:7443
> User-Agent: curl/7.58.0
> Accept: */*
...(hang)...

[Kubernetes] Add new DNS server in CoreDNS Configuration

After finished the K8S installation, I had encountered an issue that the pod cannot resolve the domain name. I use these commands to check the DNS issue.

$ systemd-resolve --status
nameservers:
                  addresses:
                  - 210.240.232.1
                  search: []

# use busybox pod to run nslookup
$ kubectl apply -f https://k8s.io/examples/admin/dns/busybox.yaml
$ kubectl exec -it busybox -- nslookup kubernetes.default
$ kubectl exec -it busybox -- nslookup google.com

# check local DNS configuration
$ kubectl exec busybox cat /etc/resolv.conf

[Qt] How to install Qt for WebAssembly

In the previous post: [Qt] The Qt features of WebAssembly and Qt quick WebGLStreaming, I mentioned about Qt for WebAssembly but didn't try it yet. Here, this post is to introduce how to install Qt for WebAssembly and give an example to try by the following steps:

[Kubernetes] How to install Kubernetes Dashboard and without invalid certification

When I follow the instructions from the official site: https://github.com/kubernetes/dashboard to install Kubernetes Dashboard, I encounter the problem that I cannot access the dashboard via my browser because the certificate is invalid. After figuring it out, Here is my approach to resolving it.

[DDS] Install OpenSplice DCPS Python API on Raspberry Pi 3

Before starting to introduce how to install OpenSplice DCPS Python API on Raspberry Pi 3, we can take a look at the guide of OpenSplice DCPS Python API as list:

http://download.prismtech.com/docs/Vortex/html/ospl/PythonDCPSAPIGuide/index.html

[Kubernetes] The example of commands for commonly checking kubernetes status and troubleshooting

This post is about the example of commands for checking kubernetes status and troubleshooting. The purpose is for the reference by myself.

[Python] The issues of converting Python2 to Python3

If you are working on converting Python2 source code to Python3, there are some issues you will encounter sooner or later. I arrange my part here and will continue to update it as follows:

[Qt] The Qt features of WebAssembly and Qt quick WebGLStreaming

I recently noticed that there are 2 major Qt features: Qt WebAssembly and Qt quick WebGL which are very powerful and useful. For the Qt WebAccembly, any your C++ Qt applications could be executed on a browser. For Qt quick WebGL, it is optimized for Qt Quick and allows you to run remote Qt Quick applications in a browser.

Qt WebAssembly

Qt for WebAssembly

https://doc.qt.io/qt-5/wasm.html

Get started with Qt for WebAssembly

https://blog.qt.io/blog/2018/11/19/getting-started-qt-webassembly/

Qt WebAssembly 內容介紹
https://qtdream.com/topic/1081/qt-webassembly-%E5%86%85%E5%AE%B9%E4%BB%8B%E7%BB%8D/2

Here is a presentation about Qt for WebAccembly.

QtWS18 – Qt for WebAssembly by Morten Sørvig, The Qt Company

Qt WebAssembly Examples
https://github.com/msorvig/qt-webassembly-examples

Qt For WebAssembly Examples
(source code: https://github.com/msorvig/qt-webassembly-examples)
SensorTagDemo
colordebugger
gui_lifecycle
gui_localfiles
gui_opengl
gui_raster
mqtt_simpleclient
quick_clocks
quick_controls2_gallery
quick_controls2_testbench
quick_hellosquare
slate
widget_wiggly
widgets_wiggly

https://www.jianshu.com/p/bed6c2963435

Qt quick WebGL

Here is a quick introduction to Qt quick WebGL

https://blog.qt.io/blog/2018/11/23/qt-quick-webgl-release-512/

The introduction of Qt Quick WebGL Streaming

https://blog.qt.io/blog/2017/02/22/qt-quick-webgl-streaming/

WebGL streaming in a Raspberry PI Zero W

https://blog.qt.io/blog/2017/03/24/webgl-streaming-raspberry-pi-zero-w/

[Shared Library] How to check the used shared libraries in your program

This post is only to show the command about checking the used shared libraries in your program.

[DDS] Install OpenSplice Community Version

What is DDS? The Data Distribution Service (DDS™) is a middleware protocol and API standard for data-centric connectivity from the Object Management Group® (OMG®). It integrates the components of a system together, providing low-latency data connectivity, extreme reliability, and a scalable architecture that business and mission-critical Internet of Things (IoT) applications need. For more information, check it out: https://www.dds-foundation.org/what-is-dds-3/
https://zhuanlan.zhihu.com/p/32278571

In this post, I install OpenSplice as my DDS runtime environment and library. You can download it from here: https://www.adlinktech.com/en/dds-community-software-evaluation.aspx

[TensorFlow] Build TensorFlow from source with Intel MKL enabled

Based on this document "Intel® Math Kernel Library for Deep Learning Networks: Part 1–Overview and Installation", I give a quick summary to do it.
Here are the steps for building TensorFlow from source with Intel MKL enabled.

For MKL DNN:

# Install Intel MKL & MKL-DNN
$ git clone https://github.com/01org/mkl-dnn.git
$ cd mkl-dnn
$ cd scripts && ./prepare_mkl.sh && cd ..
$ mkdir -p build && cd build && cmake .. && make -j$(nproc)
$ make test
$ sudo make install

For building TensorFlow:

[Squid] Setup Linux Proxy Server and Client

The following steps are about installing and setup Linux proxy server (Squid)

#install squid proxy
$ sudo apt-get install squid

#modify squid.conf
$ sudo vi /etc/squid/squid.conf
  ==> #add your local network segment
  acl mynetwork src 192.168.0.0/24
  ==> #allow "mynetwork" be accessing via http
  http_access allow mynetwork

#restart squid...OK
$ sudo service squid restart

[Docker] Using GUI with Docker

Recently I need to run my GUI application with Docker and it can show up either directly on my desktop operating system or in my ssh terminal client via X11.

Basically, there are some people who already provide the solution for the cases. I just list the reference and quickly give my examples.

[TVM] Deploy TVM Module using C++ API

When the first time to deal with deploying TVM module using C++ API, I found this official web site: Deploy TVM Module using C++ API which only gives a fine example for deploying, but it doesn't explain how to generate the related files and modules.

So, after several times of trial and error, I figure it out to generate the required files for deploying.
Basically you can refer to the TVM tutorials about compiling models:
https://docs.tvm.ai/tutorials/index.html

[Experiment] Compare the inference performance of TensorFlow Lite and TVM

I compare the inference performance of both TensorFlow Lite and TVM on my laptop with the same MobileNet model and the same input size of 224*224.
They both assign two threads to do the inference task and see the average inferencing time it spent.
(P.S: Giving the same of 10 threads in these 2 cases )

[TensorFlow Lite] The performance experiment of TensorFlow Lite

Based on my previous post: [TensorFlow Lite] My first try with TensorFlow Lite ,
I am just curious about the performance of TensorFlow Lite on the different platforms so that I do a performance experiment for it.
There are a X86 server (no GPU enabled) and a Nvidia TX2 board ( which is ARM based ) to do the experiment. They both run TFLite’s official example using Inception_v3 model and get average inference time as the following table.

Hardware Information:
Nvidia TX2: 4 CPU cores and 1 GPU ( but does not use )
X86 server: 24 CPU cores and no GPU enabled

[XLA] Build Tensorflow XLA AOT Shared Library

Build Tensorflow XLA AOT Shared Library:

Add the following code into tensorflow/compiler/aot/BUILD in TensorFlow source:

cc_binary(
    name = "libxlaaot.so",
    deps = [":embedded_protocol_buffers",
        "//tensorflow/compiler/tf2xla:xla_jit_compiled_cpu_function",
        "//tensorflow/compiler/tf2xla",
        "//tensorflow/compiler/tf2xla:cpu_function_runtime",
        "//tensorflow/compiler/tf2xla:tf2xla_proto",
        "//tensorflow/compiler/tf2xla:tf2xla_util",
        "//tensorflow/compiler/tf2xla:xla_compiler",
        "//tensorflow/compiler/tf2xla/kernels:xla_cpu_only_ops",
        "//tensorflow/compiler/tf2xla/kernels:xla_dummy_ops",
        "//tensorflow/compiler/tf2xla/kernels:xla_ops",
        "//tensorflow/compiler/xla:shape_util",
        "//tensorflow/compiler/xla:statusor",
        "//tensorflow/compiler/xla:util",
        "//tensorflow/compiler/xla:xla_data_proto",
        "//tensorflow/compiler/xla/client:client_library",
        "//tensorflow/compiler/xla/client:compile_only_client",
        "//tensorflow/compiler/xla/client:xla_computation",
        "//tensorflow/compiler/xla/service:compiler",
        "//tensorflow/compiler/xla/service/cpu:buffer_info_util",
        "//tensorflow/compiler/xla/service/cpu:cpu_compiler",
        "//tensorflow/core:core_cpu_internal",
        "//tensorflow/core:framework_internal",
        "//tensorflow/core:lib",
        "//tensorflow/core:lib_internal",
        "//tensorflow/core:protos_all_cc",
        ":tfcompile_lib",
        "//tensorflow/compiler/xla/legacy_flags:debug_options_flags",
        "//tensorflow/core:core_cpu",
        "//tensorflow/core:framework",
        "//tensorflow/core:graph",
        "@com_google_absl//absl/memory",
        "@com_google_absl//absl/strings",
        "@com_google_absl//absl/types:span",
    ],
    linkopts=["-shared -Wl,--whole-archive" ],
    linkshared=1

)

[TensorFlow Lite] Build Tensorflow Lite C++ shared library

Build Tensorflow Lite shared library:

Modify the code: tensorflow/contrib/lite/BUILD in TensorFlow source:

cc_binary(
    name = "libtflite.so",
    deps = [":framework",
        "//tensorflow/contrib/lite/kernels:builtin_ops",
        "//tensorflow/contrib/lite/kernels:eigen_support",
        "//tensorflow/contrib/lite/kernels:gemm_support",
    ],
    linkopts=["-shared -Wl,--whole-archive" ],
    linkshared=1

)

[TFCompile] Use XLA AOT Compiler to compile Resnet50 model and do inference

I inspired by this following article and try to do something different because it's approach using Keras has an issue for XLA AOT compiler.
Kerasモデルをtfcompileでコンパイルする
Instead, I download the pre-trained Resnet50 model and optimize simply by the tool:transform_graph

Download:

wget http://download.tensorflow.org/models/official/20181001_resnet/savedmodels/resnet_v2_fp32_savedmodel_NHWC.tar.gz
or
wget http://download.tensorflow.org/models/official/20181001_resnet/savedmodels/resnet_v2_fp32_savedmodel_NHWC_jpg.tar.gz

Transform:

bazel-bin/tensorflow/tools/graph_transforms/transform_graph \
--in_graph='/home/liudanny/git/tensorflow_danny/tensorflow/compiler/aot/myaot_resnet50/resnetv2_imagenet_frozen_graph.pb' \
--out_graph='/home/liudanny/workspace/pyutillib/my_resnet50/optimized_frozen_graph.pb' \
--inputs='input_tensor:0' \
--outputs='softmax_tensor:0' \
--transforms='
  strip_unused_nodes
  fold_constants'

#I don't enable the follwoing options due to worse performance.
#fold_batch_norms      <== For XLA AOT, it is not good in the performance
#fold_old_batch_norms  <== For XLA AOT, it is not good in the performance
#quantize_weights' <== XLA AOT doesn't support

[AutoKeras] My first try with a simple example of AutoKeras

AutoKeras only supports Python 3.6 so that the running environment has to install Python 3.6. My operation system is Ubuntu 16.04 and it needs to add apt repository first.

Install Python 3.6 and AutoKeras ( Don't remove Python 3.5)

# Install pip3
apt-get install python3-pip
# Install Python 3.6
apt-get install software-properties-common
add-apt-repository ppa:jonathonf/python-3.6
apt-get update
apt-get install python3.6

update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.5 1
update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.6 2
update-alternatives --config python3

ln -s /usr/include/python3.5 /usr/include/python3.6m
pip3 install lws
pip3 install autokeras

[TensorFlow] Build TensorFlow v1.12 from the source on Ubuntu 16.04

My previous post: [TensorFlow] How to build your C++ program or application with TensorFlow library using CMake

It is for building TensorFlow from the source based on v1.10. Currently, I want to upgrade it to v1.12 and encounter some problems.

First, my version of ProtoBuf library on my system is v3.6.1 so that we should align its version in the TensorFlow.
Second, it seems that there are a few issues when building TensorFlow v1.12 that we need to deal with it case by case.

[TensorFlow Lite] My first try with TensorFlow Lite

I just take my first try with the example: label_image (tensorflow/contrib/lite/examples/label_image) in TensorFlow Lite and write down the commands that I used.
There are a bunch of information from the offical TensorFlow Lite guide:
https://www.tensorflow.org/lite/guide

1. convert the example of model to tflite format

[Tool] Convert TensorFlow graph to UFF format

The previous post: How to use TensorRT to do inference with TensorFlow model ? has introduced away to do the converting job for UFF format model. But, basically there are 2 ways to do that:

1. Convert TensorFlow's Session GraphDef directly on the fly to UFF format model
==> convert_uff_from_tensorflow()
2. Convert the frozen model file to UFF format model
==> convert_uff_from_frozen_model()

The following code is about the functions to convert TensorFlow graph to UFF format for running with TensorRT.

[OpenCV] Build OpenCV 3.4.4 on TX2

For the reason that I was curious about the performance of OpenCV on TX2 using GPU, I installed OpenCV 3.4.4 (this version and after will integrate with the inference engine) on my TX2 based on the following links.
https://jkjung-avt.github.io/opencv3-on-tx2/
https://www.learnopencv.com/install-opencv-3-4-4-on-ubuntu-16-04/

[Inspecting Graphs] Use TensorFlow's summarize_graph tool to find the input and output node names in the frozen model/graph

When trying to do inferencing using a frozen model from downloading or freezing by yourself, we may encounter a problem about what the input and output node names are in this model? If we cannot figure them out, it is impossible for you to do inferencing correctly. Here is an easy way to get the possible ones: using the tool: "summarize_graph"

bazel build tensorflow/tools/graph_transforms:summarize_graph
bazel-bin/tensorflow/tools/graph_transforms/summarize_graph --in_graph=/danny/tmp/faster_rcnn_resnet101_coco_2018_01_28/frozen_inference_graph.pb

Found 1 possible inputs: (name=image_tensor, type=uint8(4), shape=[?,?,?,3])
No variables spotted.
Found 4 possible outputs: (name=detection_boxes, op=Identity) (name=detection_scores, op=Identity) (name=num_detections, op=Identity) (name=detection_classes, op=Identity)
Found 48132698 (48.13M) const parameters, 0 (0) variable parameters, and 4163 control_edges
Op types used: 4688 Const, 885 StridedSlice, 559 Gather, 485 Mul, 472 Sub, 462 Minimum, 369 Maximum, 304 Reshape, 276 Split, 205 RealDiv, 204 Pack, 202 ConcatV2, 201 Cast, 188 Greater, 183 Where, 149 Shape, 145 Add, 109 BiasAdd, 107 Conv2D, 106 Slice, 100 Relu, 99 Unpack, 97 Squeeze, 94 ZerosLike, 91 NonMaxSuppressionV2, 55 Enter, 46 Identity, 45 Switch, 27 Range, 24 Merge, 22 TensorArrayV3, 17 ExpandDims, 15 NextIteration, 12 TensorArrayScatterV3, 12 TensorArrayReadV3, 10 TensorArrayWriteV3, 10 Exit, 10 Tile, 10 TensorArrayGatherV3, 10 TensorArraySizeV3, 6 Transpose, 6 Fill, 6 Assert, 5 Less, 5 LoopCond, 5 Equal, 4 Round, 4 Exp, 4 MaxPool, 3 Pad, 2 Softmax, 2 Size, 2 GreaterEqual, 2 TopKV2, 2 MatMul, 1 All, 1 CropAndResize, 1 ResizeBilinear, 1 Relu6, 1 Placeholder, 1 LogicalAnd, 1 Max, 1 Mean
To use with tensorflow/tools/benchmark:benchmark_model try these arguments:
bazel run tensorflow/tools/benchmark:benchmark_model -- --graph=/danny/tmp/faster_rcnn_resnet101_coco_2018_01_28/frozen_inference_graph.pb --show_flops --input_layer=image_tensor --input_layer_type=uint8 --input_layer_shape=-1,-1,-1,3 --output_layer=detection_boxes,detection_scores,num_detections,detection_classes

For more information, please refer to this:
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/graph_transforms#inspecting-graphs

Wednesday, January 30, 2019

[TFRecord] The easy way to verify your TFRecord file

There is a common situation when you build your TFRecord file ( your dataset ) and want to verify the correctness of the data in it. How to do it? I assume you don't have the problem to build your TFRecord file. So, the easy way to verify your TFRecord file is to use the API: tf.python_io.tf_record_iterator().

[TensorFlow] How to use Distribution Strategy in TensorFlow?

Learned from What’s coming in TensorFlow 2.0, TensorFlow 2.0 is coming soon and there are several features which are ready to use already, for instance, Distribution Strategy. Quoted from the article,
"For large ML training tasks, the Distribution Strategy API makes it easy to distribute and train models on different hardware configurations without changing the model definition. Since TensorFlow provides support for a range of hardware accelerators like CPUs, GPUs, and TPUs, you can enable training workloads to be distributed to single-node/multi-accelerator as well as multi-node/multi-accelerator configurations, including TPU Pods. Although this API supports a variety of cluster configurations, templates to deploy training on Kubernetes clusters in on-prem or cloud environments are provided."

[Shell] The example shell script of automation way to build the software or library required

I don't like to preach at people. But, if someone has a task that has been done more than two times, then he/she should consider using an automation way to ease the burden. One of the solutions is by writing a shell script. Here is an example of building OpenCV 3.4.1 on TX2 using a shell script. The software or the target platform is not the key part. The most important part is to adopt this sort of shell script to become the one of your version.

[Tool] Visdom is a great visualization tool

I cannot use my word to describe Visdom because it is so amazingly awesome. Referencing the introduction from Facebook AI's official site,Visdom is a visualization tool that generates rich visualizations of live data to help researchers and developers stay on top of their scientific experiments that are run on remote servers. Visualizations in Visdom can be viewed in browsers and easily shared with others.

[TensorRT] How to use TensorRT to do inference with TensorFlow model ?

TensorRT is a high-performance deep learning inference optimizer and runtime that delivers low latency, high-throughput inference for deep learning applications. Here I am going to demonstrate that how to use TensorRT to do inference with TensorFlow model.

Install TensorRT
Please refer to this official website first:
https://docs.nvidia.com/deeplearning/sdk/tensorrt-install-guide/index.html#installing
After downloading TensorRT 4.0 ( in my case ), we can install it.

$ dpkg -i nv-tensorrt-repo-ubuntu1604-cuda9.0-ga-trt4.0.1.6-20180612_1-1_amd64.deb
$ apt-get update
$ apt-get install tensorrt
$ apt-get install python-libnvinfer-dev
$ apt-get install uff-converter-tf

[TensorFlow] How to write op with gradient in python?

Recently for some reasons, I studied the Domain-Adversarial Training of Neural Networks and it can be downloaded from http://jmlr.org/papers/volume17/15-239/15-239.pdf

In this paper, there is the key point that we should implement "Gradient Reversal Layer" for Discriminator to use it to connect the feature extractor. I found the source to implement it by replacing Identity op's gradient function as follows:

[TensorFlow] How to generate the Memory Report from Grappler?

In the previous post, I introduce the way to generate cost and model report from Grappler.
https://danny270degree.blogspot.com/2019/01/tensorflow-how-to-generate-cost-and.html
In this post, I will continue to introduce the memory report which I think that is very useful. Please refer to my previous post to get the model code.

[TensorFlow] How to generate the Cost and Model Report from Grappler?

General speaking, Grappler in Tensorflow has several optimizers to do the specific area optimizations, such as for reducing the peak memory usage in GPU. So, I want to introduce some useful functions inside Grappler which are used for Simple Placer mechanism. And, these functions are also partially used in Grappler's optimizers.

Thursday, October 24, 2019

Monday, October 7, 2019

Thursday, September 19, 2019

Wednesday, August 21, 2019

Tuesday, August 20, 2019

Monday, August 5, 2019

Tuesday, July 30, 2019

Sunday, July 21, 2019

Thursday, July 18, 2019

Wednesday, July 17, 2019

Monday, July 15, 2019

Qt quick WebGL

Wednesday, June 19, 2019

Wednesday, June 12, 2019

Monday, June 10, 2019

The following steps are about installing and setup Linux proxy server (Squid)

Tuesday, May 28, 2019

Monday, April 22, 2019

Monday, April 15, 2019

Wednesday, April 10, 2019

Wednesday, March 27, 2019

Thursday, March 21, 2019

Friday, March 15, 2019

Monday, March 11, 2019

Wednesday, March 6, 2019

Tuesday, March 5, 2019

Monday, February 25, 2019

Wednesday, January 30, 2019

Wednesday, January 16, 2019

Friday, January 11, 2019

Thursday, January 10, 2019

Monday, January 7, 2019

Friday, January 4, 2019

Thursday, January 3, 2019