Danny's tech notebook | 丹尼技術手札

Wednesday, January 16, 2019

[TensorFlow] How to use Distribution Strategy in TensorFlow?

Learned from What’s coming in TensorFlow 2.0, TensorFlow 2.0 is coming soon and there are several features which are ready to use already, for instance, Distribution Strategy. Quoted from the article,
"For large ML training tasks, the Distribution Strategy API makes it easy to distribute and train models on different hardware configurations without changing the model definition. Since TensorFlow provides support for a range of hardware accelerators like CPUs, GPUs, and TPUs, you can enable training workloads to be distributed to single-node/multi-accelerator as well as multi-node/multi-accelerator configurations, including TPU Pods. Although this API supports a variety of cluster configurations, templates to deploy training on Kubernetes clusters in on-prem or cloud environments are provided."

[Shell] The example shell script of automation way to build the software or library required

I don't like to preach at people. But, if someone has a task that has been done more than two times, then he/she should consider using an automation way to ease the burden. One of the solutions is by writing a shell script. Here is an example of building OpenCV 3.4.1 on TX2 using a shell script. The software or the target platform is not the key part. The most important part is to adopt this sort of shell script to become the one of your version.

[Tool] Visdom is a great visualization tool

I cannot use my word to describe Visdom because it is so amazingly awesome. Referencing the introduction from Facebook AI's official site,Visdom is a visualization tool that generates rich visualizations of live data to help researchers and developers stay on top of their scientific experiments that are run on remote servers. Visualizations in Visdom can be viewed in browsers and easily shared with others.

[TensorRT] How to use TensorRT to do inference with TensorFlow model ?

TensorRT is a high-performance deep learning inference optimizer and runtime that delivers low latency, high-throughput inference for deep learning applications. Here I am going to demonstrate that how to use TensorRT to do inference with TensorFlow model.

Install TensorRT
Please refer to this official website first:
https://docs.nvidia.com/deeplearning/sdk/tensorrt-install-guide/index.html#installing
After downloading TensorRT 4.0 ( in my case ), we can install it.

$ dpkg -i nv-tensorrt-repo-ubuntu1604-cuda9.0-ga-trt4.0.1.6-20180612_1-1_amd64.deb
$ apt-get update
$ apt-get install tensorrt
$ apt-get install python-libnvinfer-dev
$ apt-get install uff-converter-tf

[TensorFlow] How to write op with gradient in python?

Recently for some reasons, I studied the Domain-Adversarial Training of Neural Networks and it can be downloaded from http://jmlr.org/papers/volume17/15-239/15-239.pdf

In this paper, there is the key point that we should implement "Gradient Reversal Layer" for Discriminator to use it to connect the feature extractor. I found the source to implement it by replacing Identity op's gradient function as follows:

[TensorFlow] How to generate the Memory Report from Grappler?

In the previous post, I introduce the way to generate cost and model report from Grappler.
https://danny270degree.blogspot.com/2019/01/tensorflow-how-to-generate-cost-and.html
In this post, I will continue to introduce the memory report which I think that is very useful. Please refer to my previous post to get the model code.

[TensorFlow] How to generate the Cost and Model Report from Grappler?

General speaking, Grappler in Tensorflow has several optimizers to do the specific area optimizations, such as for reducing the peak memory usage in GPU. So, I want to introduce some useful functions inside Grappler which are used for Simple Placer mechanism. And, these functions are also partially used in Grappler's optimizers.

[TensorFlow] My example of using SavedModelBuilder to do inference in TensorFlow

The purpose of this post is to show my example of SavedModelBuilder to do inference in TensorFlow. From my experiment, this approach can save a model with the signature that has input and output node name. And SavedModelBuilder can restore the graph based on the previously saved model pb file and the signature definition. Once, the restore is done, the inference task can be executed directly without GPU device needed if the training task is on GPU device.

[Reinforcement Learning] Get started to learn Actor Critic for reinforcement learning

Actor-Critic is basically combined with Policy Gradient (Actor) and Function Approximation (Critic) based algorithm together. Actor is based on the probability given by policy to act and Critic judges the performance of Actor and gives the score. So, Actor will improve its probability given by policy based on Critic's judge and score. The following diagram is the concept:

[Reinforcement Learning] Get started to learn Sarsa(lambda λ) for reinforcement learning

Once you know what the Sarsa algorithm is, you can continue to learn Sarsa(lambda λ) algorithm.
I basically refer to these tutorial documents (written in Chinese) :
https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/3-3-A-sarsa-lambda/
https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/3-3-tabular-sarsa-lambda/
https://zhuanlan.zhihu.com/p/28108498

The Sarsa(lambda λ) algorithm looks like this:

[Reinforcement Learning] Get started to learn Sarsa for reinforcement learning

If taking a look at Sarsa algorithm, you will find that it is so similar with Q-Learning.
For my previous post about Q-Learning, please refer to this link:
https://danny270degree.blogspot.com/2018/11/reinforcement-learning-get-started-to_21.html

Here is the Sarsa algorithm:

[Reinforcement Learning] Using dynamic programming to solve a simple GridWorld with 4X4

I borrow the example and its source code from here which is a dynamic programming to solve a simple GridWorld with 4X4 and put my explanation for the calculation of value function. Hope that will help to understand dynamic programming and Markov Reward Process(MRP) more quickly.

[Reinforcement Learning] Get started to learn DQN for reinforcement learning

The previous post about Q-Learning is here:
[Reinforcement Learning] Get started to learn Q-Learning for reinforcement learning

Basically, Deep Q-Learning ( DQN ) is upgraded the Q-Learning algorithm and the Q-table is replaced by the neural network. For the DQN tutorial, I refer to these as follows: ( sorry, they are written in Chinese )
https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/4-1-A-DQN/
https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/4-1-DQN1/
https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/4-2-DQN2/
https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/4-3-DQN3/

[Reinforcement Learning] Get started to learn Q-Learning for reinforcement learning

The previous post about reinforcement learning:
[Reinforcement Learning] Get started to learn gradient method for reinforcement learning

For the Q-Learning tutorial, I refer to these as follows: ( sorry, they are written in Chinese )
https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/2-2-A-q-learning/
https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/2-2-tabular-q1/
https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/2-3-tabular-q2/

[Reinforcement Learning] Get started to learn policy gradient method for reinforcement learning

This post is about my first time to learn policy gradient method for reinforcement learning. Basically, there are already a lot of materials on the internet, but in this time, I only want to focus on a tutorial as follows: ( sorry, they are written in Chinese )
https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/5-1-policy-gradient-softmax1/
https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/5-2-policy-gradient-softmax2/

[RNN] What are the difference of input and output's tensor shape in dynamic_rnn and static_rnn using TensorFlow

When studying RNN, my first issue encountered in my program is about the shape of input and output tensors. Shape is a very important information to connect between layers. Here I just directly point out what are differences in input/output shape of static RNN and dynamic RNN.
P.S: If you use Keras to write your RNN model, you won't need to deal with these details.

[TensorFlow] The explanation of average gradients by example in data parallelism

When studying some examples of training model using Multi-GPUs ( in data parallelism ), the average gradients function always exists in some kind of ways, and here is a simple version as follows:

[Dynamic Control Flow] Whitepaper: Implementation of Control Flow in TensorFlow

In the following whitepaper, we can understand more dynamic control flow in details.
Whitepaper: Implementation of Control Flow in TensorFlow
http://download.tensorflow.org/paper/white_paper_tf_control_flow_implementation_2017_11_1.pdf

[TensorFlow] Train in Tensorflow and do inference with the trained model

If you want to train your model in Tensorflow and do inference with the trained model, you can refer to this post.

1. Train your model

I will use the simple CNN model in my previous post:
[ONNX] Train in Tensorflow and export to ONNX (Part II)
https://danny270degree.blogspot.com/2018/08/onnx-train-in-tensorflow-and-export-to_20.html

So, after training, you will get these files:

my_mnist/

├── checkpoint

├── graph.pbtxt

├── my_mnist_model.data-00000-of-00001

├── my_mnist_model.index

└── my_mnist_model.meta

[LLVM] LLVM studying list for newbie

If you are an LLVM newbie and are interested in LLVM like me, you may take a look at my LLVM studying list. It takes time for me to search the related resources and documents. So, I think it will help somehow. By the way, most of my list items are written in Chinese so that those who are native Engish speakers may not suit for this.

[TensorFlow] Does it help the processing time and transmission time if increasing CUDA Steam number in TensorFlow?

Before starting to increase the CUDA Steam number in TensorFlow, I want to recap some ideas about the Executor module. When TensorFlow session runs, it will build Executor. Meanwhile, if you enable CUDA in TensorFlow build configuration, the Executor will add visible GPU devices and create TF device object (GPUDevice object) mapping to physical GPU device. There are 4 kinds of streams inside GPUDevice:

CUDA stream
Host_to_Device stream
Device_to_Host stream
Device_to_Device stream

[TensorFlow Grappler] How to do the topological sorting in TensorFlow Grappler?

If you try to implement some optimizers in TensorFlow Grappler, you must have to know how to deal with the directed computation graph. One of the most important tools/knowledges is topological sorting.
The definition from Wiki: Topological sorting
https://en.wikipedia.org/wiki/Topological_sorting
"In the field of computer science, a topological sort or topological ordering of a directed graph is a linear ordering of its vertices such that for every directed edge uv from vertex u to vertex v, u comes before v in the ordering."

[Tool] To draw a sequence diagram using online tool sequencediagram

This website provides an online free tool for users to draw the sequence diagram as follows:
https://sequencediagram.org/

Basically, you can follow the instructions at the left top corner button. Check it out.
Here is my example of the sequence diagram about tracing some source codes of XLA AOT in TensorFlow.

Wednesday, January 16, 2019

Friday, January 11, 2019

Thursday, January 10, 2019

Monday, January 7, 2019

Friday, January 4, 2019

Thursday, January 3, 2019

Monday, December 24, 2018

Saturday, December 22, 2018

Monday, December 17, 2018

Friday, December 14, 2018

Thursday, December 13, 2018

Wednesday, November 28, 2018

Thursday, November 22, 2018

Wednesday, November 21, 2018

Thursday, November 15, 2018

Tuesday, November 13, 2018

Thursday, November 8, 2018

Tuesday, October 30, 2018

1. Train your model

Wednesday, October 24, 2018

Tuesday, October 23, 2018

Thursday, October 18, 2018