Danny's tech notebook | 丹尼技術手札

Wednesday, November 28, 2018

[Reinforcement Learning] Get started to learn DQN for reinforcement learning

The previous post about Q-Learning is here:
[Reinforcement Learning] Get started to learn Q-Learning for reinforcement learning

Basically, Deep Q-Learning ( DQN ) is upgraded the Q-Learning algorithm and the Q-table is replaced by the neural network. For the DQN tutorial, I refer to these as follows: ( sorry, they are written in Chinese )
https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/4-1-A-DQN/
https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/4-1-DQN1/
https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/4-2-DQN2/
https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/4-3-DQN3/

[Reinforcement Learning] Get started to learn Q-Learning for reinforcement learning

The previous post about reinforcement learning:
[Reinforcement Learning] Get started to learn gradient method for reinforcement learning

For the Q-Learning tutorial, I refer to these as follows: ( sorry, they are written in Chinese )
https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/2-2-A-q-learning/
https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/2-2-tabular-q1/
https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/2-3-tabular-q2/

[Reinforcement Learning] Get started to learn policy gradient method for reinforcement learning

This post is about my first time to learn policy gradient method for reinforcement learning. Basically, there are already a lot of materials on the internet, but in this time, I only want to focus on a tutorial as follows: ( sorry, they are written in Chinese )
https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/5-1-policy-gradient-softmax1/
https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/5-2-policy-gradient-softmax2/

[RNN] What are the difference of input and output's tensor shape in dynamic_rnn and static_rnn using TensorFlow

When studying RNN, my first issue encountered in my program is about the shape of input and output tensors. Shape is a very important information to connect between layers. Here I just directly point out what are differences in input/output shape of static RNN and dynamic RNN.
P.S: If you use Keras to write your RNN model, you won't need to deal with these details.

[TensorFlow] The explanation of average gradients by example in data parallelism

When studying some examples of training model using Multi-GPUs ( in data parallelism ), the average gradients function always exists in some kind of ways, and here is a simple version as follows:

[Dynamic Control Flow] Whitepaper: Implementation of Control Flow in TensorFlow

In the following whitepaper, we can understand more dynamic control flow in details.
Whitepaper: Implementation of Control Flow in TensorFlow
http://download.tensorflow.org/paper/white_paper_tf_control_flow_implementation_2017_11_1.pdf

[TensorFlow] Train in Tensorflow and do inference with the trained model

If you want to train your model in Tensorflow and do inference with the trained model, you can refer to this post.

1. Train your model

I will use the simple CNN model in my previous post:
[ONNX] Train in Tensorflow and export to ONNX (Part II)
https://danny270degree.blogspot.com/2018/08/onnx-train-in-tensorflow-and-export-to_20.html

So, after training, you will get these files:

my_mnist/

├── checkpoint

├── graph.pbtxt

├── my_mnist_model.data-00000-of-00001

├── my_mnist_model.index

└── my_mnist_model.meta

[LLVM] LLVM studying list for newbie

If you are an LLVM newbie and are interested in LLVM like me, you may take a look at my LLVM studying list. It takes time for me to search the related resources and documents. So, I think it will help somehow. By the way, most of my list items are written in Chinese so that those who are native Engish speakers may not suit for this.

[TensorFlow] Does it help the processing time and transmission time if increasing CUDA Steam number in TensorFlow?

Before starting to increase the CUDA Steam number in TensorFlow, I want to recap some ideas about the Executor module. When TensorFlow session runs, it will build Executor. Meanwhile, if you enable CUDA in TensorFlow build configuration, the Executor will add visible GPU devices and create TF device object (GPUDevice object) mapping to physical GPU device. There are 4 kinds of streams inside GPUDevice:

CUDA stream
Host_to_Device stream
Device_to_Host stream
Device_to_Device stream

[TensorFlow Grappler] How to do the topological sorting in TensorFlow Grappler?

If you try to implement some optimizers in TensorFlow Grappler, you must have to know how to deal with the directed computation graph. One of the most important tools/knowledges is topological sorting.
The definition from Wiki: Topological sorting
https://en.wikipedia.org/wiki/Topological_sorting
"In the field of computer science, a topological sort or topological ordering of a directed graph is a linear ordering of its vertices such that for every directed edge uv from vertex u to vertex v, u comes before v in the ordering."

[Tool] To draw a sequence diagram using online tool sequencediagram

This website provides an online free tool for users to draw the sequence diagram as follows:
https://sequencediagram.org/

Basically, you can follow the instructions at the left top corner button. Check it out.
Here is my example of the sequence diagram about tracing some source codes of XLA AOT in TensorFlow.

[TensorFlow Grappler] The ways to traverse all nodes' input and output in the graph using C++ in TensorFlow Grappler

Here I want to introduce 2 ways to traverse all nodes' input and output in the graph using C++ in Grappler.
P.S: you have to be able to get GrapplerItem and GraphDef objects in your code.

First, check my example node name in Tensorboard as follows:
conv1/Conv2D

[NUMACTL] How to use numactl in practice?

I recently attended the Intel AI workshop and they gave an advice of using NUMACTL to improve the performance of training and inferencing in Deep Learning with Intel Caffe. Here I post some related information as follows:

[XLA 研究] How to use XLA AOT compilation in TensorFlow ( Part II )

My previous post: [XLA 研究] How to use XLA AOT compilation in TensorFlow is about a simple example to use XLA AOT. But, if you want to see a more complicated example, please take a look at this: https://gist.github.com/carlthome/6ae8a570e21069c60708017e3f96c9fd

[TFLMS] Large Model Support in TensorFlow by Graph Rewriting

This post just introduces this paper "Large Model Support in TensorFlow by Graph Rewriting" and it is published as a pull request in the TensorFlow repository for contributing to the TensorFlow community. With TFLMS, we were able to train ResNet-50 and 3DUnet with 4.7x and 2x larger batch size, respectively. Quite amazing...

[TensorFlow] Why does the feed's shape matter in TensorFlow Grappler?

Before explaining this, you should understand what Shapes and dynamic dimensions are in TensorFlow. This article below explains the concept very well.
https://blog.metaflow.fr/shapes-and-dynamic-dimensions-in-tensorflow-7b1fe79be363
The key idea is:

[XLA related] Sort out my thought and notes about XLA related

This post could be a little bit unstructured because it's for my reference in notes.
I recently found that there are several slides in SlideShare which contain very good information and source code analysis/study about XLA related as follows:

[TensorFlow] My simple way to profile TensorFlow and dump variables and GPU memory

As we know that if we want to profile Tensorflow graph and know what operations take more time and what less. This can be done with Tensorflow timeline module like this:
( I ignore the part of the model to simplify my example code )

...
run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE)
run_metadata = tf.RunMetadata()
...
with tf.Session(config=config) as sess:
    init.run()
    for epoch in range(n_epochs):
        for iteration in range(10):
            sess.run(training_op, feed_dict={X: picture, y:picture_label}, 
                     options=run_options, run_metadata=run_metadata)
            fetched_timeline = timeline.Timeline(run_metadata.step_stats)
            chrome_trace = fetched_timeline.generate_chrome_trace_format()
            with open('timeline_step_%d.json' % iteration, 'w') as f:
                f.write(chrome_trace)

[ONNX] Train in Tensorflow and export to ONNX (Part II)

If you read the previous post as the link below, you probably may ask a question: If the input TF graph for freezing is not a binary format, what do we do?
http://danny270degree.blogspot.com/2018/08/onnx-train-in-tensorflow-and-export-to.html

Let us recall the previous example below. The file "graph.proto" is the binary format of the protobuf file for TensorFlow graph generated from the following function:

  with open("graph.proto", "wb") as file:
    graph = tf.get_default_graph().as_graph_def(add_shapes=True)
    file.write(graph.SerializeToString())

[TensorFlow] Rewriter_Config and Memory Optimization Passes

In the previous post as the below link, I mentioned that the default value of rewrite_config seems to change a little bit.
https://danny270degree.blogspot.com/2018/06/tensorflow-compare-memory-options-in.html

To clarify my doubt, I check the TensorFlow's memory_optimizer.cc and arrange the mapping table:

[TensorFlow] How to print the timestamp of a node/operation of computation graph in run-time?

When some people first time tries to debug or print out information of some result from a node/operation in the computation graph in TensorFlow, they maybe confuse about how to do it. Fortunately, someone in Google gave a great explanation of the print function:
https://towardsdatascience.com/using-tf-print-in-tensorflow-aa26e1cff11e
After reading it, you should understand how tf.Print() function works and to use it.

[ONNX] Use ONNX_TF and nGraph_ONNX to do inference/prediction with ONNX model

Here I try to use the pre-trained model from ONNX model zoo, which the models are already converted from some deep learning framework. So I download the Resnet50 model from the following URL and untar it:

wget https://s3.amazonaws.com/download.onnx/models/opset_8/resnet50.tar.gz
tar -xzvf resnet50.tar.gz

P.S: pre-trained ONNX models: https://github.com/onnx/models

Then, I can do the inference/prediction using this ONNX model in two ways:

[ONNX] Train in Tensorflow and export to ONNX (Part I)

From my point of view, ONNX is a model description spec and ONNX model needs Deep Learning framework or backend tool/compiler which supports it to run.
The advantage of ONNX as I know is about portable and exchangeable between DL frameworks.
Here I will use this tutorial to convert TensorFlow's model to ONNX model by myself.

https://github.com/onnx/tutorials/blob/master/tutorials/OnnxTensorflowExport.ipynb

Wednesday, November 28, 2018

Thursday, November 22, 2018

Wednesday, November 21, 2018

Thursday, November 15, 2018

Tuesday, November 13, 2018

Thursday, November 8, 2018

Tuesday, October 30, 2018

1. Train your model

Wednesday, October 24, 2018

Tuesday, October 23, 2018

Thursday, October 18, 2018

Wednesday, October 17, 2018

Tuesday, October 2, 2018

Tuesday, September 18, 2018

Monday, September 17, 2018

Friday, September 7, 2018

Tuesday, September 4, 2018

Wednesday, August 29, 2018

Tuesday, August 21, 2018

Friday, August 17, 2018

Thursday, August 16, 2018

Wednesday, August 8, 2018