I borrow the example and its source code from here, which uses dynamic programming to solve a simple 4x4 GridWorld, and add my own explanation of how the value function is calculated. I hope it helps you understand dynamic programming and the Markov Reward Process (MRP) more quickly.
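To make the value-function calculation concrete, here is a minimal sketch of iterative policy evaluation in plain Python. It is not the borrowed source code. It assumes the textbook setup: a 4x4 grid whose top-left and bottom-right cells are terminal, a reward of -1 per step, no discounting, and a uniform random policy (which turns the problem into an MRP). The function name and parameters are my own.

```python
def evaluate_random_policy(size=4, gamma=1.0, theta=1e-6):
    """Iterative policy evaluation for a uniform random policy.

    Assumed setup (not taken from the original post): cells are numbered
    row by row, the first and last cells are terminal, every step costs -1,
    and a move that would leave the grid keeps the agent in place.
    """
    n = size * size
    terminals = {0, n - 1}
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
    values = [0.0] * n
    while True:
        delta = 0.0
        new_values = values[:]  # synchronous sweep: read old, write new
        for s in range(n):
            if s in terminals:
                continue
            row, col = divmod(s, size)
            total = 0.0
            for dr, dc in moves:
                r, c = row + dr, col + dc
                if not (0 <= r < size and 0 <= c < size):
                    r, c = row, col  # bumped into a wall
                # Bellman expectation backup, each action has probability 1/4
                total += 0.25 * (-1.0 + gamma * values[r * size + c])
            new_values[s] = total
            delta = max(delta, abs(total - values[s]))
        values = new_values
        if delta < theta:
            return values


if __name__ == "__main__":
    v = evaluate_random_policy()
    for row in range(4):
        print(" ".join(f"{x:7.2f}" for x in v[row * 4:(row + 1) * 4]))
```

Under these assumptions the top row converges to roughly 0, -14, -20, -22. Each value is the expected number of steps (negated) before the random walk hits a terminal cell.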
Thursday, December 13, 2018
Wednesday, November 28, 2018
[Reinforcement Learning] Get started to learn DQN for reinforcement learning
The previous post about Q-Learning is here:
[Reinforcement Learning] Get started to learn Q-Learning for reinforcement learning
Basically, Deep Q-Learning (DQN) is an upgrade of the Q-Learning algorithm in which the Q-table is replaced by a neural network. For the DQN tutorial, I refer to the following (sorry, they are written in Chinese):
https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/4-1-A-DQN/
https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/4-1-DQN1/
https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/4-2-DQN2/
https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/4-3-DQN3/
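The idea of replacing the Q-table with a function can be sketched without any deep learning library. The code below is not a full DQN and is not taken from the tutorials above: it swaps the neural network for a linear approximator Q(s, a) = w_a . x(s), leaves out the replay buffer, and only keeps a frozen copy of the weights to play the role of the target network. All names are hypothetical.

```python
def q_values(weights, features):
    """One Q-value per action: Q(s, a) = dot(weights[a], features)."""
    return [sum(w * x for w, x in zip(w_a, features)) for w_a in weights]


def td_update(weights, target_weights, s, a, r, s_next, done,
              alpha=0.1, gamma=0.9):
    """One semi-gradient Q-learning step on the online weights (in place).

    target_weights is a frozen copy used only to compute the TD target,
    which is the role the target network plays in DQN.
    """
    if done:
        target = r
    else:
        target = r + gamma * max(q_values(target_weights, s_next))
    error = target - q_values(weights, s)[a]
    # For a linear model, the gradient of Q(s, a) w.r.t. weights[a] is s.
    weights[a] = [w + alpha * error * x for w, x in zip(weights[a], s)]
    return error


if __name__ == "__main__":
    # Two actions, two features. With one-hot features this reduces
    # to an ordinary Q-table, which is the link back to Q-Learning.
    online = [[0.0, 0.0], [0.0, 0.0]]
    frozen = [row[:] for row in online]
    td_update(online, frozen, s=[1.0, 0.0], a=1, r=1.0,
              s_next=[0.0, 1.0], done=True)
    print(q_values(online, [1.0, 0.0]))
```

In a real DQN the linear model would be a neural network trained by gradient descent on the squared TD error, with transitions sampled from a replay memory.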
Thursday, November 22, 2018
[Reinforcement Learning] Get started to learn Q-Learning for reinforcement learning
The previous post about reinforcement learning:
[Reinforcement Learning] Get started to learn gradient method for reinforcement learning
For the Q-Learning tutorial, I refer to the following (sorry, they are written in Chinese):
https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/2-2-A-q-learning/
https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/2-2-tabular-q1/
https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/2-3-tabular-q2/
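For readers who cannot follow the Chinese tutorials, here is a minimal sketch of tabular Q-Learning in plain Python. The environment is a hypothetical 1-D corridor of six cells with a reward of 1 at the right end. It is my own assumption for illustration, as are the function names and hyperparameters.

```python
import random

N_STATES = 6          # cells 0..5, cell 5 is the goal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical corridor dynamics: reward 1.0 only on reaching the goal."""
    nxt = state + 1 if action == 1 else max(0, state - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def q_update(q, s, a, r, s_next, done, alpha=0.1, gamma=0.9):
    """Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    target = r if done else r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


def choose_action(q, s, epsilon, rng):
    """Epsilon-greedy; ties between the two actions are broken at random."""
    if rng.random() < epsilon or q[s][0] == q[s][1]:
        return rng.choice(ACTIONS)
    return 0 if q[s][0] > q[s][1] else 1


def train(episodes=50, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # the Q-table
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            a = choose_action(q, s, epsilon, rng)
            s_next, r, done = step(s, a)
            q_update(q, s, a, r, s_next, done)
            s = s_next
    return q


if __name__ == "__main__":
    for state, row in enumerate(train()):
        print(state, [round(x, 3) for x in row])
```

The key line is the update in `q_update`: the table entry moves a fraction `alpha` of the way toward the bootstrapped target, and the `max` over next actions is what makes the method off-policy.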
Wednesday, November 21, 2018
[Reinforcement Learning] Get started to learn policy gradient method for reinforcement learning
This post is about my first time learning the policy gradient method for reinforcement learning. There are already a lot of materials on the internet, but this time I only want to focus on the following tutorial (sorry, they are written in Chinese):
https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/5-1-policy-gradient-softmax1/
https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/5-2-policy-gradient-softmax2/
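As a rough illustration of what a softmax policy gradient update does, here is a stripped-down sketch in plain Python. It is not the tutorial's code: instead of a neural network it keeps one preference value per action and ignores the state, so it only shows the REINFORCE update rule itself. The function names are hypothetical.

```python
import math


def softmax(prefs):
    """Turn action preferences into probabilities (numerically stable)."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_update(prefs, action, ret, lr=0.1):
    """Return new preferences after one REINFORCE step.

    For a softmax policy, d log pi(action) / d prefs[i] is
    (1 if i == action else 0) - pi(i), so each preference moves by
    lr * ret * that gradient.
    """
    probs = softmax(prefs)
    return [
        p + lr * ret * ((1.0 if i == action else 0.0) - probs[i])
        for i, p in enumerate(prefs)
    ]


if __name__ == "__main__":
    prefs = [0.0, 0.0]
    # Action 0 was taken and the episode return was positive,
    # so its probability should go up.
    prefs = reinforce_update(prefs, action=0, ret=1.0)
    print(prefs, softmax(prefs))
```

A positive return makes the chosen action more likely and a negative return makes it less likely. The tutorial's network-based version applies the same gradient to the network weights instead of a preference list.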
Thursday, November 15, 2018
[RNN] What are the difference of input and output's tensor shape in dynamic_rnn and static_rnn using TensorFlow
When studying RNNs, the first issue I encountered in my program was the shape of the input and output tensors. Shape is very important information for connecting layers. Here I directly point out the differences in input/output shape between static RNN and dynamic RNN.
P.S: If you use Keras to write your RNN model, you won't need to deal with these details.
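To illustrate the shape difference: in TensorFlow 1.x, `tf.nn.static_rnn` takes a Python list of T tensors, each shaped [batch_size, input_size], and returns a list of T outputs, while `tf.nn.dynamic_rnn` (with the default `time_major=False`) takes a single tensor shaped [batch_size, max_time, input_size] and returns outputs shaped [batch_size, max_time, output_size]. The sketch below only mimics this bookkeeping with nested Python lists so that it runs without TensorFlow. The helper names are hypothetical.

```python
def shape(nested):
    """Return the dimensions of a regular nested list, e.g. [2, 3, 4]."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return dims


def unstack_time(batch_major):
    """[batch, time, input] -> list of `time` items, each [batch, input].

    This mirrors what unstacking along the time axis does when data
    prepared for a dynamic RNN is fed to a static RNN.
    """
    return [list(step) for step in zip(*batch_major)]


if __name__ == "__main__":
    batch_size, max_time, input_size = 2, 3, 4
    # Layout expected by a dynamic RNN: one [batch, time, input] block.
    dynamic_input = [
        [[0.0] * input_size for _ in range(max_time)]
        for _ in range(batch_size)
    ]
    # Layout expected by a static RNN: a list with one entry per time step.
    static_input = unstack_time(dynamic_input)
    print(shape(dynamic_input))
    print(len(static_input), shape(static_input[0]))
```

In real TensorFlow 1.x code the same conversion is usually done with `tf.unstack(x, axis=1)`.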
Tuesday, November 13, 2018
[TensorFlow] The explanation of average gradients by example in data parallelism
When studying examples of training a model on multiple GPUs (data parallelism), an average-gradients function always appears in one form or another, and this post walks through a simple version of it.
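The idea behind such a function can be sketched in plain Python. This is not the code from the post: it assumes each GPU ("tower") produces a list of (gradient, variable) pairs in the same variable order, and it uses lists of floats in place of tensors and strings in place of variables.

```python
def average_gradients(tower_grads):
    """Average gradients across towers, variable by variable.

    tower_grads: one list per tower, each a list of (gradient, variable)
    pairs in the same variable order. Gradients are plain lists of floats
    here (an assumption made so the sketch runs without TensorFlow).
    """
    averaged = []
    # zip(*tower_grads) regroups the data per variable:
    # ((grad_gpu0, var), (grad_gpu1, var), ...)
    for grads_and_vars in zip(*tower_grads):
        grads = [g for g, _ in grads_and_vars]
        # Element-wise mean over the towers.
        mean_grad = [sum(vals) / len(vals) for vals in zip(*grads)]
        # The variable is shared, so take it from the first tower.
        variable = grads_and_vars[0][1]
        averaged.append((mean_grad, variable))
    return averaged


if __name__ == "__main__":
    tower_grads = [
        [([1.0, 2.0], "w"), ([0.5], "b")],   # gradients from GPU 0
        [([3.0, 4.0], "w"), ([1.5], "b")],   # gradients from GPU 1
    ]
    print(average_gradients(tower_grads))
```

The averaged list has the same (gradient, variable) layout as a single tower's output, so it can be handed to the optimizer's apply step once per training iteration.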
Thursday, November 8, 2018
[Dynamic Control Flow] Whitepaper: Implementation of Control Flow in TensorFlow
The following whitepaper explains dynamic control flow in more detail.
Whitepaper: Implementation of Control Flow in TensorFlow
http://download.tensorflow.org/paper/white_paper_tf_control_flow_implementation_2017_11_1.pdf
Tuesday, October 30, 2018
[TensorFlow] Train in Tensorflow and do inference with the trained model
If you want to train your model in TensorFlow and do inference with the trained model, you can refer to this post.
1. Train your model
I will use the simple CNN model from my previous post: [ONNX] Train in Tensorflow and export to ONNX (Part II)
https://danny270degree.blogspot.com/2018/08/onnx-train-in-tensorflow-and-export-to_20.html
So, after training, you will get these files:
my_mnist/
├── checkpoint
├── graph.pbtxt
├── my_mnist_model.data-00000-of-00001
├── my_mnist_model.index
└── my_mnist_model.meta