Friday, December 14, 2018

[Reinforcement Learning] Getting started with Sarsa for reinforcement learning

If you take a look at the Sarsa algorithm, you will find that it is very similar to Q-Learning.
For my previous post about Q-Learning, please refer to this link:
https://danny270degree.blogspot.com/2018/11/reinforcement-learning-get-started-to_21.html

Here is the Sarsa algorithm:
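
A minimal tabular sketch of Sarsa follows. The environment is an assumption made only for illustration: a hypothetical 1-D corridor where action 0 moves left, action 1 moves right, and reaching the right end gives a reward of 1. The names `step`, `epsilon_greedy`, and `sarsa`, and the hyperparameter values, are likewise illustrative. The key difference from Q-Learning is that the update uses the action actually chosen for the next state (on-policy) instead of the greedy maximum.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    best = max(q_row)
    # Break ties randomly so an all-zero table does not favor one action.
    return rng.choice([a for a, v in enumerate(q_row) if v == best])


def step(state, action, n_states):
    """Hypothetical corridor: action 0 = left, 1 = right; goal is the last state."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == n_states - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def sarsa(n_states=5, n_actions=2, episodes=500,
          alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        action = epsilon_greedy(q[state], epsilon, rng)
        done = False
        while not done:
            next_state, reward, done = step(state, action, n_states)
            # Choose A' from S' with the same policy that is being learned.
            next_action = epsilon_greedy(q[next_state], epsilon, rng)
            # Sarsa target: R + gamma * Q(S', A'); no bootstrap at the goal.
            target = reward
            if not done:
                target += gamma * q[next_state][next_action]
            q[state][action] += alpha * (target - q[state][action])
            state, action = next_state, next_action
    return q


if __name__ == "__main__":
    for s, row in enumerate(sarsa()):
        print(s, [round(v, 3) for v in row])
```

Replacing `q[next_state][next_action]` with `max(q[next_state])` in the target would turn this sketch into Q-Learning, which is why the two algorithms look so alike.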

Thursday, December 13, 2018

[Reinforcement Learning] Using dynamic programming to solve a simple 4x4 GridWorld

I borrowed the example and its source code from here. It uses dynamic programming to solve a simple 4x4 GridWorld, and I have added my own explanation of how the value function is calculated. I hope this helps you understand dynamic programming and the Markov Reward Process (MRP) more quickly.
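
As a rough illustration of the value-function calculation, here is a sketch of iterative policy evaluation. It assumes the classic textbook setup, which may differ from the borrowed source: terminal states in the top-left and bottom-right corners, a reward of -1 per step, moves that would leave the grid keep the agent in place, and a uniform random policy. The function name and parameters are illustrative.

```python
def evaluate_random_policy(size=4, gamma=1.0, theta=1e-8):
    """Iterative policy evaluation for a uniform random policy on a grid."""
    n = size * size
    terminals = {0, n - 1}                      # assumed terminal corners
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
    values = [0.0] * n
    while True:
        delta = 0.0
        new_values = values[:]
        for s in range(n):
            if s in terminals:
                continue
            row, col = divmod(s, size)
            total = 0.0
            for dr, dc in moves:
                r, c = row + dr, col + dc
                # Moving off the grid leaves the agent where it was.
                if not (0 <= r < size and 0 <= c < size):
                    r, c = row, col
                # Each action has probability 0.25 and reward -1.
                total += 0.25 * (-1.0 + gamma * values[r * size + c])
            new_values[s] = total
            delta = max(delta, abs(total - values[s]))
        values = new_values
        if delta < theta:
            return values


if __name__ == "__main__":
    v = evaluate_random_policy()
    for i in range(0, 16, 4):
        print([round(x, 1) for x in v[i:i + 4]])
```

Under a fixed policy the grid behaves like an MRP: each state's value is just the expected one-step reward plus the discounted value of wherever the policy lands next, and the loop repeats that backup until the values stop changing.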

Wednesday, November 21, 2018

Thursday, November 15, 2018

[RNN] What are the differences between input and output tensor shapes in dynamic_rnn and static_rnn in TensorFlow

When I was studying RNNs, the first issue I ran into in my program was the shape of the input and output tensors. Shape is important information for connecting layers to each other. Here I point out directly the differences in input/output shape between static RNN and dynamic RNN.
P.S.: If you use Keras to write your RNN model, you won't need to deal with these details.
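
To make the shape difference concrete, here is a small sketch in plain Python that uses nested lists instead of real tensors. In TensorFlow 1.x, `tf.nn.dynamic_rnn` (with the default `time_major=False`) takes a single tensor of shape `[batch_size, n_steps, n_inputs]`, while `tf.nn.static_rnn` takes a Python list of `n_steps` tensors, each of shape `[batch_size, n_inputs]`. The helpers `shape` and `unstack_time` are hypothetical names for this illustration; `unstack_time` mimics what `tf.unstack(x, axis=1)` does.

```python
def shape(nested):
    """Return the shape of a rectangular nested list as a tuple."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


def unstack_time(batch_major):
    """Split [batch, steps, features] into `steps` items of [batch, features]."""
    n_steps = len(batch_major[0])
    return [[sample[t] for sample in batch_major] for t in range(n_steps)]


batch_size, n_steps, n_inputs = 2, 3, 4

# dynamic_rnn style: one 3-D block.
dynamic_input = [[[0.0] * n_inputs for _ in range(n_steps)]
                 for _ in range(batch_size)]

# static_rnn style: a list with one 2-D block per time step.
static_input = unstack_time(dynamic_input)

if __name__ == "__main__":
    print(shape(dynamic_input))                     # (2, 3, 4)
    print(len(static_input), shape(static_input[0]))  # 3 (2, 4)
```

The outputs follow the same pattern: the dynamic version returns one `[batch_size, n_steps, n_hidden]` tensor, and the static version returns a list of `n_steps` tensors of shape `[batch_size, n_hidden]`.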

Tuesday, November 13, 2018

[TensorFlow] Average gradients in data parallelism, explained by example

Examples of training a model on multiple GPUs (with data parallelism) almost always include an average-gradients function in some form. Here is a simple version:
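
As a minimal sketch of the idea (not the TensorFlow code from those examples), the function below works on plain Python lists. It assumes each GPU "tower" reports a list of `(gradient, variable)` pairs in the same variable order, where a gradient is a flat list of floats and a variable is represented by its name. These representations are assumptions for illustration only.

```python
def average_gradients(tower_grads):
    """Average gradients per variable across towers.

    tower_grads: one list per tower, each a list of (gradient, variable)
    pairs in the same variable order. Returns a single list of
    (averaged_gradient, variable) pairs.
    """
    averaged = []
    # zip(*tower_grads) groups the pairs that belong to the same variable:
    # ((grad0_gpu0, var0), (grad0_gpu1, var0), ...), then var1, and so on.
    for grad_and_vars in zip(*tower_grads):
        grads = [grad for grad, _ in grad_and_vars]
        n_towers = len(grads)
        # Element-wise mean over the towers.
        mean_grad = [sum(parts) / n_towers for parts in zip(*grads)]
        # Variables are shared across towers, so the first tower's is enough.
        variable = grad_and_vars[0][1]
        averaged.append((mean_grad, variable))
    return averaged


if __name__ == "__main__":
    tower_grads = [
        [([1.0, 2.0], "w"), ([0.5], "b")],  # gradients from GPU 0
        [([3.0, 4.0], "w"), ([1.5], "b")],  # gradients from GPU 1
    ]
    print(average_gradients(tower_grads))
    # [([2.0, 3.0], 'w'), ([1.0], 'b')]
```

Each tower computes gradients on its own slice of the batch, so averaging them gives the gradient for the combined batch, which is then applied once to the shared variables.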