This post is mainly based on the following tutorials (written in Chinese):
https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/3-3-A-sarsa-lambda/
https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/3-3-tabular-sarsa-lambda/
https://zhuanlan.zhihu.com/p/28108498
The Sarsa(lambda λ) algorithm looks like this:
I think Sarsa and Sarsa(lambda) differ mainly in two ways:
1. Lambda (from 0 to 1):
- If lambda = 0, Sarsa(lambda) reduces to Sarsa: only the most recent step, the one taken just before the reward, is updated.
- If lambda = 1, Sarsa(lambda) updates all the historical steps of the episode as well as the last step that gains the reward, with earlier steps discounted only by gamma.
- If lambda = 0.5, Sarsa(lambda) still updates all the historical steps, but the weight of each step shrinks by a factor of gamma * lambda for every step further back from the reward, so the steps closest to the reward are updated most strongly.
2. Eligibility Trace:
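To make the two points above concrete, here is a minimal sketch of a tabular Sarsa(lambda) update with accumulating eligibility traces, in the standard textbook form. It is not the code from the tutorials linked above; the function names `sarsa_lambda_update` and `run_corridor`, the dict-based tables, and the tiny one-way corridor environment are all assumptions made just for illustration. The trace `E` remembers which (state, action) pairs were visited recently, and each update spreads the TD error over them in proportion to their trace.

```python
def sarsa_lambda_update(Q, E, s, a, r, s_next, a_next, alpha, gamma, lam):
    """One tabular Sarsa(lambda) update with accumulating traces.

    Q and E are dicts mapping (state, action) -> float, modified in place.
    Missing entries are treated as 0.
    """
    # TD error for the transition that was just taken.
    delta = r + gamma * Q.get((s_next, a_next), 0.0) - Q.get((s, a), 0.0)
    # Mark the pair that was just visited as eligible.
    E[(s, a)] = E.get((s, a), 0.0) + 1.0
    # Every traced pair shares in the update, then its trace decays.
    for key in list(E):
        Q[key] = Q.get(key, 0.0) + alpha * delta * E[key]
        E[key] *= gamma * lam
    return delta


def run_corridor(lam, n_states=4, alpha=0.5, gamma=0.9):
    """Hypothetical toy episode: walk right through states 0..n_states-1.

    Only the final step gives reward 1. Returns Q for action 0 ("right")
    in each state after this single episode.
    """
    Q, E = {}, {}
    for s in range(n_states):
        r = 1.0 if s == n_states - 1 else 0.0
        # State n_states is terminal and never stored, so its Q value is 0.
        sarsa_lambda_update(Q, E, s, 0, r, s + 1, 0, alpha, gamma, lam)
    return [Q.get((s, 0), 0.0) for s in range(n_states)]


if __name__ == "__main__":
    for lam in (0.0, 0.5, 1.0):
        print(lam, run_corridor(lam))
```

With these assumed settings, lambda = 0 changes only the last step (values `[0, 0, 0, 0.5]`), while lambda = 0.5 and lambda = 1 change every step of the episode, giving roughly `[0.046, 0.101, 0.225, 0.5]` and `[0.365, 0.405, 0.45, 0.5]` respectively. The larger lambda is, the more slowly the credit fades for steps further from the reward.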
I made a diagram to briefly explain the Sarsa(lambda) algorithm; it can be compared with my previous post about the Sarsa algorithm:
[Reinforcement Learning] Get started to learn Sarsa for reinforcement learning