Here you will find replicated learning rate experiments for the mountain car example from this chapter in the book.
These plots compare how fast the linear gradient decent SARSA could solve this problem with various value for
lambda and alpha. As in the book for a single experiment we compute the steps per episode, after
performing 20 episodes, and do this experiment 30 times to average the results.
For the case with replacing traces we find
While for the case with accumulating traces we find that
Last modified: Sun May 15 08:46:34 EDT 2005