Chapter 6 in Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto.

The value estimates learned by TD(0) after various numbers of iterations, produced by running mk_fig_6_6.m. This code effectively reproduces Figure 6.6 from the book.
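
For readers without the MATLAB script at hand, the following is a minimal Python sketch of tabular TD(0) on the random walk. It is not a port of mk_fig_6_6.m. The function name td0_random_walk and the settings (five non-terminal states, start in the centre state, initial values of 0.5, alpha = 0.1, no discounting, a reward of 1 only on leaving the right end) are assumptions chosen to follow the book's random walk example.

```python
import random


def td0_random_walk(num_episodes, alpha=0.1, n_states=5, seed=0):
    """Estimate state values for the random walk with tabular TD(0).

    Hypothetical helper for illustration; all defaults are assumptions.
    """
    rng = random.Random(seed)
    # Indices 0 and n_states + 1 are terminal states whose value stays 0.
    values = [0.0] + [0.5] * n_states + [0.0]
    for _ in range(num_episodes):
        state = (n_states + 1) // 2  # start in the centre state
        while 0 < state < n_states + 1:
            next_state = state + rng.choice((-1, 1))
            # Reward is 1 only when stepping off the right end, else 0.
            reward = 1.0 if next_state == n_states + 1 else 0.0
            # Undiscounted TD(0) update: move V(s) toward r + V(s').
            values[state] += alpha * (reward + values[next_state] - values[state])
            state = next_state
    return values[1:-1]  # drop the two terminal entries


# True values under these assumptions are 1/6, 2/6, ..., 5/6.
true_values = [i / 6 for i in range(1, 6)]
print("true", [round(v, 3) for v in true_values])
for episodes in (0, 1, 10, 100):
    estimates = td0_random_walk(episodes)
    print(episodes, [round(v, 3) for v in estimates])
```

After a single episode only the state next to the terminal that was reached changes, because every other update sees a zero TD error while all values are still 0.5.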

The learning curves for the TD(0) and constant-alpha Monte Carlo algorithms on the random walk prediction problem. These plots are produced by running mk_arms_error_plt.m.
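
As a rough illustration of how such learning curves can be computed, here is a Python sketch that averages the RMS error of the value estimates over several independent runs. It is not a port of mk_arms_error_plt.m. The names run_episode, rms_error, and learning_curve are hypothetical, and the setup (five states, initial values of 0.5, an every-visit form of constant-alpha Monte Carlo, error averaged over the five states) is an assumption.

```python
import math
import random

N_STATES = 5  # assumed number of non-terminal states
TRUE_VALUES = [i / (N_STATES + 1) for i in range(1, N_STATES + 1)]


def run_episode(rng):
    """Return the non-terminal states visited, in order, and the final reward."""
    state = (N_STATES + 1) // 2
    visited = []
    while 0 < state < N_STATES + 1:
        visited.append(state)
        state += rng.choice((-1, 1))
    return visited, (1.0 if state == N_STATES + 1 else 0.0)


def rms_error(values):
    """RMS error of the estimates, averaged over the non-terminal states."""
    squared = [(values[s] - TRUE_VALUES[s - 1]) ** 2 for s in range(1, N_STATES + 1)]
    return math.sqrt(sum(squared) / N_STATES)


def learning_curve(method, alpha, num_episodes=100, num_runs=100, seed=0):
    """Average RMS error after each episode for method 'td' or 'mc'."""
    if method not in ("td", "mc"):
        raise ValueError("method must be 'td' or 'mc'")
    rng = random.Random(seed)
    curve = [0.0] * (num_episodes + 1)
    for _ in range(num_runs):
        values = [0.0] + [0.5] * N_STATES + [0.0]  # terminals fixed at 0
        curve[0] += rms_error(values)
        for episode in range(1, num_episodes + 1):
            visited, reward = run_episode(rng)
            if method == "mc":
                # Undiscounted return from every visited state is the final reward.
                for s in visited:
                    values[s] += alpha * (reward - values[s])
            else:
                # Replay the transitions in order; the walk does not depend on
                # the estimates, so this matches updating during the episode.
                terminal = N_STATES + 1 if reward else 0
                successors = visited[1:] + [terminal]
                for s, s_next in zip(visited, successors):
                    r = reward if s_next == terminal else 0.0
                    values[s] += alpha * (r + values[s_next] - values[s])
            curve[episode] += rms_error(values)
    return [total / num_runs for total in curve]


td_curve = learning_curve("td", alpha=0.1)
mc_curve = learning_curve("mc", alpha=0.02)
for episode in (0, 25, 50, 100):
    print(episode, round(td_curve[episode], 3), round(mc_curve[episode], 3))
```

Both methods start from the same error because they share the initial estimates; the step sizes shown are only example choices and other values give differently shaped curves.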

John Weatherwax

Last modified: Sun May 15 08:46:34 EDT 2005