The sequence of policy iterations obtained when solving this problem.
The corresponding sequence of state value functions obtained when solving this problem.
John Weatherwax