The sequence of policy iterations obtained when solving this problem. The plot on the left is the number of cars to ship from site A to site B when we have (na,nb) cars at each site at the end of the day. Each car shipped requires a fixed cost of four dollars per car and we can transport at most five cars by this method. The image on the right represents the Boolean action of having our employee ship one car for us for free. In addition, there is a fixed cost for having more than ten cars parked over night at either site. This manifests itself in the policy by shipping cars one way or another to try to minimize this effect.




The corresponding sequence of state value iterations that result when solving this problem.






John Weatherwax