The sequence of policy iterations obtained when solving this
problem. The plot on the left is the number of cars to ship from
site A to site B when we have (na,nb) cars at each site at the end
of the day. Each car shipped requires a fixed cost of four dollars
per car and we can transport at most five cars by this method. The
image on the right represents the Boolean action of having our
employee ship one car for us for free. In addition, there is a
fixed cost for having more than ten cars parked over night at either
site. This manifests itself in the policy by shipping cars one way
or another to try to minimize this effect.
The corresponding sequence of state value iterations that result
when solving this problem.
John Weatherwax