The stock of a certain good is inspected periodically. If an order of size x > 0 (integer) is placed, the ordering costs are 8 + 2x. The delivery time is zero. The demand is stochastic and equals 1 or 2, each with probability ½. Demands in different periods are independent. The size of an order must be such that (a) demand in a period is always satisfied, and (b) the stock at the end of a period never exceeds 2. The holding costs in a period are 2 per unit remaining at the end of the period. The objective is to minimize the expected discounted costs over an infinite horizon, using discount factor 0.8.
(a) Give the optimality equations for the Markov decision problem.
(b) Give an LP-model that allows you to determine the optimal policy.
(c) Carry out two iterations of the value iteration algorithm.
(d) Choose an ordering policy, and investigate, using the policy iteration algorithm, whether or not this policy is optimal.
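
One possible way to set up the model for parts (a) and (c) is sketched below in Python. It rests on modelling assumptions that the problem statement does not spell out: the state is the stock i in {0, 1, 2} at the start of a period, the action is the order size x, constraints (a) and (b) are read as i + x >= 2 and i + x - 1 <= 2, and ordering and holding costs of a period are both charged in that period without extra discounting. The names `actions`, `one_step_cost`, `q_value` and `value_iteration` are hypothetical helpers introduced only for this illustration.

```python
GAMMA = 0.8          # discount factor from the problem statement
DEMANDS = (1, 2)     # each demand value occurs with probability 1/2
STATES = (0, 1, 2)   # assumed state: stock at the start of a period


def actions(i):
    """Feasible order sizes in state i (assumed reading of constraints a and b)."""
    # (a) stock after ordering covers the largest demand: i + x >= 2
    # (b) end stock under the smallest demand is at most 2: i + x - 1 <= 2
    return [x for x in range(0, 4) if i + x >= 2 and i + x - 1 <= 2]


def one_step_cost(i, x):
    """Expected cost of one period: ordering cost plus expected holding cost."""
    order = 8 + 2 * x if x > 0 else 0
    holding = sum(0.5 * 2 * (i + x - d) for d in DEMANDS)
    return order + holding


def q_value(i, x, v):
    """Right-hand side of the optimality equation for the pair (i, x)."""
    future = sum(0.5 * v[i + x - d] for d in DEMANDS)
    return one_step_cost(i, x) + GAMMA * future


def value_iteration(n_iter):
    """Run n_iter value-iteration steps starting from v = 0."""
    v = {i: 0.0 for i in STATES}
    for _ in range(n_iter):
        v = {i: min(q_value(i, x, v) for x in actions(i)) for i in STATES}
    return v


if __name__ == "__main__":
    for n in (1, 2):
        print(n, value_iteration(n))
```

Under these assumptions, `value_iteration(2)` gives the two iterates asked for in (c), and taking the minimizing x in `q_value` for each state yields a candidate policy that could serve as the starting point for (d).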