Reference no: EM133248083
Assignment:
1. Consider a dice game similar to the card game blackjack. The goal of the game is to obtain numbers that sum as close as possible to 13 without going over. The game starts with the dealer and the player each rolling a die, so each of them has an independent random number from 1 to 6. The player can request additional rolls (hit) until he/she decides to stop (stand) or exceeds 13 (bust). After the player stands, the dealer can roll additional dice according to its policy. If neither the player nor the dealer busts, the outcome (win, lose, draw) is decided by whose sum is closer to 13. The reward for winning is 1 dollar, for drawing 0, and for losing -1. This game can be modeled as a Markov decision process (MDP). Please answer the following questions.
(a) Define the states of the MDP for the dice game, and give the number of states.
(b) Define the actions of the MDP for the dice game, and give the number of actions.
(c) Suppose the player adopts a policy of rolling additional dice until the sum is 10 or greater. Give the state transition probability matrix.
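As a starting point for (c), the following is a minimal sketch and not the intended solution. It assumes a simplified state space that the assignment does not prescribe: the state is only the player's current sum (1 to 13) plus one absorbing bust state, and the dealer's roll is ignored. Standing is modeled as staying in the same state with probability 1. The names `TARGET`, `STAND_AT`, and `build_transition_matrix` are hypothetical.

```python
from fractions import Fraction

TARGET = 13    # sums above this value are a bust
STAND_AT = 10  # policy from (c): keep rolling while the sum is below 10


def build_transition_matrix():
    """Return P as a list of rows; P[i][j] is the probability of moving i -> j.

    Assumed indexing: index s - 1 is player sum s (1..13), the last index is bust.
    """
    n = TARGET + 1  # 13 sums plus one bust state
    bust = n - 1
    P = [[Fraction(0)] * n for _ in range(n)]

    for s in range(1, TARGET + 1):
        if s >= STAND_AT:
            # The player stands, so the sum no longer changes (absorbing state).
            P[s - 1][s - 1] = Fraction(1)
            continue
        # The player hits: each face of a fair die has probability 1/6.
        for d in range(1, 7):
            nxt = s + d
            col = nxt - 1 if nxt <= TARGET else bust
            P[s - 1][col] += Fraction(1, 6)

    # Bust is absorbing as well.
    P[bust][bust] = Fraction(1)
    return P


if __name__ == "__main__":
    P = build_transition_matrix()
    # Print the row for sum 9, the only hitting state that can exceed 13 by two faces.
    print([str(p) for p in P[8]])
```

Using `Fraction` keeps every entry exact, which makes it easy to check that each row sums to 1. A full answer to (a) would likely also need the dealer's sum in the state, which enlarges the matrix but follows the same construction.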