Reference no: EM132731584
BUST10133 Decision Analytics - University of Edinburgh
SECTION A
1. a) i) Describe the properties of an absorbing state in a Markov chain.
ii) Explain what the absorption probability means for an absorbing state in a Markov chain.
Consider the following Markov chain model of consumers' grocery purchasing behaviour across five supermarkets A, B, C, D and E. The state represents the supermarket a consumer currently prefers, so the set of possible states is {A, B, C, D, E}. The transition matrix models how consumers' supermarket preferences change during any month.
b) Consider customers preferring supermarket B. Calculate the probability that these customers switch to supermarket D or E three months later.
c) Calculate the probability that customers ever switch to supermarket D from supermarket B before reaching supermarket E.
d) Calculate the expected time until the customers switch from supermarket B to D before reaching supermarket E.
e) For some reason, supermarket E is closed. Customers preferring supermarket E have a 50% chance of switching to supermarket C and a 50% chance of switching to supermarket D.
Modify the Markov chain model and find the long-run probability distribution of the state of the process at the beginning of a month.
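The transition matrix for this question is not reproduced in the text above, so the calculations in parts (b) and (c) can only be illustrated with made-up numbers. The sketch below uses a purely hypothetical matrix P (an assumption, not the matrix from the question) to show one way of computing a three-step transition probability and the probability of reaching D before E. The helper names matmul, solve and hit_first are also illustrative.

```python
# Hypothetical monthly transition matrix over states A..E (each row sums to 1).
# These numbers are illustrative only; they are NOT the matrix from the question.
STATES = ["A", "B", "C", "D", "E"]
P = [
    [0.6, 0.1, 0.1, 0.1, 0.1],
    [0.2, 0.4, 0.1, 0.2, 0.1],
    [0.1, 0.2, 0.5, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.6, 0.1],
    [0.1, 0.1, 0.1, 0.1, 0.6],
]
B, D, E = (STATES.index(s) for s in "BDE")


def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


# Part (b) style: probability of being in D or E three months after starting in B.
P3 = matmul(matmul(P, P), P)
prob_D_or_E_in_3 = P3[B][D] + P3[B][E]

# Part (c) style: treat D and E as absorbing and solve (I - Q) h = r, where Q holds
# transitions among the remaining states and r the one-step probabilities
# of moving straight to the target state.
TRANSIENT = [i for i in range(len(STATES)) if i not in (D, E)]


def hit_first(target):
    """Probability of reaching `target` before the other stopping state."""
    A = [[(1.0 if i == j else 0.0) - P[i][j] for j in TRANSIENT]
         for i in TRANSIENT]
    rhs = [P[i][target] for i in TRANSIENT]
    return dict(zip(TRANSIENT, solve(A, rhs)))


h_D = hit_first(D)  # reach D before E
h_E = hit_first(E)  # reach E before D

print("P(in D or E after 3 months | start in B) =", round(prob_D_or_E_in_3, 4))
print("P(reach D before E | start in B) =", round(h_D[B], 4))
```

Replacing P with the actual matrix from the question would give the figures asked for. The conditional expected time in part (d) needs an additional step that this sketch does not cover.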
2. a) Explain why the maximum expected total reward is unsuitable as a decision-making criterion in an infinite horizon Markov decision process, and how to overcome this.
Consider the following Markov decision process model of the sales of a product. The state of the process is the classification of sales performance at the beginning of a month: excellent (e), good (g) or poor (p). Hence the set of possible states is S = {e, g, p}. The decision is the promotion for this month: nothing (n), direct marketing (d), or advertising (a). Hence the set of possible decisions in state i is K_i = {n, d, a} for all i in S. The following table shows r_i^k, the expected profit (in £'000s) during a month when the process is in state i and action k is chosen, and p_{i,j}^k, the probability that the process makes a transition to state j when the process is in state i and action k is chosen.
| i | k | r_i^k | p_{i,e}^k | p_{i,g}^k | p_{i,p}^k |
|---|---|-------|-----------|-----------|-----------|
| e | n | 100 | 0.5 | 0.5 | 0 |
| e | d | 80 | 0.7 | 0.3 | 0 |
| e | a | 50 | 1 | 0 | 0 |
| g | n | 60 | 0 | 0.6 | 0.4 |
| g | d | 40 | 0.5 | 0.5 | 0 |
| g | a | 10 | 0.8 | 0.2 | 0 |
| p | n | 10 | 0 | 0 | 1 |
| p | d | -10 | 0 | 0.4 | 0.6 |
| p | a | -30 | 0.3 | 0.7 | 0 |
The objective is to maximise the infinite horizon expected discounted reward with a discount factor of 0.8 per month.
b) Use policy iteration to determine whether the promotion decision that does nothing when the sales performance is excellent, does direct marketing when the sales performance is good, and does advertising when the sales performance is poor is optimal.
c) i) By how much can r_p^a change without affecting the conclusion from (b)?
ii) Suppose that p_{e,e}^d = q and p_{e,g}^d = 1 - q. Within what range of q does the conclusion from (b) still hold?
d) Suppose that it takes one month to organise a special promotion, meaning that the decision made at the beginning of a month is the special promotion to use in the next month, rather than this month. Explain how you would modify the Markov decision process model above for this situation. Explain how the state, action, immediate reward, and transition probability will change.
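One round of policy iteration for part (b) can be sketched as follows, using the rewards and transition probabilities from the table and the discount factor of 0.8. This is a minimal sketch under stated assumptions: the policy is evaluated by repeated substitution rather than by solving the linear equations directly, and the names MODEL, q_value, evaluate and improve are illustrative rather than part of the question.

```python
GAMMA = 0.8                      # discount factor per month
STATES = ["e", "g", "p"]         # excellent, good, poor
ACTIONS = ["n", "d", "a"]        # nothing, direct marketing, advertising

# (state, action) -> (expected profit in £'000s, [p to e, p to g, p to p])
MODEL = {
    ("e", "n"): (100, [0.5, 0.5, 0.0]),
    ("e", "d"): (80,  [0.7, 0.3, 0.0]),
    ("e", "a"): (50,  [1.0, 0.0, 0.0]),
    ("g", "n"): (60,  [0.0, 0.6, 0.4]),
    ("g", "d"): (40,  [0.5, 0.5, 0.0]),
    ("g", "a"): (10,  [0.8, 0.2, 0.0]),
    ("p", "n"): (10,  [0.0, 0.0, 1.0]),
    ("p", "d"): (-10, [0.0, 0.4, 0.6]),
    ("p", "a"): (-30, [0.3, 0.7, 0.0]),
}


def q_value(state, action, v):
    """Immediate reward plus discounted expected value of the next state."""
    reward, probs = MODEL[(state, action)]
    return reward + GAMMA * sum(p * v[j] for p, j in zip(probs, STATES))


def evaluate(policy, sweeps=500):
    """Approximate the discounted value of a fixed policy by repeated substitution."""
    v = {s: 0.0 for s in STATES}
    for _ in range(sweeps):
        v = {s: q_value(s, policy[s], v) for s in STATES}
    return v


def improve(v):
    """Policy improvement step: pick the best action in each state given v."""
    return {s: max(ACTIONS, key=lambda k: q_value(s, k, v)) for s in STATES}


policy = {"e": "n", "g": "d", "p": "a"}   # the policy described in part (b)
v = evaluate(policy)
improved = improve(v)

print({s: round(v[s], 2) for s in STATES})   # values near 380, 320, 240.4
print("policy unchanged by improvement step:", improved == policy)
```

Policy iteration stops when the improvement step returns the policy it started from, so the final comparison is the check that part (b) asks for. The same q_value function can be reused for the sensitivity questions in part (c) by editing a single entry of MODEL.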
SECTION B
3. The Edinburgh Cashmere Company sells cashmere scarfs. The company has found that when the winter is very cold they sell 10000 scarfs, when the winter is average they sell 7000 scarfs, and when the winter is mild they sell 5000 scarfs. Each scarf sells for £50 and costs £35 to make. Any scarfs left unsold at the end of the season are sold to a discount chain at half price (£25). The company has obtained a forecast for the coming winter, which gives a 30% chance that the winter will be very cold, a 40% chance that it will be average, and a 30% chance that it will be mild.
a) Under the Hurwicz decision criterion with the coefficient of optimism α = 0.6, how many scarfs should the company make to maximise profits?
Following the above information, the Edinburgh Cashmere Company can choose to run sales promotion A or promotion B when the winter is average or mild. By adopting Promotion A, the company can sell an additional 2000 scarfs, and by adopting Promotion B, it can sell an additional 3000 scarfs. Promotion A costs £15000, and Promotion B costs £30000. From past experience, Promotion A has a 35% success rate, and Promotion B has a 50% success rate.
b) Draw a decision tree that models this problem and use it to determine the policy that the company should adopt to maximise expected monetary value.
c) What is the expected value of perfect information on the winter weather?
d) Suppose the probability of promotion B being successful is unknown. Determine the range of this probability within which the policy determined in (b) remains optimal.
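For part (a) of this question, the Hurwicz score of each production quantity can be sketched as below. Two assumptions are made that the question does not state: the coefficient of optimism weights the best payoff (with the remainder weighting the worst), and only the three demand levels are considered as candidate production quantities. The names profit and hurwicz are illustrative.

```python
PRICE, COST, SALVAGE = 50, 35, 25          # £ per scarf: sale, production, discount chain
DEMANDS = {"very cold": 10000, "average": 7000, "mild": 5000}
ALPHA = 0.6                                # coefficient of optimism (assumed weight on best case)


def profit(made, demand):
    """Profit when `made` scarfs are produced and `demand` can be sold at full price."""
    sold = min(made, demand)
    leftover = made - sold
    return PRICE * sold + SALVAGE * leftover - COST * made


def hurwicz(made):
    """Weighted average of the best and worst payoffs for a production quantity."""
    payoffs = [profit(made, d) for d in DEMANDS.values()]
    return ALPHA * max(payoffs) + (1 - ALPHA) * min(payoffs)


# Assumption: candidate quantities are restricted to the three demand levels.
scores = {made: hurwicz(made) for made in DEMANDS.values()}
best = max(scores, key=scores.get)

for made, score in sorted(scores.items()):
    print(made, round(score))
print("best quantity under these assumptions:", best)
```

Some textbooks attach the coefficient to the worst payoff instead. Swapping ALPHA and 1 - ALPHA in hurwicz covers that convention.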
4. a) Give three reasons why the decision tree approach is not an efficient way to solve a sequential sampling or bandit problem.
A manufacturer produces mugs in batches of 1,000 which can either be accepted or rejected. If the batch is accepted, the manufacturer can make a profit of £5,000. If the batch is rejected, the manufacturer makes a loss of £3,000. The manufacturer estimates that the probability that the batch is accepted is 80%.
Before releasing this batch of mugs to customers, the manufacturer can either perform up to two independent tests on the sample mugs from this batch or rework this batch.
Reworking a batch involves examining every mug and replacing all faulty mugs with conforming ones so that the batch will definitely be accepted. The estimated cost of reworking a batch is £1000.
Performing a test costs £200, but the test does not give perfect information. Experience has shown that the probability the test returns a positive result given that the batch is good is 0.9, and the probability the test returns a negative result given that a batch is bad is 0.8.
b) Model this situation as a sequential sampling problem and find the functional equations for the problem.
c) Use the model from (b) to determine a testing policy the manufacturer should adopt in order to maximise its expected profit.
d) Calculate the expected value of perfect information (EVPI) for the problem.
e) Calculate the risk profile for the policy determined in (c).
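One possible formulation of the functional equations in part (b) takes the state to be the current probability p that the batch is good together with the number of tests still available. The sketch below follows that formulation under several assumptions that are interpretations rather than facts from the question: "accepted" is read as "the batch is good", a reworked batch earns the £5,000 profit less the £1,000 rework cost, test results are independent given the batch quality, and each test cost is charged when the test is taken. The function names are illustrative.

```python
PROFIT_GOOD, LOSS_BAD = 5000, -3000    # £ if the released batch is accepted / rejected
REWORK_COST, TEST_COST = 1000, 200     # £
P_POS_GOOD = 0.9                       # P(test positive | batch good)
P_NEG_BAD = 0.8                        # P(test negative | batch bad)
P_POS_BAD = 1 - P_NEG_BAD              # P(test positive | batch bad)


def prob_positive(p):
    """Probability of a positive test when the batch is good with probability p."""
    return P_POS_GOOD * p + P_POS_BAD * (1 - p)


def posterior(p, positive):
    """Bayes update of the probability that the batch is good after one test."""
    if positive:
        return P_POS_GOOD * p / prob_positive(p)
    return (1 - P_POS_GOOD) * p / (1 - prob_positive(p))


def value(p, tests_left):
    """Maximum expected profit from belief p with `tests_left` tests available.

    Assumed recursion: V_n(p) = max(release, rework, test), where testing costs
    TEST_COST and leads to V_{n-1} evaluated at the updated belief.
    """
    release = PROFIT_GOOD * p + LOSS_BAD * (1 - p)
    rework = PROFIT_GOOD - REWORK_COST
    options = [release, rework]
    if tests_left > 0:
        q = prob_positive(p)
        test = (-TEST_COST
                + q * value(posterior(p, True), tests_left - 1)
                + (1 - q) * value(posterior(p, False), tests_left - 1))
        options.append(test)
    return max(options)


prior = 0.8
print("P(good | positive) =", round(posterior(prior, True), 4))
print("P(good | negative) =", round(posterior(prior, False), 4))
print("V_0, V_1, V_2 at the prior:", [round(value(prior, n), 2) for n in range(3)])
```

Recording which option attains the maximum at each belief would turn the same recursion into the testing policy asked for in part (c).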