Reference no: EM133184961
Data Analytics
Level 4
Below are descriptions and the analyses that you should undertake for the datasets that are provided the excel file.
D1 (Measures Of Central Tendecies & Dispersion)
The prices of a sample of 260 houses, in the southern as well as the northern areas of a town.
Analyse the data using the examples in our weeks 1 - 3 learning. Specifically look at the additional weekly tasks. For each side of the town undertake the following;
• Draw an ogive that can be used to estimate all the three quartiles.
• Calculate the mean, Standard deviation, and the coefficient of variation.
• Compare measures of central tendencies and the dispersions for the two areas.
• Calculate (2 d.p.) all the three quartiles.
• Calculate the percentage likelihood of a prospective house buyer with a budget of only
£153,150 to buy a house in each of the two areas.
• The budget needed by a prospective house buyer, if he or she has a 65% likelihood to purchase a house in any of the two areas.
D2 (Measures Of Dispersion)
The table shows the money collected by ten charities in twenty city centres in U.K. when passer- by were asked to donate the excess coins in their wallets. Using excel, undertake the following analysis
For each city and each charity, calculate the three averages (Mean, mode, median); the measures of dispersion (range; the quartiles; interquartile range, variance, standard deviation); and the coefficient of variation (CV).
Put all the donations together and use the excel "frequency function" to construct a grouped frequency table with 9 class intervals. The class intervals should be of equal width. HINT: find the minimum value and use that to determine the LCB of the first class interval; and the maximum value and use that to determine the UCB of the last class interval. Also from the range, you can determine the width of each class interval. Also look in the module handbook on how to group frequency tables.
From the grouped frequency table undertake the following
• Calculate the mean, Standard deviation, and the coefficient of variation.
• Draw a histogram and comment on the shape of the diagram.
D3 (Index Numbers)
The data shows the monthly RPI index for U.K. from 2011 to date.
Your annual salary in 2011 was £21,500. To take care of inflations, your employer been increasing your salary by 2.5% p.a. up to 2020. Determine the yearly RPI for each year by taking the average (mean) of the monthly data. Using this yearly RPI data, index your salary to 2012, and calculate for each year your real earnings. Also for each year, calculate by what percentage increase should your salary be to bring it in line with inflation?
Using a simple linear regression and the first three quarter RPI, estimate the monthly RPI for the last quarter of 2021. Hint: use the months Jan - Dec (1 - 12) as the independent variable and the RPIs as the dependent variable. Another method will be to calculate the average increase and hence
the trend (see e.g. in the Times Series how to calculate the trend). What percentage increase should your employer offer in 2021 to bring your salary in line with inflation and what is this salary?
D4 (Time Series Case Study)
Analyse the given Time Series case study and determine the following,
• The Sum of 4
• The Sum of 2 4s
• The Trend
• The average seasonal variation (SV)
• Based on your analysis, forecast the turnover (£000) for the four quarters of 2022
• how accurate are these forecast? HINT: de-seasonalise the data and use that as part of your discussion on the accuracy of the forecast.
D5. (Simple Linear Regression & Correlation)
The data for this question are for each 27 stores of a grocery supermarket in U.K. The variables (Y, X1 to X5) are as follows
Y = annual net sales (£000s) X1 = number sq. ft. (1000) X2 = inventory (£000s)
X3 = amount spent on advertising (£000s) X4 = size of sales district
X5 = number of competing stores in district
Take Y as the dependent variable, and X1 - X5 as independent variables.
Use excel to undertake the following (NOTE: you are to undertake only simple pairwise analyses of the variables)
• Find Correlation coefficient between the dependent variable and each independent variable.
• Find Correlation coefficient between pairs of the independent variable.
• Undertake calculations to determine the variations in annual net sales that is explained by each independent variable.
• Undertake a simple linear regression to write out the equation for each of the five independent variables.
D6. (Probability Tree)
The data gives the independent probabilities that three machines will produce a faulty component in a factory during the course of the manufacturing. Use the information to draw a probability tree and label the end of the terminal branches as 1 to 8. Calculate the probability at the end of each terminal branch. HINT: the tree is similar to the example in the module handbook (see 9.13 Tutor Example 3 - Working with Decision Trees).
D8 (DECISION TREE)
A company has the opportunity of marketing a new package of computer games. There are two possible courses of action initially: abandon this project and sell the copyright for the games, or test the market for them. It is estimated that the copyright could be sold at this point for £40,000. Market testing would cost £5,000 and the consumer reaction could be positive or negative with current evidence suggests that this reaction is likely to be three-quarters positive. With a positive reaction, the company could either market the games nationally or they could sell the copyright which, with the endorsement of a positive consumer reaction, would then be worth £65,000. A negative reaction would persuade the company to abandon the project. Giving up the project at this point would mean selling the copyright for this package of games for a £15,000 contribution to costs. Research suggests that marketing the games nationally would result in the following probability distribution of sales:
The datasheet provides the drawing of the decision tree with the nodes numbered 1 - 4. Complete the tree with all the relevant probabilities and cash flows. Calculate the EMV at all the nodes.
D9 (FINANCIAL MATHEMATICS - PRESENT VALUE)
A company is considering undertaking six projects (A -F). The data shows the yearly revenues estimates after the completion of the projects. For each project calculate the yearly present value (PV) of the respective revenue. Use a discount rate of 8%.
D10 (DECISION TABLE - EMV)
The owner of a fruit shop in Leeds sells oranges in crates and he needs to decide how many crates of oranges to purchase each week to sell in his shop. He buys each crate for £20 and sells it at £35 per crate. Any oranges left unsold are returned to the wholesaler who refunds 40% of the original costs. The owner is considering a stocking strategy of 4, 5, 6, or 7 crates per week. Table 1 is the Conditional Profit (Payoff) table based on the weekly purchase (stocking strategy) of the oranges he gets, and the number sold (demand). Table 2 is the expected values based on the Conditional Profits and probabilities of selling the oranges. Complete both tables and calculate the expected monetary value (EMV) for each stocking strategy. The probability distribution (Prob.) of customer demands are given in table 1.
Attachment:- Assignment Paper.rar