Describe the data and discuss any interesting features

Reference no: EM132583667

Questions -

Q1. A group of senior citizens who have never used the internet before are given training over a period of 6 months. A sample of 3 of them is chosen at random and their numbers of hours of internet use are recorded for the 6 months, as shown in Figure 1.

(i) Briefly describe the data, discussing any interesting features. Based on Figure 1 only, suggest the form of a possible linear model of the hours of use per month (as response variable) and month (as explanatory variable).

(ii) Let y be the hours of use per month and x be the month. An analysis in R gave the following output: (see attached file)

(a) Write down the fitted model.

(b) Comment on the model and the quality of its goodness of fit, making appropriate reference to any goodness of fit diagnostics. State clearly any hypothesis you may use.

(c) Using one of the following R extracts

> qnorm(0.95)
[1] 1.644854
> qt(0.95, df=14)
[1] 1.76131
> qt(0.95, df=15)
[1] 1.75305
> qt(0.995, df=15)
[1] 2.946713

calculate 90% confidence intervals for the coefficient of $x$ and for the coefficient of $x^2$.
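For reference, a t-based interval for a single coefficient has the standard form sketched below. The degrees of freedom are an assumption: with 3 people observed over 6 months (n = 18) and a model containing an intercept, x and x squared (p = 3), n - p = 15, which would point to the `qt(0.95, df=15)` extract. The estimates and standard errors themselves have to be read from the R output in the attached file.

```latex
\hat\beta_j \;\pm\; t_{n-p}(0.95)\,\mathrm{se}(\hat\beta_j),
\qquad t_{15}(0.95) = 1.75305 .
```

The 0.95 quantile is used because a two-sided 90% interval leaves 5% in each tail.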

(d) For month x = 1 calculate a 90% predictive interval for the future observation y. You may use the following:

[Image: 918_figure.jpg (expression not reproduced in the text)]

where X is the design matrix of the linear model.
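As a sketch of how such an interval is usually assembled, the standard predictive interval for a new observation at covariate vector x_0 is shown below. It assumes the fitted model has an intercept, x and x squared, so that x_0 = (1, 1, 1)^T at x = 1; s denotes the residual standard error from the R output, and the quantity supplied in the image is presumably built from (X^T X)^{-1}.

```latex
\hat y_0 \;\pm\; t_{n-p}(0.95)\; s\,\sqrt{1 + x_0^{T}(X^{T}X)^{-1}x_0},
\qquad \hat y_0 = x_0^{T}\hat\beta,
\qquad x_0 = (1,\ x,\ x^2)^{T} = (1,\ 1,\ 1)^{T} \text{ at } x = 1 .
```

The extra 1 under the square root accounts for the noise in the future observation itself, which is what distinguishes a predictive interval from a confidence interval for the mean response.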

(e) A further R analysis gave additional output (not reproduced here). Calculate the correlation coefficient between the estimator of the gradient (the coefficient of $x$) and the estimator of the coefficient of $x^2$.
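The correlation between two coefficient estimators follows from their covariance matrix. A sketch of the relationship, writing \hat\beta_1 and \hat\beta_2 for the estimators of the coefficients of x and x squared (labels chosen here for illustration):

```latex
\operatorname{corr}(\hat\beta_1,\hat\beta_2)
= \frac{\operatorname{Cov}(\hat\beta_1,\hat\beta_2)}
       {\sqrt{\operatorname{Var}(\hat\beta_1)\,\operatorname{Var}(\hat\beta_2)}},
\qquad
\operatorname{Var}(\hat\beta) = \sigma^{2}(X^{T}X)^{-1}.
```

Because the factor \sigma^2 cancels in the ratio, the correlation can be computed from the entries of (X^T X)^{-1} alone, or equally from an estimated covariance matrix.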

Q2. A data-set on black cherry trees in the Allegheny National Forest, Pennsylvania, USA includes the height, radius (measured 4.5 feet above the ground) and volume, for each of 31 trees.

(i) A model $v_i = \beta_0 + \beta_1 r_i + \beta_2 h_i + \epsilon_i$ (1)

has been proposed, where $h_i$, $r_i$, $v_i$ are the natural logarithms of the height (in feet), radius (in feet) and volume (in cubic feet) of the $i$th tree, and $\epsilon_i \sim N(0, \sigma^2)$ independently for different trees. The following output summarizes the results of fitting this model in R.

Explain the hypothesis being tested by each of the three F statistics included in the output. What interpretation, if any, can be placed on their conclusions here?

(ii) Figure 2 shows the standardized deletion residuals for the model above. The following calculations can be used as the basis of a test on the standardized deletion residuals, using the Sidak correction.

> alpha=0.05
> prob=1-(1-alpha)^(1/31)
> qt(prob/2,27)
[1] -3.495321

Explain the interpretation of the values alpha and prob used in the calculation, and carry out the test.
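A minimal R sketch of the calculation in the extract, assuming its second line was meant to be the Sidak per-test level 1 - (1 - alpha)^(1/31). The names `n_tests` and `crit` are illustrative and do not appear in the original output.

```r
alpha <- 0.05                          # family-wise significance level
n_tests <- 31                          # one deletion residual per tree
prob <- 1 - (1 - alpha)^(1 / n_tests)  # Sidak per-test level (assumed form)
crit <- qt(prob / 2, df = 27)          # two-sided cut-off; 27 = 31 - 3 - 1
crit
```

Under this reading, a tree would be flagged as a potential outlier when the absolute value of its standardized deletion residual exceeds `abs(crit)`, which should agree with the value quoted in the question.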

(iii) Thinking about the trunk of each tree as a cylinder, a simple geometric calculation suggests that

$V_i \approx k R_i^2 H_i$ (2)

where $V_i = \exp(v_i)$ etc., and that $k \approx \pi$ (the usual circular constant). Explain why the model suggested by (2) can be represented as a special case of (1) under the null hypothesis that $\beta_1 = 2$ and $\beta_2 = 1$, and explain how that null hypothesis can be written in the general form

$C\beta = c$.
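One way to see the connection is to take logarithms of (2). The sketch below assumes the coefficient vector is ordered as (\beta_0, \beta_1, \beta_2)^T and treats the approximation in (2) as holding up to a multiplicative error term.

```latex
\log V_i = \log k + 2\log R_i + \log H_i
\;\Longrightarrow\;
v_i = \log k + 2\,r_i + h_i ,
\qquad
C = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix},\quad
\beta = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{pmatrix},\quad
c = \begin{pmatrix} 2 \\ 1 \end{pmatrix}.
```

The intercept \beta_0 plays the role of \log k and is left unrestricted by this hypothesis.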

Express the weaker hypothesis that $\beta_1 + \beta_2 = 3$ in a similar form, and calculate the corresponding F statistic, using the fact that

[Image: 1240_figure1.jpg (expression not reproduced in the text)]

What is the null distribution of this F statistic?
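For orientation, the usual F statistic for a general linear hypothesis with q independent restrictions is sketched below. Here s^2 denotes the residual mean square from the fit of (1), and the coefficient ordering (\beta_0, \beta_1, \beta_2)^T is an assumption.

```latex
F = \frac{(C\hat\beta - c)^{T}\bigl[\,C\,(X^{T}X)^{-1}C^{T}\bigr]^{-1}(C\hat\beta - c)}{q\,s^{2}},
\qquad
C = \begin{pmatrix} 0 & 1 & 1 \end{pmatrix},\quad c = 3,\quad q = 1 .
```

Under the null hypothesis and the normality assumption in (1), this statistic follows an F distribution with q and n - p degrees of freedom; with n = 31 trees and p = 3 coefficients that would be F with 1 and 28 degrees of freedom.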

Q3. (i) A laboratory experiment is intended to investigate the effect of a drug on certain species of micro-organisms. Tissue cultures containing set amounts of one of three species of micro-organisms (A, B, C) are each exposed to doses of the drug being tested; there are four different doses used, and two replicates of each combination of species and dose. Figure 3 shows a plot produced in R of the dose and response for each run, the points being coded by species.

Various models are being considered for the response as a function of species and dose. The output below shows summaries of results for two models; Response and Species have the obvious meaning, NumDose refers to the dose as a quantitative variable, and FacDose refers to the dose as a factor variable.

(a) Give the equations for these two models, explaining your notation and assumptions.

(b) Calculate the BIC for each of these two models. Based on the BIC, explain which of the two models you would prefer and why.
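A hedged sketch of how the BIC of a Gaussian linear model can be computed from summary quantities. The helper name `bic_gaussian` is hypothetical; it follows the convention of R's `BIC()` for `lm` fits, which counts the error variance as a parameter. The design described above (3 species, 4 doses, 2 replicates) suggests n = 24, while the number of coefficients p depends on the two models in the output.

```r
# Hypothetical helper: BIC from the residual sum of squares of a linear model.
# rss: residual sum of squares; n: number of observations;
# p: number of regression coefficients (including the intercept).
bic_gaussian <- function(rss, n, p) {
  k <- p + 1                                           # coefficients + variance
  loglik <- -n / 2 * (log(2 * pi) + log(rss / n) + 1)  # maximised log-likelihood
  -2 * loglik + k * log(n)
}

# Sanity check against R's own BIC() on a built-in data set.
fit <- lm(dist ~ speed, data = cars)
check <- all.equal(bic_gaussian(sum(resid(fit)^2), nrow(cars), 2), BIC(fit))
check
```

If the output reports only the residual standard error s and its degrees of freedom, the residual sum of squares can be recovered as s^2 times the residual degrees of freedom. The model with the smaller BIC is preferred.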

(c) What advantages and disadvantages do these two modelling approaches (dose as a factor, and dose as a numerical variable) have for this experiment, beyond those taken into account in the BIC?

(ii) Consider the linear model

$y_i = x_i^{T}\beta + \epsilon_i, \quad i = 1, 2, \ldots, n,$ (3)

where $\epsilon_i$ is an i.i.d. sequence of random variables with zero mean and variance $\mathrm{Var}(\epsilon_i) = \sigma^2 c_i$, for some variance $\sigma^2$ and $c_i > 0$.

Discounted least squares considers the maximum likelihood estimator $\hat\beta$ of $\beta$, which minimises the discounted sum of squares

$S_\delta(\beta) = \sum_{i=1}^{n} \delta^{n-i} (y_i - x_i^{T}\beta)^2,$

for some discount factor $\delta$ that satisfies $0 < \delta \le 1$.

(a) Show that discounted least squares is a special case of weighted least squares (WLS) and calculate the weights of WLS as functions of $\delta$.

(b) Using the relationship of discounted least squares and WLS as in (a), derive the variance of $\epsilon_i$ as a function of $\sigma^2$ and $\delta$.
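The link between the two criteria can be sketched as follows, assuming the usual WLS convention that each weight is inversely proportional to the corresponding error variance (the symbol w_i is introduced here for illustration).

```latex
S_\delta(\beta) = \sum_{i=1}^{n} w_i\,(y_i - x_i^{T}\beta)^2,
\qquad w_i = \delta^{\,n-i};
\qquad
\operatorname{Var}(\epsilon_i) = \frac{\sigma^{2}}{w_i} = \sigma^{2}\,\delta^{\,i-n},
\quad \text{i.e. } c_i = \delta^{\,i-n}.
```

With \delta < 1, older observations (small i) receive smaller weights, which corresponds to treating them as noisier.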

(c) For the simple linear regression model with no intercept and a near constant covariate $x_i \approx x$, i.e.

$y_i \approx x\beta + \epsilon_i,$

show that

$\hat\beta = \dfrac{1 - \delta}{x(1 - \delta^{n})} \sum_{i=1}^{n} \delta^{n-i} y_i.$
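The closed form can be checked numerically against a weighted least squares fit. The sketch below uses simulated data; every value in it (`n`, `delta`, `x`, the true slope 1.5) is made up for illustration, and the covariate is taken to be exactly constant. It also assumes the sum in the formula carries the discount weights delta^(n - i), consistent with the discounted sum of squares.

```r
set.seed(1)
n <- 20
delta <- 0.9                        # discount factor, strictly below 1 here
x <- 2                              # constant covariate
y <- 1.5 * x + rnorm(n)             # simulated responses
w <- delta^(n - seq_len(n))         # discount weights delta^(n - i)
xs <- rep(x, n)
fit <- lm(y ~ 0 + xs, weights = w)  # weighted least squares, no intercept
closed <- (1 - delta) / (x * (1 - delta^n)) * sum(w * y)
agree <- all.equal(unname(coef(fit)), closed)
agree
```

The factor (1 - delta) / (1 - delta^n) is the reciprocal of the geometric sum of the weights, so the expression as written applies for delta < 1; for delta = 1 the sum of the weights is simply n.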

Attachment:- Statistics Assignment File.rar
