Compute and plot the leverage of each point

Reference no: EM132263005

Method of Data Analysis Assignment

Please use Rmarkdown to write your solutions and submit your solutions with relevant R code included as a pdf file.

Question - Census data was collected on the 50 states and Washington, D.C. We are interested in determining whether average lifespan (LIFE) is related to the ratio of males to females in percent (MALE), birth rate per 1,000 people (BIRTH), divorce rate per 1,000 people (DIVO), number of hospital beds per 100,000 people (BEDS), percentage of population 25 years or older having completed 16 years of school (EDUC) and per capita income (INCO). The data stored in the data file Census.txt can be found on the course website.

Answer the following questions.

Part 1: In this part, compute by hand using matrix formulas. DO NOT USE the lm() command in this part.

We consider a multiple linear regression model with LIFE (y) as the response variable, and MALE (x1), BIRTH (x2), DIVO (x3), BEDS (x4), EDUC (x5), and INCO (x6) as predictors. Answer the following questions using the least-squares estimates in terms of matrix formulas.

(a) Compute and report the least-squares estimates. Write down the least-squares regression equation.

(b) Explain in context what the coefficients corresponding to MALE and BIRTH mean.

(c) Compute the biased and the unbiased estimates of the error variance σ².

(d) Using the unbiased estimate of the error variance, compute the standard errors of the estimators of the regression coefficients.

(e) Compute the coefficient of determination. Give a practical interpretation of your result.
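
The matrix formulas behind Part 1 can be sketched in R without calling lm(). This is only an illustration under stated assumptions: the data are simulated stand-ins (Census.txt is not reproduced here), only the column names come from the assignment, and all object names (X_raw, beta_hat, se_beta, and so on) are hypothetical. The sketch uses beta_hat = (X'X)^{-1} X'y, RSS/n and RSS/(n - p) for the biased and unbiased variance estimates, and 1 - RSS/SST for the coefficient of determination.

```r
# Simulated stand-in for the census data (an assumption; swap in the real file).
set.seed(1)
n <- 51                                   # 50 states plus Washington, D.C.
predictors <- c("MALE", "BIRTH", "DIVO", "BEDS", "EDUC", "INCO")
X_raw <- matrix(rnorm(n * 6), nrow = n, dimnames = list(NULL, predictors))
y <- as.numeric(70 + X_raw %*% c(0.1, -0.5, -0.2, 0, 0.3, 0) + rnorm(n, sd = 0.8))

# Design matrix: a column of ones for the intercept, then the six predictors.
X <- cbind(Intercept = 1, X_raw)
p <- ncol(X)                              # number of coefficients, intercept included

# (a) Least-squares estimates: beta_hat = (X'X)^{-1} X'y
XtX_inv  <- solve(crossprod(X))
beta_hat <- XtX_inv %*% crossprod(X, y)

# Residuals and residual sum of squares
res <- y - drop(X %*% beta_hat)
rss <- sum(res^2)

# (c) Biased (divide by n) and unbiased (divide by n - p) variance estimates
sigma2_biased   <- rss / n
sigma2_unbiased <- rss / (n - p)

# (d) Standard errors: square roots of the diagonal of sigma2 * (X'X)^{-1}
se_beta <- sqrt(sigma2_unbiased * diag(XtX_inv))

# (e) Coefficient of determination: 1 - RSS / SST
sst       <- sum((y - mean(y))^2)
r_squared <- 1 - rss / sst

print(cbind(estimate = drop(beta_hat), std_error = se_beta))
print(c(sigma2_biased = sigma2_biased, sigma2_unbiased = sigma2_unbiased,
        r_squared = r_squared))
```

With the real data, X_raw and y would be built from the columns of the data file instead of being simulated; how the file is laid out (header row, separators) is not stated in the assignment and would need to be checked.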

Part 2: In this part, you may use all R commands you need, including lm() function, to answer the following questions.

(a) Fit the MLR model with LIFE (y) as the response variable, and MALE (x1), BIRTH (x2), DIVO (x3), BEDS (x4), EDUC (x5), and INCO (x6), as predictors.

(b) At level α = 5%, conduct the F-test for the overall fit of the regression. Comment on the results.

(c) At level α = 1%, test each of the individual regression coefficients. Do the results indicate that any of the explanatory variables should be removed from the model?

(d) Determine the regression model with the explanatory variable(s) identified in part (c) removed. Write down the estimated regression equation.

(e) Perform a partial F-test at level α = 1% to determine whether the variables associated with MALE and INCO can be removed from the model.

(f) Compute and report the F-test statistic for comparing the two models

E(Yi|xi) = β0 + β1xi1

and

E(Yi|xi) = β0 + β1xi1 + β2xi2 + β3xi3 + β4xi4 + β5xi5 + β6xi6.

(g) Perform a partial F-test at level α = 1% for comparing the two models

E(Yi|xi) = β0

and

E(Yi|xi) = β0 + β1xi1 + β2xi2.

(h) Compute and report the terms in the decomposition

SSreg(β1, β2, β3 | β0) = SSreg(β3 | β0) + SSreg(β2 | β0, β3) + SSreg(β1 | β0, β3, β2).

(i) Suppose we are interested in fitting a regression model using LIFE as the response variable and some subset of the variables (MALE, BIRTH, DIVO, and INCO) as predictors.

(i.1) Perform variable selection by finding the subset model that minimizes the AIC criterion. State the 'best model'.

(i.2) Perform variable selection using forward selection. State the 'best model'.

(i.3) Perform variable selection using backward selection. State the 'best model'.
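
The model comparisons in Part 2 could be set up along the following lines. This is a hedged sketch, not a solution: the data frame census is simulated (an assumption standing in for Census.txt), and the object names are hypothetical. It shows a partial F statistic computed from residual sums of squares and checked against anova(), sequential sums of squares from anova() on a single fit, and forward and backward selection with step(), which uses AIC by default.

```r
# Simulated stand-in for Census.txt (an assumption; real results will differ).
set.seed(2)
n <- 51
census <- data.frame(MALE = rnorm(n), BIRTH = rnorm(n), DIVO = rnorm(n),
                     BEDS = rnorm(n), EDUC = rnorm(n), INCO = rnorm(n))
census$LIFE <- 70 - 0.5 * census$BIRTH - 0.3 * census$DIVO +
  0.4 * census$EDUC + rnorm(n, sd = 0.8)

# Full model and a reduced model with MALE and INCO dropped.
full    <- lm(LIFE ~ MALE + BIRTH + DIVO + BEDS + EDUC + INCO, data = census)
reduced <- lm(LIFE ~ BIRTH + DIVO + BEDS + EDUC, data = census)

# Partial F statistic: ((RSS_reduced - RSS_full) / q) / (RSS_full / df_full)
rss_full <- sum(resid(full)^2)
rss_red  <- sum(resid(reduced)^2)
q        <- df.residual(reduced) - df.residual(full)   # coefficients tested (2 here)
f_stat   <- ((rss_red - rss_full) / q) / (rss_full / df.residual(full))
p_value  <- pf(f_stat, q, df.residual(full), lower.tail = FALSE)

# anova() on two nested fits reports the same F statistic and p-value.
f_table <- anova(reduced, full)
print(f_table)

# Sequential sums of squares: terms are added in the order written,
# here DIVO, then BIRTH given DIVO, then MALE given DIVO and BIRTH.
seq_fit   <- lm(LIFE ~ DIVO + BIRTH + MALE, data = census)
seq_table <- anova(seq_fit)
print(seq_table)

# Forward and backward selection over MALE, BIRTH, DIVO, INCO using AIC.
null_fit <- lm(LIFE ~ 1, data = census)
big_fit  <- lm(LIFE ~ MALE + BIRTH + DIVO + INCO, data = census)
forward_fit  <- step(null_fit,
                     scope = list(lower = ~ 1, upper = ~ MALE + BIRTH + DIVO + INCO),
                     direction = "forward", trace = 0)
backward_fit <- step(big_fit, direction = "backward", trace = 0)
print(formula(forward_fit))
print(formula(backward_fit))
```

Note that step() searches one variable at a time, so it need not visit every subset. With only four candidate predictors, all 16 subset models could also be fitted and compared with AIC() directly.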

Part 3: In this part, you may use all R commands you need.

We consider the multiple linear regression with LIFE (y) as the response variable, and MALE, BIRTH, DIVO, BEDS, EDUC, and INCO, as predictors.

(a) Plot the standardized residuals against the fitted values. Are there any notable points? In particular, look for points with large residuals or that may be influential.

(b) Compute and plot the leverage of each point. Identify any points that have a leverage larger than 0.5.

(c) Compute the Cook's distance for each point. Identify any points that have a Cook's distance larger than 1. Are these the same observations as those seen in part (b)?

(d) Plot the standardized residuals against the variable BEDS. Specifically mark the point corresponding to Washington, D.C. What can you say about this observation?

(e) Remove the observation corresponding to Washington, D.C. and refit the model. Are there any notable differences with the model fit in part (a)?

(f) Plot the standardized residuals against each of the 6 explanatory variables. Specifically mark the observation corresponding to UT. What is notable about this state?

(g) Remove the observation corresponding to UT and refit the model. Are there any notable differences with the model fit in part (a)? In particular, how does UT's exclusion impact the R² value?
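
The diagnostics in Part 3 map onto standard R functions: rstandard() for standardized residuals, hatvalues() for leverage, and cooks.distance() for Cook's distance. The sketch below is illustrative only. The data are simulated, and the extreme BEDS value placed in row 9 is injected purely to produce a high-leverage point; it is an assumption for demonstration, not a claim about any state in the real data. The index odd_row and the other object names are hypothetical.

```r
# Simulated stand-in for Census.txt (an assumption; real results will differ).
set.seed(3)
n <- 51
census <- data.frame(MALE = rnorm(n), BIRTH = rnorm(n), DIVO = rnorm(n),
                     BEDS = rnorm(n), EDUC = rnorm(n), INCO = rnorm(n))
census$LIFE <- 70 - 0.5 * census$BIRTH + 0.4 * census$EDUC + rnorm(n, sd = 0.8)

odd_row <- 9                      # hypothetical index of an unusual observation
census$BEDS[odd_row] <- 15        # injected extreme value, for illustration only

fit <- lm(LIFE ~ MALE + BIRTH + DIVO + BEDS + EDUC + INCO, data = census)

std_res <- rstandard(fit)         # standardized residuals
lev     <- hatvalues(fit)         # leverages: diagonal of the hat matrix
cooks   <- cooks.distance(fit)    # Cook's distances

# Standardized residuals against fitted values, with +/- 2 reference lines.
plot(fitted(fit), std_res, xlab = "Fitted values", ylab = "Standardized residuals")
abline(h = c(-2, 2), lty = 2)

# Leverage of each point, with the 0.5 threshold from part (b).
plot(lev, type = "h", xlab = "Observation", ylab = "Leverage")
abline(h = 0.5, lty = 2)

high_lev  <- which(lev > 0.5)     # points with leverage above 0.5
high_cook <- which(cooks > 1)     # points with Cook's distance above 1
print(high_lev)
print(high_cook)

# Standardized residuals against BEDS, marking the unusual observation.
plot(census$BEDS, std_res, xlab = "BEDS", ylab = "Standardized residuals")
points(census$BEDS[odd_row], std_res[odd_row], pch = 19, col = "red")

# Refit without the unusual observation and compare coefficients and R-squared.
fit_drop <- lm(formula(fit), data = census[-odd_row, ])
print(cbind(all_points = coef(fit), one_removed = coef(fit_drop)))
print(c(r2_all = summary(fit)$r.squared, r2_drop = summary(fit_drop)$r.squared))
```

With the real data, the rows for Washington, D.C. and UT would be located by whatever identifier column the file provides; that column's name is not given in the assignment.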

Textbook - Springer Texts in Statistics - A Modern Approach to Regression with R. Author: Simon J. Sheather. ISBN: 978-0-387-09607-0.

Course - STA302/1001H1S Method of Data Analysis Assignment, University of Toronto, Canada.

Instructions - This is an individual assignment worth 100 points. Please use Rmarkdown to write your solutions and submit them, with the relevant R code included, as pdf files via Crowdmark. Save all of Part 1 as one pdf, all of Part 2 as a separate pdf, and all of Part 3 as a third pdf.


All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd