What is the cutoff for high leverage in the given scenario

Reference no: EM131155919

STATISTICS Homework

1. Load the dataset "PatientSatisfaction.txt" into R. The goal of this analysis is to determine the best subset of predictor variables for predicting patient satisfaction.

(a) Indicate which subset of predictor variables is optimal according to each of the following criteria: AICp, Mallows' Cp, BICp, and PRESSp (i.e., use best-subsets variable selection).

(b) Do the four criteria listed above identify the same optimal model? Will this always be the case?

(c) Would forward stepwise regression have any advantages as a screening procedure over best subsets selection?
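As an illustration of what the four criteria in part (a) measure, here is a minimal sketch in plain Python (not R) that enumerates every subset of a few predictors and computes each criterion by hand. The data, the column names x1 to x3, and the helper names are all hypothetical, and the formulas are one common textbook form (p counts the intercept): Cp = SSEp/MSE(full) - (n - 2p), AICp = n ln(SSEp) - n ln(n) + 2p, BICp = n ln(SSEp) - n ln(n) + p ln(n), and PRESSp = sum of (e_i / (1 - h_ii))^2. Other references scale these differently, so treat the numbers as comparable only within one convention.

```python
import itertools
import math

def inverse(a):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    m = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        d = m[c][c]
        m[c] = [v / d for v in m[c]]
        for r in range(n):
            if r != c:
                f = m[r][c]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[c])]
    return [row[n:] for row in m]

def fit(X, y):
    """OLS fit; X rows start with 1.0. Returns (residuals, leverages)."""
    p = len(X[0])
    xtx = [[sum(r[a] * r[b] for r in X) for b in range(p)] for a in range(p)]
    inv = inverse(xtx)
    xty = [sum(r[a] * yi for r, yi in zip(X, y)) for a in range(p)]
    beta = [sum(inv[a][b] * xty[b] for b in range(p)) for a in range(p)]
    resid = [yi - sum(bj * xj for bj, xj in zip(beta, r)) for r, yi in zip(X, y)]
    lev = [sum(r[a] * inv[a][b] * r[b] for a in range(p) for b in range(p))
           for r in X]
    return resid, lev

def best_subsets(cols, y):
    """Compute Cp, AIC, BIC, and PRESS for every non-empty predictor subset."""
    names, n = list(cols), len(y)

    def design(subset):
        return [[1.0] + [cols[k][i] for k in subset] for i in range(n)]

    full_resid, _ = fit(design(names), y)
    mse_full = sum(e * e for e in full_resid) / (n - len(names) - 1)
    rows = []
    for k in range(1, len(names) + 1):
        for subset in itertools.combinations(names, k):
            resid, lev = fit(design(subset), y)
            sse, p = sum(e * e for e in resid), k + 1
            base = n * math.log(sse) - n * math.log(n)
            rows.append({
                "subset": subset,
                "Cp": sse / mse_full - (n - 2 * p),
                "AIC": base + 2 * p,
                "BIC": base + p * math.log(n),
                "PRESS": sum((e / (1 - h)) ** 2 for e, h in zip(resid, lev)),
            })
    return rows

# Hypothetical toy data, NOT the PatientSatisfaction dataset.
cols = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "x2": [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0],
    "x3": [5.0, 3.0, 8.0, 1.0, 9.0, 2.0, 7.0, 4.0],
}
y = [3.1, 3.9, 7.2, 6.8, 11.1, 10.2, 14.9, 13.8]
table = best_subsets(cols, y)
for name in ("Cp", "AIC", "BIC", "PRESS"):
    best = min(table, key=lambda row: row[name])
    print(name, "selects", best["subset"])
```

Taking the minimum is a simplification for Cp, which is often judged by being small and close to p rather than by the smallest value alone. Because BIC penalizes each extra parameter by ln(n) instead of 2, it tends to favor smaller models than AIC once n exceeds about 7, which is one reason the criteria need not agree.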

2. Load the dataset "Bears.csv". Data from n = 19 female wild bears of varying ages are used to estimate the relationship between Y = weight and X = neck circumference.

(a) One of the observations takes the value (x, y) = (10.5, 140). Identify this observation in the dataset. Visually, does this observation appear to be an outlier with respect to any of the following: X, Y, or the general linear relationship between Y and X (i.e., Y | X)? Justify with the appropriate plots.

(b) Compute the leverage for the observation (x, y) = (10.5, 140). What is the cutoff for high leverage in this scenario? Using the rule of thumb for leverages presented in class, state whether this point has high leverage.

(c) What is the consequence of including a point with high leverage?

(d) Using the lm() output, calculate each of the following for the observation (x, y) = (10.5, 140). For some, you will also need the leverage calculated above.

i. Studentized residual
ii. Studentized deleted residual
iii. Standardized DFFITS value.

(e) Using the quantities calculated in the previous part, should (x, y) = (10.5, 140) be flagged as an outlier?

(f) For the observation (x, y) = (10.5, 140), calculate the following and justify whether this point has a strong influence on the model fit.

i. DFBETA
ii. Cook's Distance
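The quantities asked for in Question 2 can be checked by hand for a simple linear regression. The sketch below is in plain Python rather than R and uses made-up data, not the Bears dataset; the function names are hypothetical. It assumes the class rule of thumb for leverage is the widely used 2p/n (with p = 2 coefficients, that would be 4/19, about 0.21, for n = 19), which may differ from what was actually presented in class. The formulas follow common textbook definitions: leverage h = 1/n + (x - xbar)^2 / Sxx, studentized residual e / sqrt(MSE (1 - h)), studentized deleted residual e sqrt((n - p - 1) / (SSE (1 - h) - e^2)), DFFITS = t sqrt(h / (1 - h)), and Cook's distance = e^2 / (p MSE) * h / (1 - h)^2.

```python
import math

def fit_line(x, y):
    """Least-squares intercept, slope, mean of x, and Sxx for y = b0 + b1*x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    return ybar - b1 * xbar, b1, xbar, sxx

def leverage_cutoff(n, p):
    """Assumed rule of thumb: leverage above 2p/n is considered high."""
    return 2 * p / n

def case_diagnostics(x, y, i):
    """Outlier and influence measures for observation i (0-based)."""
    n, p = len(x), 2
    b0, b1, xbar, sxx = fit_line(x, y)
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sse = sum(e * e for e in resid)
    mse = sse / (n - p)
    h = 1 / n + (x[i] - xbar) ** 2 / sxx
    e = resid[i]
    r = e / math.sqrt(mse * (1 - h))                        # studentized
    t = e * math.sqrt((n - p - 1) / (sse * (1 - h) - e * e))  # deleted
    mse_del = (sse - e * e / (1 - h)) / (n - p - 1)         # MSE without case i
    _, b1_del, _, _ = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
    return {
        "leverage": h,
        "high_leverage": h > leverage_cutoff(n, p),
        "studentized": r,
        "studentized_deleted": t,
        "dffits": t * math.sqrt(h / (1 - h)),
        "dfbetas_slope": (b1 - b1_del) / math.sqrt(mse_del / sxx),
        "cooks_d": (e * e / (p * mse)) * h / (1 - h) ** 2,
    }

# Hypothetical toy data, NOT the Bears dataset; the last point is far out in x.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 12.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 30.0]
diag = case_diagnostics(x, y, 6)
for name, value in diag.items():
    print(name, value)
```

In R, the built-in functions hatvalues(), rstandard(), rstudent(), dffits(), dfbetas(), and cooks.distance() applied to an lm() fit return the corresponding quantities for every observation, so a hand calculation like this one can be compared against them.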

3. Data were collected from n = 51 "states" (including the District of Columbia) on the salaries of public school teachers.

(a) Regress Y = average teacher annual salary on X1 = spending per pupil in dollars, X2 = a dummy indicator (1/0) for region 2, and X3 = a dummy indicator (1/0) for region 3. Plot the standardized residuals versus the fitted values.

(b) Plot a histogram of the studentized deleted residuals. Are there any outliers in these data? If so, list their index numbers.

(c) Create a plot of leverages from this model. Are there any outliers with respect to the covariates?

(d) Create plots of Cook's Distances and DFFITS to determine whether any observations have strong influence on the model fit.
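For Question 3, the model has p = 4 coefficients (intercept, X1, X2, X3) and n = 51 observations. The sketch below, in plain Python with hypothetical helper names, shows how the two region indicators could be coded and how observations might be flagged against two commonly quoted rules of thumb: leverage above 2p/n and |DFFITS| above 2 sqrt(p/n). These cutoffs are assumptions; the course may use different ones (for example, |DFFITS| above 1 for small to medium datasets), and Cook's distances are usually judged against an F(p, n - p) distribution, which is not computed here.

```python
import math

def region_dummies(region):
    """Return (X2, X3) for a region code; any other region is the baseline."""
    return (1 if region == 2 else 0, 1 if region == 3 else 0)

def rule_of_thumb_cutoffs(n, p):
    """Assumed guidelines: 2p/n for leverage, 2*sqrt(p/n) for |DFFITS|."""
    return {"leverage": 2 * p / n, "dffits": 2 * math.sqrt(p / n)}

def flag(values, cutoff):
    """1-based indices of observations whose absolute value exceeds cutoff."""
    return [i for i, v in enumerate(values, start=1) if abs(v) > cutoff]

cut = rule_of_thumb_cutoffs(n=51, p=4)
print("leverage cutoff:", cut["leverage"])
print("DFFITS cutoff:", cut["dffits"])

# Hypothetical leverages for three observations, for illustration only.
print(flag([0.05, 0.20, 0.10], cut["leverage"]))
```

Returning 1-based indices keeps the flagged positions consistent with R's row numbering, which is what "index number" in part (b) most likely refers to.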

Attachment:- HW_Data.zip




All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd