Best subsets regression, Advanced Statistics

Assignment Help:

The time series plot and the scatter plots showed many clearly visible outliers. These were removed to check whether they were influential or had high leverage, and to see whether the assumptions of the multiple regression model are met.

The rows below are the outliers that I removed from the 1519 observations:

77, 674, 448, 757, 317, 549, 1187, 1198, 26, 456, 405, 307, 1205, 1348, 611, 368, 309
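
As a rough illustration of the removal step, the sketch below drops the listed rows from a dataset held as a Python list. It assumes the row numbers are 1-based (as in a Minitab worksheet); the `records` list is a placeholder standing in for the real data, and `drop_rows` is a hypothetical helper name, not part of the original analysis.

```python
# Row numbers reported as removed (assumed to be 1-based worksheet rows).
OUTLIER_ROWS = [77, 674, 448, 757, 317, 549, 1187, 1198, 26, 456,
                405, 307, 1205, 1348, 611, 368, 309]


def drop_rows(records, rows_to_drop):
    """Return a copy of records without the given 1-based row numbers."""
    drop = set(rows_to_drop)
    return [rec for i, rec in enumerate(records, start=1) if i not in drop]


# Placeholder data: 1519 dummy records standing in for the real observations.
records = list(range(1, 1520))
cleaned = drop_rows(records, OUTLIER_ROWS)
print(len(records), len(cleaned))
```

Refitting the model on both `records` and `cleaned` and comparing the coefficients is one way to judge whether the removed points were influential.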

Best Subsets Regression: wfood versus totexp, income, age, nk

Response is wfood

Vars  R-Sq  R-Sq(adj)  Mallows Cp         S  totexp  income  age  nk
   1  22.9       22.9        67.4  0.092326    X
   1   5.5        5.4       424.9   0.10222            X
   2  24.8       24.7        31.3  0.091236    X                    X
   2  24.2       24.1        42.7  0.091572    X              X
   3  26.1       26.0         6.1  0.090461    X              X     X
   3  24.8       24.7        32.3  0.091239    X       X            X
   4  26.3       26.1         5.0  0.090397    X       X      X     X

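Best subsets regression is a way of identifying which of the independent variables (totexp, income, age and nk) are best suited to the regression model. In the results above, the model containing only income has the highest Mallows' Cp (424.9) and the lowest R-squared (5.5%). Adding income to the model with totexp and nk leaves R-squared unchanged at 24.8%, and the three-variable model without income (totexp, age, nk) has an R-squared of 26.1%, almost the same as the 26.3% of the full model. Income will therefore be dropped to see whether the reduced model fits the data.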
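
The columns in the output can be reproduced, at least in outline, with the sketch below. It is a minimal illustration and not Minitab's own algorithm: it fits every subset by ordinary least squares and computes R-Sq, R-Sq(adj), Mallows' Cp and S using the usual textbook definitions, with Cp = SSE_p / MSE_full - (n - 2p), where p counts the intercept. The data are synthetic and made up purely so the code runs; only the variable names are taken from the output above, and `ols_sse` and `best_subsets` are hypothetical helper names.

```python
import math
import random
from itertools import combinations


def ols_sse(x_cols, y):
    """Fit y on an intercept plus x_cols; return the residual sum of squares."""
    n = len(y)
    cols = [[1.0] * n] + [list(c) for c in x_cols]
    k = len(cols)
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    a = [[sum(u * v for u, v in zip(cols[i], cols[j])) for j in range(k)]
         + [sum(u * v for u, v in zip(cols[i], y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for i in range(k):
        piv = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[piv] = a[piv], a[i]
        for r in range(k):
            if r != i:
                f = a[r][i] / a[i][i]
                a[r] = [u - f * v for u, v in zip(a[r], a[i])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    fitted = [sum(b * c[t] for b, c in zip(beta, cols)) for t in range(n)]
    return sum((yt - ft) ** 2 for yt, ft in zip(y, fitted))


def best_subsets(data, response, predictors):
    """Return fit statistics for every non-empty subset of predictors."""
    y = data[response]
    n = len(y)
    ybar = sum(y) / n
    sst = sum((v - ybar) ** 2 for v in y)
    p_full = len(predictors) + 1
    mse_full = ols_sse([data[v] for v in predictors], y) / (n - p_full)
    rows = []
    for size in range(1, len(predictors) + 1):
        for subset in combinations(predictors, size):
            sse = ols_sse([data[v] for v in subset], y)
            p = size + 1  # number of parameters, including the intercept
            rows.append({
                "vars": subset,
                "r_sq": 100 * (1 - sse / sst),
                "r_sq_adj": 100 * (1 - (sse / (n - p)) / (sst / (n - 1))),
                "cp": sse / mse_full - (n - 2 * p),
                "s": math.sqrt(sse / (n - p)),
            })
    return rows


# Synthetic, made-up data: NOT the dataset analysed above.
random.seed(0)
n_obs = 200
totexp = [random.gauss(0, 1) for _ in range(n_obs)]
income = [t + random.gauss(0, 1) for t in totexp]
age = [random.gauss(0, 1) for _ in range(n_obs)]
nk = [float(random.choice([1, 2])) for _ in range(n_obs)]
wfood = [0.4 - 0.05 * t + 0.01 * a + 0.03 * k + random.gauss(0, 0.09)
         for t, a, k in zip(totexp, age, nk)]
data = {"wfood": wfood, "totexp": totexp, "income": income,
        "age": age, "nk": nk}

rows = best_subsets(data, "wfood", ["totexp", "income", "age", "nk"])
for row in sorted(rows, key=lambda r: (len(r["vars"]), -r["r_sq"])):
    print(len(row["vars"]), round(row["r_sq"], 1), round(row["r_sq_adj"], 1),
          round(row["cp"], 1), round(row["s"], 6), ", ".join(row["vars"]))
```

With these definitions the full model always has Cp equal to its number of parameters (5 here), which is why the four-variable row above shows Cp = 5.0. A smaller model is usually considered adequate when its Cp is close to its own p.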




All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd