Best subsets regression, Advanced Statistics

The time series plot and the scatter plots showed many clearly visible outliers. These were removed to check whether they were influential or had high leverage, and to see whether the assumptions of the multiple regression model are met.

Below are the row numbers of the outliers that I removed from the 1519 observations:

77, 674, 448, 757, 317, 549, 1187, 1198, 26, 456, 405, 307, 1205, 1348, 611, 368, 309
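A minimal sketch of how leverage and influence could be checked before dropping these rows. It assumes the row numbers are 1-based worksheet rows and uses synthetic data as a stand-in, because the real data set is not included in the post. The function names `influence_measures` and `drop_rows` are hypothetical helpers, not part of any package.

```python
import numpy as np

# Row numbers listed in the post, assumed to be 1-based worksheet rows.
OUTLIER_ROWS = [77, 674, 448, 757, 317, 549, 1187, 1198, 26,
                456, 405, 307, 1205, 1348, 611, 368, 309]


def influence_measures(X, y):
    """Return (leverage, Cook's distance) for an OLS fit with intercept."""
    A = np.column_stack([np.ones(len(y)), X])   # design matrix
    n, p = A.shape
    # Leverage = diagonal of the hat matrix, computed from a thin QR.
    Q, _ = np.linalg.qr(A)
    leverage = (Q ** 2).sum(axis=1)
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    mse = resid @ resid / (n - p)
    # Cook's distance combines residual size and leverage.
    cooks = resid ** 2 / (p * mse) * leverage / (1 - leverage) ** 2
    return leverage, cooks


def drop_rows(X, y, rows_1_based):
    """Remove the given 1-based rows from X and y."""
    keep = np.ones(len(y), dtype=bool)
    keep[np.asarray(rows_1_based) - 1] = False
    return X[keep], y[keep]


# Synthetic stand-in for wfood (y) and totexp, income, age, nk (X).
rng = np.random.default_rng(0)
X = rng.normal(size=(1519, 4))
y = 0.3 - 0.05 * X[:, 0] + rng.normal(scale=0.1, size=1519)

leverage, cooks = influence_measures(X, y)
X_clean, y_clean = drop_rows(X, y, OUTLIER_ROWS)
print(len(y), "rows before,", len(y_clean), "rows after")
```

Comparing the leverage and Cook's distance of the listed rows with the rest of the data would show whether the removed points were actually influential or only unusual in the response.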

Best Subsets Regression: wfood versus totexp, income, age, nk

Response is wfood

Vars  R-Sq  R-Sq(adj)  Mallows Cp         S  totexp  income  age  nk
   1  22.9       22.9        67.4  0.092326       X       -    -   -
   1   5.5        5.4       424.9   0.10222       -       X    -   -
   2  24.8       24.7        31.3  0.091236       X       -    -   X
   2  24.2       24.1        42.7  0.091572       X       -    X   -
   3  26.1       26.0         6.1  0.090461       X       -    X   X
   3  24.8       24.7        32.3  0.091239       X       X    -   X
   4  26.3       26.1         5.0  0.090397       X       X    X   X

(X = predictor in the model, - = predictor not in the model)
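For reference, the Mallows Cp column is usually computed as shown below. This is the standard textbook form, not something stated in the post: p is the number of estimated coefficients in the subset model (predictors plus the intercept), n is the number of observations used, and the mean squared error comes from the model with all four predictors.

```latex
C_p = \frac{SSE_p}{MSE_{\text{full}}} - (n - 2p)
```

For the full model, SSE divided by MSE equals n - p, so Cp reduces to p. With four predictors and an intercept that gives Cp = 5, which agrees with the 5.0 in the last row of the output.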

Best subsets regression is a way of identifying which of the independent variables (totexp, income, age and nk) are best suited to the regression model. According to the results above, income is the weakest predictor: the one-variable model using income has the highest Mallows Cp (424.9) and the lowest R-squared (5.5%), and adding income to the model with totexp, age and nk raises R-squared only from 26.1% to 26.3%. Income will therefore be dropped to see whether the reduced model still fits the data.
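The kind of table shown above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses synthetic data in place of the real data set, the function names `fit_sse` and `best_subsets` are hypothetical, and it keeps the two highest R-squared models of each size, which appears to be what the output above reports.

```python
from itertools import combinations

import numpy as np


def fit_sse(X, y):
    """Residual sum of squares of an OLS fit with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)


def best_subsets(X, y, names, keep=2):
    """Return the `keep` best models (by R-squared) for each subset size."""
    n, k = X.shape
    sst = float(((y - y.mean()) ** 2).sum())
    mse_full = fit_sse(X, y) / (n - k - 1)      # MSE of the full model
    rows = []
    for size in range(1, k + 1):
        fits = []
        for cols in combinations(range(k), size):
            sse = fit_sse(X[:, list(cols)], y)
            p = size + 1                         # predictors + intercept
            fits.append({
                "vars": [names[c] for c in cols],
                "r2": 100 * (1 - sse / sst),
                "r2_adj": 100 * (1 - (sse / (n - p)) / (sst / (n - 1))),
                "cp": sse / mse_full - (n - 2 * p),
                "s": (sse / (n - p)) ** 0.5,
            })
        fits.sort(key=lambda f: f["r2"], reverse=True)
        rows.extend(fits[:keep])
    return rows


# Synthetic stand-in data; the coefficients are made up for illustration.
names = ["totexp", "income", "age", "nk"]
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 0.5 - 0.3 * X[:, 0] + 0.1 * X[:, 2] + rng.normal(scale=0.1, size=200)

for row in best_subsets(X, y, names):
    print(f"{len(row['vars'])}  {row['r2']:5.1f}  {row['r2_adj']:5.1f}  "
          f"{row['cp']:7.1f}  {row['s']:.6f}  {', '.join(row['vars'])}")
```

A subset whose Cp is close to its number of coefficients, with R-squared(adj) near that of the full model, is the usual candidate for a reduced model.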

Posted Date: 3/4/2013 6:44:10 AM | Location : United States






