Best subsets regression, Advanced Statistics

In the time series plot and scatter graphs there were many outliers that were clearly visible. These have been removed to identify if they were influential or had high leverage and in order to see if the multiple regression model assumptions have been met.

Below are the rows of the outliers that I removed out of the 1519 observations:

77, 674, 448, 757, 317, 549, 1187, 1198, 26, 456, 405, 307, 1205, 1348, 611, 368, 309

Best Subsets Regression: wfood versus totexp, income, age, nk

Response is wfood

                                                                   t i

                                                                   o n

                                                                    t c

                                                                    e o a

                               Mallows                         x m g n

Vars  R-Sq  R-Sq(adj)       Cp         S             p e e k

   1  22.9       22.9     67.4            0.092326  X

   1   5.5        5.4      424.9           0.10222    X

   2  24.8       24.7     31.3            0.091236  X     X

   2  24.2       24.1     42.7           0.091572  X   X

   3  26.1       26.0      6.1            0.090461  X   X X

   3  24.8       24.7     32.3           0.091239  X X   X

   4  26.3       26.1      5.0            0.090397  X X X X

The best subset is a way of identifying which independent variable such as the totexp, income, age and nk are best suited to the regression model.  According to the results above income is the variable that has the highest Cp and the lowest R-squared value therefore it will be the variable that will be dropped to see if the data fits the model.

Posted Date: 3/4/2013 6:44:10 AM | Location : United States







Related Discussions:- Best subsets regression, Assignment Help, Ask Question on Best subsets regression, Get Answer, Expert's Help, Best subsets regression Discussions

Write discussion on Best subsets regression
Your posts are moderated
Related Questions
Mean-range plot   is the graphical tool or device useful in selecting a transformation in the time series analysis. The range is plotted against the mean for each of the seasona


Model is the description of the supposed structure of a set of observations which can range from a fairly imprecise verbal account to, more commonly, a formalized mathematical exp

Cascadedparameters: A group of parameters which is interlinked and where selecting the value for the ?rst parameter affects the choice and option available in the subsequent param


The approach of controlling the error rate in an exploratory analysis where number of hypotheses are tested, but where the strict control which is provided by multiple comparison p

Interim analyses : An analysis made before the planned end of a clinical trial, typically with the aim of detecting the treatment differences at the early stage and thus preventing


Suppose that $4 million is available for investment in three projects.  The probability distribution of the net present value earned from each project depends on how much is invest

The term which is used in the industrial experimentation, where there is commonly a large set of candidate factors believed to have the possible significant influence on the respon