Best subsets regression, Advanced Statistics

The time series plot and the scatter plots showed many clearly visible outliers. These were removed to check whether they were influential or had high leverage, and to see whether the assumptions of the multiple regression model are met.
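To make "high leverage" and "influential" concrete, here is a minimal sketch in Python with NumPy. It is not the Minitab procedure used for this post. The data are synthetic, the function name leverage_and_cooks is hypothetical, and the 3p/n cutoff is only a common rule of thumb. Leverage is the diagonal of the hat matrix, and influence is measured here with Cook's distance.

```python
import numpy as np

def leverage_and_cooks(X, y):
    """Return leverages (hat-matrix diagonal) and Cook's distances.

    X is the design matrix and must already include an intercept column.
    """
    n, p = X.shape
    # Reduced QR: the hat matrix is H = Q Q^T, so h_ii is the row sum of Q**2.
    Q, _ = np.linalg.qr(X)
    h = np.sum(Q ** 2, axis=1)
    # Ordinary least squares fit and residuals.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    mse = resid @ resid / (n - p)
    # Cook's distance: D_i = e_i^2 / (p * MSE) * h_ii / (1 - h_ii)^2
    cooks = resid ** 2 / (p * mse) * h / (1.0 - h) ** 2
    return h, cooks

# Synthetic data (an assumption for illustration, not the wfood data set).
rng = np.random.default_rng(0)
n = 50
x = rng.normal(size=n)
x[0] = 10.0  # one observation far from the other x values
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=n)

X = np.column_stack([np.ones(n), x])
h, cooks = leverage_and_cooks(X, y)

p = X.shape[1]
flagged = np.where(h > 3 * p / n)[0]  # rule-of-thumb cutoff, an assumption
print("High-leverage rows (0-based):", flagged)
```

A point can have high leverage without being influential, so it is worth looking at both quantities before deciding to remove a row.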

Below are the row numbers of the 17 outliers that I removed from the 1519 observations:

77, 674, 448, 757, 317, 549, 1187, 1198, 26, 456, 405, 307, 1205, 1348, 611, 368, 309
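For anyone repeating this step outside Minitab, the sketch below shows one way to drop those rows in Python with NumPy. It assumes the row numbers are 1-based worksheet rows, and it uses a placeholder array in place of the real data, which is not included in this post.

```python
import numpy as np

# Row numbers as listed above (assumed to be 1-based worksheet rows).
outlier_rows = [77, 674, 448, 757, 317, 549, 1187, 1198, 26,
                456, 405, 307, 1205, 1348, 611, 368, 309]

n_obs = 1519  # total number of observations stated in the post

# Placeholder data: each entry simply holds its own 1-based row number.
data = np.arange(1, n_obs + 1)

# Boolean mask that is False for the rows to drop.
keep = np.ones(n_obs, dtype=bool)
keep[np.array(outlier_rows) - 1] = False  # convert 1-based to 0-based

cleaned = data[keep]
print(len(outlier_rows), len(cleaned))  # prints: 17 1502
```

Using a mask rather than deleting rows one at a time avoids the row numbers shifting after each deletion.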

Best Subsets Regression: wfood versus totexp, income, age, nk

Response is wfood

Vars  R-Sq  R-Sq(adj)  Mallows Cp  S         totexp  income  age  nk
   1  22.9       22.9        67.4  0.092326  X
   1   5.5        5.4       424.9  0.10222           X
   2  24.8       24.7        31.3  0.091236  X                    X
   2  24.2       24.1        42.7  0.091572  X               X
   3  26.1       26.0         6.1  0.090461  X               X    X
   3  24.8       24.7        32.3  0.091239  X       X            X
   4  26.3       26.1         5.0  0.090397  X       X       X    X

Best subsets regression is a way of identifying which of the independent variables (totexp, income, age and nk) are best suited to the regression model. According to the results above, the model containing income alone has the highest Cp (424.9) and the lowest R-squared value (5.5%). Adding income to totexp, age and nk raises R-squared only from 26.1% to 26.3%. Income will therefore be the variable that is dropped, to see whether the reduced model still fits the data.
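The columns in the Minitab output can be reproduced by hand. The sketch below is one way to do it in Python with NumPy, under stated assumptions: the data are synthetic (the wfood data are not included here), the function name best_subsets is hypothetical, and every subset is listed rather than only the best two of each size. Mallows' Cp is computed as SSE_p / MSE_full - n + 2p, where p counts the intercept, so the full model always has Cp equal to its number of parameters.

```python
import itertools
import numpy as np

def best_subsets(X, y, names):
    """Fit every non-empty subset of predictors by OLS (with intercept)
    and return R-Sq, adjusted R-Sq, Mallows' Cp and S for each."""
    n, k = X.shape

    def sse(cols):
        # Design matrix: intercept plus the chosen predictor columns.
        A = np.column_stack([np.ones(n)] + [X[:, c] for c in cols])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        r = y - A @ beta
        return r @ r

    sst = np.sum((y - y.mean()) ** 2)
    mse_full = sse(range(k)) / (n - k - 1)  # error variance from full model

    results = []
    for size in range(1, k + 1):
        for cols in itertools.combinations(range(k), size):
            s = sse(cols)
            p = size + 1  # number of parameters, including the intercept
            results.append({
                "vars": [names[c] for c in cols],
                "r2": 100 * (1 - s / sst),
                "r2_adj": 100 * (1 - (s / (n - p)) / (sst / (n - 1))),
                "cp": s / mse_full - n + 2 * p,
                "S": np.sqrt(s / (n - p)),
            })
    return results

# Synthetic data (an assumption; column names borrowed from the post).
rng = np.random.default_rng(1)
n = 200
names = ["totexp", "income", "age", "nk"]
X = rng.normal(size=(n, 4))
y = 0.5 - 0.4 * X[:, 0] + 0.2 * X[:, 2] + 0.2 * X[:, 3] + rng.normal(scale=0.5, size=n)

results = best_subsets(X, y, names)
for row in sorted(results, key=lambda r: (len(r["vars"]), -r["r2"])):
    print(f'{len(row["vars"])}  {row["r2"]:5.1f}  {row["r2_adj"]:5.1f}  '
          f'{row["cp"]:7.1f}  {row["S"]:.5f}  {", ".join(row["vars"])}')
```

A subset is usually considered adequate when its Cp is small and close to its own p, and when adding further variables raises adjusted R-squared very little.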

Posted Date: 3/4/2013 6:44:10 AM | Location : United States
