Best subsets regression, Advanced Statistics

The time series plot and scatter graphs showed many clearly visible outliers. These have been removed in order to check whether they were influential or had high leverage, and to see whether the multiple regression model assumptions are met.

Below are the row numbers of the outliers that I removed from the 1519 observations:

77, 674, 448, 757, 317, 549, 1187, 1198, 26, 456, 405, 307, 1205, 1348, 611, 368, 309
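As a minimal sketch of the removal step, assuming the numbers above are 1-based worksheet rows and that the data sit in a plain Python list (the helper name `drop_rows` and the placeholder data are hypothetical, not part of the original analysis):

```python
# Row numbers listed above, assumed to be 1-based worksheet rows.
OUTLIER_ROWS = [77, 674, 448, 757, 317, 549, 1187, 1198, 26,
                456, 405, 307, 1205, 1348, 611, 368, 309]

def drop_rows(records, rows_to_drop):
    """Return a copy of records without the given 1-based row numbers."""
    drop = set(rows_to_drop)
    return [rec for i, rec in enumerate(records, start=1) if i not in drop]

# Placeholder data standing in for the 1519 observations.
records = list(range(1, 1520))
cleaned = drop_rows(records, OUTLIER_ROWS)

# Number of rows removed and number of rows left for the regression.
print(len(OUTLIER_ROWS), len(cleaned))
```

Filtering on the original row numbers in one pass avoids the index shifting that happens when rows are deleted one at a time.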

Best Subsets Regression: wfood versus totexp, income, age, nk

Response is wfood

Vars  R-Sq  R-Sq(adj)  Mallows Cp         S  totexp  income  age  nk
   1  22.9       22.9        67.4  0.092326       X
   1   5.5        5.4       424.9  0.10222                X
   2  24.8       24.7        31.3  0.091236       X                 X
   2  24.2       24.1        42.7  0.091572       X            X
   3  26.1       26.0         6.1  0.090461       X            X    X
   3  24.8       24.7        32.3  0.091239       X       X         X
   4  26.3       26.1         5.0  0.090397       X       X    X    X
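The Mallows Cp column can be roughly reproduced from the R-Sq column. The sketch below uses the standard definition Cp = SSE_p / MSE_full - (n - 2p), where p counts the intercept, rewritten in terms of R-squared. It assumes n = 1502 observations (1519 minus the 17 removed rows); the function name `mallows_cp` is my own. Because it starts from R-Sq values rounded to one decimal place, it only approximates the printed figures.

```python
def mallows_cp(r2_sub, r2_full, n, p_sub, p_full):
    """Mallows Cp from R-squared values given as fractions.

    p_sub and p_full count the coefficients including the intercept.
    Uses SSE_p = SST * (1 - R2_p) and MSE_full = SST * (1 - R2_full) / (n - p_full).
    """
    return (1.0 - r2_sub) / (1.0 - r2_full) * (n - p_full) - (n - 2 * p_sub)

n = 1502          # assumption: 1519 observations minus 17 removed rows
r2_full = 0.263   # R-Sq of the four-variable model, from the table
p_full = 5        # four predictors plus the intercept

# (description, R-Sq as a fraction, number of predictors), read off the table
models = [
    ("best 1-variable model", 0.229, 1),
    ("best 2-variable model", 0.248, 2),
    ("best 3-variable model", 0.261, 3),
    ("full 4-variable model", 0.263, 4),
]
for label, r2, k in models:
    cp = mallows_cp(r2, r2_full, n, k + 1, p_full)
    print(f"{label}: Cp is approximately {cp:.1f}")
```

For the full model the formula always returns Cp = p, which is why the last row of the output shows 5.0; a smaller model is attractive when its Cp is also close to its own p.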

Best subsets regression is a way of identifying which of the independent variables (totexp, income, age and nk) are best suited to the regression model. According to the results above, the one-variable model containing income has the highest Cp (424.9) and the lowest R-squared (5.5%), and adding income to the model with totexp, age and nk raises R-squared only from 26.1% to 26.3%. Income is therefore the variable that will be dropped, to see whether the reduced model still fits the data.
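To make the procedure concrete, here is a minimal pure-Python sketch of an exhaustive best subsets search. It runs on synthetic data, not on the original dataset; the variable names are reused only for illustration, and the helpers `solve`, `sse` and `best_subsets` are hypothetical. Minitab's own algorithm may differ in its details.

```python
import itertools
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def sse(columns, y):
    """Residual sum of squares of an OLS fit of y on an intercept plus columns."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(k)] for a in range(k)]
    xty = [sum(row[a] * y[i] for i, row in enumerate(design)) for a in range(k)]
    beta = solve(xtx, xty)
    return sum((y[i] - sum(bj * xj for bj, xj in zip(beta, row))) ** 2
               for i, row in enumerate(design))

def best_subsets(data, y):
    """Fit every non-empty subset of predictors; return R-Sq, R-Sq(adj), Cp and S."""
    names = list(data)
    n = len(y)
    mean_y = sum(y) / n
    sst = sum((v - mean_y) ** 2 for v in y)
    p_full = len(names) + 1
    mse_full = sse([data[v] for v in names], y) / (n - p_full)
    rows = []
    for size in range(1, len(names) + 1):
        for subset in itertools.combinations(names, size):
            p = size + 1  # coefficients including the intercept
            e = sse([data[v] for v in subset], y)
            rows.append({
                "vars": subset,
                "r2": 100 * (1 - e / sst),
                "r2_adj": 100 * (1 - (e / (n - p)) / (sst / (n - 1))),
                "cp": e / mse_full - (n - 2 * p),
                "s": (e / (n - p)) ** 0.5,
            })
    return rows

# Synthetic data: only totexp and nk actually drive the response here.
random.seed(0)
n_obs = 200
data = {name: [random.gauss(0, 1) for _ in range(n_obs)]
        for name in ("totexp", "income", "age", "nk")}
y = [0.4 - 0.05 * data["totexp"][i] + 0.02 * data["nk"][i] + random.gauss(0, 0.09)
     for i in range(n_obs)]

rows = best_subsets(data, y)
for size in range(1, 5):
    ranked = sorted((r for r in rows if len(r["vars"]) == size), key=lambda r: -r["r2"])
    for r in ranked[:2]:  # report the two best models of each size
        print(size, f'{r["r2"]:5.1f} {r["r2_adj"]:5.1f} {r["cp"]:7.1f} {r["s"]:.5f}',
              ", ".join(r["vars"]))
```

With four predictors there are only 15 candidate models, so fitting all of them is cheap; the search grows as 2^k and becomes impractical for large numbers of predictors.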

Posted Date: 3/4/2013 6:44:10 AM | Location : United States






