Best subsets regression, Advanced Statistics

The time series plot and the scatter plots showed many clearly visible outliers. I removed these to check whether they were influential or had high leverage, and to see whether the multiple regression model assumptions are met without them.

Below are the row numbers of the 17 outliers that I removed from the 1519 observations:

77, 674, 448, 757, 317, 549, 1187, 1198, 26, 456, 405, 307, 1205, 1348, 611, 368, 309
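
As a rough illustration of the removal step, here is a minimal Python sketch. It assumes the row numbers above are 1-based worksheet rows, and it uses placeholder records rather than the real data; the names drop_rows and observations are made up for this example.

```python
# Row numbers listed in the post, assumed to be 1-based worksheet rows.
OUTLIER_ROWS = [77, 674, 448, 757, 317, 549, 1187, 1198, 26, 456,
                405, 307, 1205, 1348, 611, 368, 309]


def drop_rows(rows, one_based_numbers):
    """Return a copy of `rows` without the given 1-based row numbers."""
    to_drop = set(one_based_numbers)
    return [row for number, row in enumerate(rows, start=1)
            if number not in to_drop]


# Placeholder data: 1519 dummy records standing in for the real observations.
observations = [{"row": i} for i in range(1, 1520)]
cleaned = drop_rows(observations, OUTLIER_ROWS)
print(len(observations), "->", len(cleaned))  # prints "1519 -> 1502"
```

Keeping the original list untouched and working on a copy makes it easy to refit the model with and without the outliers and compare the two fits.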

Best Subsets Regression: wfood versus totexp, income, age, nk

Response is wfood

Vars  R-Sq  R-Sq(adj)  Mallows Cp  S         totexp  income  age  nk
   1  22.9       22.9        67.4  0.092326    X
   1   5.5        5.4       424.9  0.10222             X
   2  24.8       24.7        31.3  0.091236    X                  X
   2  24.2       24.1        42.7  0.091572    X              X
   3  26.1       26.0         6.1  0.090461    X              X   X
   3  24.8       24.7        32.3  0.091239    X       X          X
   4  26.3       26.1         5.0  0.090397    X       X      X   X

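Best subsets regression is a way of identifying which of the independent variables (totexp, income, age and nk) are best suited to the regression model. According to the results above, income on its own gives the highest Cp (424.9) and the lowest R-squared (5.5%). It also adds very little to the other variables: adding income to totexp, age and nk raises R-squared only from 26.1% to 26.3%, and adding it to totexp and nk leaves R-squared at 24.8%. Income is therefore the variable that will be dropped, to see whether the reduced model still fits the data.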
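
To show how the columns of a best subsets table can be computed, here is a minimal Python sketch using only the standard library. It is not Minitab's implementation: it lists every subset rather than only the best two of each size, and it runs on randomly generated placeholder data, so its numbers will not match the table above. It assumes the usual definition Cp = SSE_p / MSE_full - n + 2p, where p counts the intercept; the function names solve, sse and best_subsets are made up for this example.

```python
import random
from itertools import combinations


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def sse(columns, y):
    """Residual sum of squares of an OLS fit of y on an intercept plus columns."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * y[r] for r, row in enumerate(design)) for i in range(k)]
    beta = solve(xtx, xty)  # normal equations: (X'X) beta = X'y
    return sum((y[r] - sum(bj * xj for bj, xj in zip(beta, row))) ** 2
               for r, row in enumerate(design))


def best_subsets(data, response, predictors):
    """Return R-Sq, R-Sq(adj), Mallows' Cp and S for every predictor subset."""
    y = data[response]
    n = len(y)
    mean_y = sum(y) / n
    sst = sum((v - mean_y) ** 2 for v in y)
    mse_full = sse([data[p] for p in predictors], y) / (n - len(predictors) - 1)
    results = []
    for size in range(1, len(predictors) + 1):
        for subset in combinations(predictors, size):
            s = sse([data[p] for p in subset], y)
            p = size + 1  # number of parameters, including the intercept
            results.append({
                "vars": subset,
                "r_sq": 100 * (1 - s / sst),
                "r_sq_adj": 100 * (1 - (s / (n - p)) / (sst / (n - 1))),
                "cp": s / mse_full - n + 2 * p,
                "s": (s / (n - p)) ** 0.5,
            })
    return results


# Placeholder data only: random values under the post's column names.
rng = random.Random(0)
n_obs = 200
names = ["totexp", "income", "age", "nk"]
data = {name: [rng.gauss(0, 1) for _ in range(n_obs)] for name in names}
data["wfood"] = [0.4 - 0.05 * t + 0.02 * a + 0.03 * k + rng.gauss(0, 0.09)
                 for t, a, k in zip(data["totexp"], data["age"], data["nk"])]

table = best_subsets(data, "wfood", names)
for row in table:
    print(f'{len(row["vars"])}  {row["r_sq"]:5.1f}  {row["r_sq_adj"]:5.1f}  '
          f'{row["cp"]:7.1f}  {row["s"]:.6f}  {", ".join(row["vars"])}')
```

With this definition the full model always has Cp equal to its number of parameters (5 here), which agrees with the Cp of 5.0 in the last row of the table above. Smaller models are usually judged by how close their Cp is to their own parameter count.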

Posted Date: 3/4/2013 6:44:10 AM | Location : United States






