Best subsets regression, Advanced Statistics


The time series plot and the scatter plots showed many clearly visible outliers. These have been removed to check whether they were influential or had high leverage, and to see whether the assumptions of the multiple regression model are met.
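Leverage and influence are usually quantified with hat values and Cook's distance. The following is a minimal sketch in plain Python for a single predictor. The data are made up for illustration and are not the wfood data, and the function name `leverage_and_cooks` is hypothetical. For the multiple regression used here, the leverages would come from the diagonal of the hat matrix instead.

```python
# Toy data (hypothetical): the last point sits far from the other x values.
X = [1, 2, 3, 4, 5, 6, 7, 20]
Y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 20.0]


def leverage_and_cooks(x, y):
    """Return (leverages, Cook's distances) for a simple linear regression."""
    n = len(x)
    p = 2  # fitted coefficients: intercept and slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in residuals) / (n - p)  # residual variance

    # Leverage: how unusual each x value is relative to the others.
    leverages = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    # Cook's distance: how much the fit changes if the point is deleted.
    cooks = [
        e * e * h / (p * s2 * (1 - h) ** 2)
        for e, h in zip(residuals, leverages)
    ]
    return leverages, cooks


lev, cooks = leverage_and_cooks(X, Y)
for i, (h, d) in enumerate(zip(lev, cooks), start=1):
    print(f"row {i}: leverage {h:.3f}, Cook's distance {d:.3f}")
```

The leverages always sum to the number of fitted coefficients, so a common rule of thumb flags points whose leverage is well above the average p/n.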

Below are the row numbers of the 17 outliers that I removed from the 1519 observations:

77, 674, 448, 757, 317, 549, 1187, 1198, 26, 456, 405, 307, 1205, 1348, 611, 368, 309
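The removal step can be sketched in Python as follows, assuming the row numbers are 1-based positions in the original data (the text does not say which indexing is used). The data themselves are not reproduced here, so the sketch only tracks row numbers. The names `N_TOTAL`, `REMOVED_ROWS` and `kept_rows` are illustrative.

```python
N_TOTAL = 1519
REMOVED_ROWS = [77, 674, 448, 757, 317, 549, 1187, 1198, 26,
                456, 405, 307, 1205, 1348, 611, 368, 309]


def kept_rows(n_total, removed):
    """Return the 1-based row numbers that remain after dropping `removed`."""
    removed_set = set(removed)
    if len(removed_set) != len(removed):
        raise ValueError("duplicate row numbers in the removal list")
    if not all(1 <= r <= n_total for r in removed_set):
        raise ValueError("row number outside 1..n_total")
    return [r for r in range(1, n_total + 1) if r not in removed_set]


rows = kept_rows(N_TOTAL, REMOVED_ROWS)
print(len(REMOVED_ROWS), "rows removed,", len(rows), "rows kept")
```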

Best Subsets Regression: wfood versus totexp, income, age, nk

Response is wfood

Vars  R-Sq  R-Sq(adj)  Mallows Cp         S  totexp  income  age  nk
   1  22.9       22.9        67.4  0.092326    X
   1   5.5        5.4       424.9   0.10222            X
   2  24.8       24.7        31.3  0.091236    X                  X
   2  24.2       24.1        42.7  0.091572    X              X
   3  26.1       26.0         6.1  0.090461    X              X   X
   3  24.8       24.7        32.3  0.091239    X       X          X
   4  26.3       26.1         5.0  0.090397    X       X      X   X

Best subsets regression is a way of identifying which of the independent variables (totexp, income, age and nk) are best suited to the regression model. According to the results above, income on its own gives the highest Cp (424.9) and the lowest R-squared (5.5%) of any model listed, and adding it to totexp, age and nk raises R-squared only from 26.1% to 26.3%. Income will therefore be the variable that is dropped, to see whether the reduced model fits the data.
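As a rough check on the table, Mallows' Cp can be recomputed from the S column alone, using Cp = SSE_p / MSE_full - (n - 2p), where p counts the fitted coefficients including the intercept. The sketch below is in Python and the function names are illustrative. The sample size n = 1519 is an assumption: the output does not print it, and 1519 is the count quoted in the text. With that value the recomputed figures agree with the reported Cp column to within a few tenths, which is about what the rounding of S allows.

```python
# Recompute Mallows' Cp from the S column of the best subsets output.
# Assumption: N_OBS = 1519 (the sample size is not printed with the output).
N_OBS = 1519
S_FULL = 0.090397  # S for the model with all four predictors

# (predictors in the model, S, Cp as reported in the table)
TABLE = [
    (("totexp",), 0.092326, 67.4),
    (("income",), 0.10222, 424.9),
    (("totexp", "nk"), 0.091236, 31.3),
    (("totexp", "age"), 0.091572, 42.7),
    (("totexp", "age", "nk"), 0.090461, 6.1),
    (("totexp", "income", "nk"), 0.091239, 32.3),
    (("totexp", "income", "age", "nk"), 0.090397, 5.0),
]


def mallows_cp(s_sub, n_coef, s_full=S_FULL, n_obs=N_OBS):
    """Cp = SSE_sub / MSE_full - (n - 2p), with p = n_coef."""
    sse_sub = s_sub ** 2 * (n_obs - n_coef)  # since S^2 = SSE / (n - p)
    mse_full = s_full ** 2
    return sse_sub / mse_full - (n_obs - 2 * n_coef)


def recomputed_cp():
    """Map each predictor subset to its recomputed Cp (intercept adds 1 to p)."""
    return {preds: mallows_cp(s, len(preds) + 1) for preds, s, _ in TABLE}


if __name__ == "__main__":
    cps = recomputed_cp()
    for preds, _, reported in TABLE:
        label = " + ".join(preds)
        print(f"{label:<30} reported {reported:6.1f}  recomputed {cps[preds]:6.1f}")
```

Cp for the full model equals its coefficient count (5) by construction, so the comparison that matters for dropping income is the three-predictor model with totexp, age and nk against the other subsets.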

