Outliers - reasons for screening data, Advanced Statistics


Outliers - Reasons for Screening Data

Outliers arise from data entry errors, from a subject who is not a member of the population that the sample is meant to represent, or from a subject who really is different from the rest. Statistical tests are quite sensitive to outliers, so this problem should be addressed before the analysis.

Univariate outliers are easy to detect (z-scores, box plots, histograms, etc.). Cases with standard scores beyond +/-3 are treated as outliers (consider a cutoff of 4 if n>100, or 2.5 if n<10).
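The z-score rule above can be sketched in a few lines of Python. This is only an illustration, not a prescribed procedure: the function names default_cutoff and zscore_outliers and the sample scores are made up for the example, and the sample standard deviation (n - 1 denominator) is an assumption, since the text does not say which one to use.

```python
from statistics import mean, stdev


def default_cutoff(n):
    """Cutoff on |z| following the rule of thumb in the text."""
    if n > 100:
        return 4.0
    if n < 10:
        return 2.5
    return 3.0


def zscore_outliers(values, cutoff=None):
    """Return the indices of values whose |z| exceeds the cutoff."""
    if cutoff is None:
        cutoff = default_cutoff(len(values))
    m = mean(values)
    s = stdev(values)  # assumption: sample SD with n - 1 denominator
    if s == 0:
        return []  # all values identical, nothing can stand out
    return [i for i, v in enumerate(values) if abs((v - m) / s) > cutoff]


# Hypothetical scores: eleven ordinary values and one suspicious entry.
scores = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 12, 40]
print(zscore_outliers(scores))  # [11], the position of the value 40
```

One reason a lower cutoff makes sense for small samples: with the sample standard deviation, no z-score can exceed (n - 1)/sqrt(n) in absolute value, so a cutoff of 3 can never be reached when n is 10 or less.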

Multivariate outliers are harder to detect. Mahalanobis distance is one powerful technique to use in this case (discussed later). The squared distance is evaluated as a chi-square statistic with degrees of freedom equal to the number of variables in the analysis. A case whose chi-square value is significant beyond the p<0.001 level is treated as an outlier.
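As a rough sketch of the Mahalanobis check, the Python below handles the two-variable case only, so that the 2x2 covariance matrix can be inverted by hand and the chi-square critical value has a closed form (for 2 degrees of freedom the upper-tail probability is exp(-x/2)). The function names and the data are hypothetical, and the use of the sample covariance matrix is an assumption; an analysis with more variables would normally rely on a statistics package.

```python
import math
from statistics import mean


def mahalanobis_sq_2d(points):
    """Squared Mahalanobis distance of each (x, y) case from the centroid."""
    n = len(points)
    xs = [float(p[0]) for p in points]
    ys = [float(p[1]) for p in points]
    mx, my = mean(xs), mean(ys)
    # Sample covariance matrix entries (assumption: n - 1 denominator).
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    d2 = []
    for x, y in zip(xs, ys):
        dx, dy = x - mx, y - my
        # Quadratic form (dx, dy) S^-1 (dx, dy)^T with the inverse written out.
        d2.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return d2


def chi2_critical_df2(alpha=0.001):
    """Chi-square critical value for df = 2 (two variables only)."""
    return -2.0 * math.log(alpha)


def multivariate_outliers_2d(points, alpha=0.001):
    """Indices of cases whose squared distance exceeds the critical value."""
    crit = chi2_critical_df2(alpha)
    return [i for i, v in enumerate(mahalanobis_sq_2d(points)) if v > crit]


# Hypothetical data: 20 cases close to the line y = x, plus one case whose
# x and y are each unremarkable but whose combination is unusual.
points = [(i, i + (0.5 if i % 2 else -0.5)) for i in range(1, 21)]
points.append((10.5, 15.5))
print(multivariate_outliers_2d(points))  # [20], the appended case
```

The appended case would pass a univariate screen, because 10.5 and 15.5 both sit well inside the range of their own variable. It is flagged only because the pair departs from the pattern shared by the other cases, which is what makes multivariate outliers harder to spot by eye.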

In most cases, it is acceptable to drop the outlying case from the sample. If the researcher decides to keep such values in the analysis, steps can be taken to reduce their relative influence.


All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd