Missing data - reasons for screening data, Advanced Statistics

If any data are missing, the researcher needs to test whether the pattern of the missing cases is random or is systematically related to other variables.

Create a dichotomous variable that codes each case as non-missing or missing on a specific variable. Then run a simple independent-samples t-test on a different variable in the collected sample, comparing the two groups. A significant difference suggests that the missing cases are not random.
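The check described above can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not a definitive procedure: the variable names (`income`, `age`), the data, and the helper `missingness_t_test` are all hypothetical, and missing values are assumed to be coded as `None`. The sketch computes only the pooled-variance t statistic and its degrees of freedom; a p-value would need the t distribution, for example from a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def missingness_t_test(target, other):
    """Pooled-variance independent-samples t statistic comparing `other`
    between cases where `target` is missing (None) and where it is present.
    Hypothetical helper for illustration only."""
    # Dichotomous split: missing vs non-missing on the target variable.
    missing = [o for t, o in zip(target, other) if t is None and o is not None]
    present = [o for t, o in zip(target, other) if t is not None and o is not None]
    n1, n2 = len(missing), len(present)
    if n1 < 2 or n2 < 2:
        raise ValueError("need at least two cases in each group")
    pooled_var = ((n1 - 1) * variance(missing)
                  + (n2 - 1) * variance(present)) / (n1 + n2 - 2)
    t_stat = (mean(missing) - mean(present)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t_stat, n1 + n2 - 2


# Made-up data: income has gaps, age is fully observed.
income = [52, None, 61, 48, None, 57, None, 50, 63, 55]
age = [34, 58, 41, 29, 62, 38, 55, 31, 45, 36]

t_stat, df = missingness_t_test(income, age)
print(f"t = {t_stat:.2f}, df = {df}")
```

In this made-up sample the cases with missing income are clearly older, so the t statistic is large relative to the usual critical value for 8 degrees of freedom, which would point to a non-random pattern of missingness.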

Handling missing values:

1. Delete cases with missing data (a good idea if there are only a few missing cases)

2. Delete variables containing missing values (a good idea if most of the missing values are concentrated in only a couple of variables; still problematic if those variables are important to the ultimate goal of the research)

3. Estimate missing values, using one of the following approaches:

   a. Prior knowledge: the researcher substitutes a value based on prior knowledge of the subject area.

   b. Mean substitution: replace missing values with the mean of the variable. Main concern: this lowers the calculated variance compared with the unknown actual variance. One variation, for analyses involving group comparisons, uses the mean of the case's own group for the missing value.

   c. Regression approach: treat the variable that contains the missing values as the DV and use several IVs to explain it, then predict the missing values from the IV values. Concerns include finding proper IVs that explain the DV, and the fact that estimates obtained from prediction are more consistent with the scores used to predict them than the real values would be.

Whichever of the techniques described above is used, the researcher has to ascertain that the solution has not changed the results of the analysis (run the tests with and without the treatment).
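The estimation approaches in the list above can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: the data, the variable names (`score`, `group`, `hours`), and the three helper functions are hypothetical, missing values are coded as `None`, and the regression uses a single IV with ordinary least squares where a real analysis would typically use several.

```python
from statistics import mean, variance


def mean_impute(values):
    """Replace each missing value with the mean of the observed values."""
    fill = mean(v for v in values if v is not None)
    return [fill if v is None else v for v in values]


def group_mean_impute(values, groups):
    """Replace each missing value with the mean of its own group."""
    group_means = {
        g: mean(v for v, gg in zip(values, groups) if gg == g and v is not None)
        for g in set(groups)
    }
    return [group_means[g] if v is None else v for v, g in zip(values, groups)]


def regression_impute(dv, iv):
    """Predict missing DV values from one IV using a least-squares line
    fitted on the complete cases (single-IV simplification)."""
    pairs = [(x, y) for x, y in zip(iv, dv) if y is not None]
    x_bar = mean(x for x, _ in pairs)
    y_bar = mean(y for _, y in pairs)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in pairs)
             / sum((x - x_bar) ** 2 for x, _ in pairs))
    intercept = y_bar - slope * x_bar
    return [intercept + slope * x if y is None else y for x, y in zip(iv, dv)]


# Made-up data: two scores are missing.
score = [10.0, None, 14.0, 20.0, None, 24.0]
group = ["a", "a", "a", "b", "b", "b"]
hours = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]

observed = [v for v in score if v is not None]
by_mean = mean_impute(score)
by_group = group_mean_impute(score, group)
by_regression = regression_impute(score, hours)

# Compare results with and without the treatment.
print("variance, complete cases only:", variance(observed))
print("variance, mean substitution:  ", variance(by_mean))
print("variance, group means:        ", variance(by_group))
print("variance, regression:         ", variance(by_regression))
```

Comparing the variance of the complete cases with the variance after mean substitution shows the concern raised in the list: filling gaps with the mean adds cases without adding spread, so the calculated variance drops. The same with-and-without comparison can be repeated for whatever statistic the analysis actually depends on.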

Posted Date: 3/4/2013 6:07:24 AM | Location : United States