Missing data - reasons for screening data, Advanced Statistics

When a data set contains missing values, the researcher needs to test whether the pattern of the missing cases is random.

To do this, create a dichotomous variable that codes each case as non-missing or missing on the variable in question. Then run a simple independent-samples t-test on a different variable in the collected sample, comparing the two groups. A significant difference suggests that the missing cases are not random.
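
As a rough illustration of the check described above, the following Python sketch splits a second variable by whether the first one is missing and computes a pooled-variance independent-samples t statistic. The variable names (`income`, `age`), the function name, and the data are made up for illustration. It uses only the standard library, so it stops at the t statistic and degrees of freedom; the p-value would come from a t table or a statistics package.

```python
import math
import statistics


def missingness_t_test(target, other):
    """Compare `other` between cases where `target` is observed vs missing.

    Missing values are represented by None (an assumption of this sketch).
    Returns (t, df) for a pooled-variance independent-samples t-test.
    """
    observed = [y for x, y in zip(target, other) if x is not None and y is not None]
    missing = [y for x, y in zip(target, other) if x is None and y is not None]
    n1, n2 = len(observed), len(missing)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two cases")

    m1, m2 = statistics.mean(observed), statistics.mean(missing)
    v1, v2 = statistics.variance(observed), statistics.variance(missing)

    df = n1 + n2 - 2
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / df
    se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    if se == 0:
        raise ValueError("no variation in either group")
    return (m1 - m2) / se, df


# Hypothetical sample: income has missing values, age is fully observed.
income = [52, None, 61, 48, None, 55, None, 60]
age = [34, 58, 41, 29, 62, 37, 55, 40]

t, df = missingness_t_test(income, age)
print(f"t = {t:.2f}, df = {df}")
```

In this made-up sample the cases with missing income are clearly older, so the t statistic is large in absolute value, which is the kind of result that would point to non-random missingness.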

Handling missing values:

1. Delete cases with missing data (a good idea if there are only a few missing cases).

2. Delete variables containing missing values (a good idea if most of the missing values are concentrated in only a couple of variables, but still problematic if those variables are important to the ultimate goal of the research).

3. Estimate the missing values, using one of the following:

   a. Prior knowledge.

   b. Mean substitution: replace missing values with the mean. Main concern: this lowers the calculated variance compared with the unknown actual variance. One variation, for group comparison analyses, uses the mean of the case's own group.

   c. Regression approach: treat the variable that contains the missing values as the DV, use several IVs to explain it, and predict the missing values from the IV values. Concerns include finding proper IVs that explain the DV, and the fact that the predicted estimates are more consistent with the scores used to predict them than the real values would be.

Whichever of the techniques described above is used, the researcher has to ascertain that the solution has not changed the results of the analysis (run the tests with and without the treatment).
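
The estimation techniques listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a recommended implementation: missing values are coded as None, the data and the names `score`, `group` and `hours` are invented, and the regression version uses a single IV for brevity where the text calls for several. The final loop compares the mean and variance with and without each treatment.

```python
import statistics


def mean_impute(values):
    """Replace None with the mean of the observed values."""
    fill = statistics.mean(v for v in values if v is not None)
    return [fill if v is None else v for v in values]


def group_mean_impute(values, groups):
    """Replace None with the mean of the observed values in the same group.

    Assumes every group has at least one observed value.
    """
    by_group = {}
    for v, g in zip(values, groups):
        if v is not None:
            by_group.setdefault(g, []).append(v)
    fills = {g: statistics.mean(vs) for g, vs in by_group.items()}
    return [fills[g] if v is None else v for v, g in zip(values, groups)]


def regression_impute(dv, iv):
    """Fit dv = a + b * iv on complete cases, then predict the missing dv values."""
    pairs = [(x, y) for x, y in zip(iv, dv) if y is not None]
    mx = statistics.mean(x for x, _ in pairs)
    my = statistics.mean(y for _, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    b = sxy / sxx
    a = my - b * mx
    return [a + b * x if y is None else y for x, y in zip(iv, dv)]


# Hypothetical data: score is the variable with missing values.
score = [10, 12, None, 14, 20, None, 22, 24]
group = ["a", "a", "a", "a", "b", "b", "b", "b"]
hours = [1, 2, 3, 4, 6, 7, 8, 9]  # a single illustrative IV

complete = [v for v in score if v is not None]  # deletion of missing cases
by_mean = mean_impute(score)
by_group = group_mean_impute(score, group)
by_regression = regression_impute(score, hours)

# Run the same summary with and without each treatment.
for label, data in [
    ("deleted", complete),
    ("mean", by_mean),
    ("group mean", by_group),
    ("regression", by_regression),
]:
    print(f"{label:>10}: mean={statistics.mean(data):.2f} "
          f"variance={statistics.variance(data):.2f}")
```

With these invented numbers, overall mean substitution leaves the mean unchanged but shrinks the variance, which is the concern raised in the list above.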

Posted Date: 3/4/2013 6:07:24 AM | Location : United States