Missing data - reasons for screening data, Advanced Statistics

If any data are missing, the researcher needs to test whether the pattern of the missing cases is random.

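Create a dichotomous variable (non-missing vs. missing) for the specific variable in question. Then run a simple independent-samples t-test on a different variable in the collected sample, comparing the two groups, to see if there are any significant differences. A significant difference suggests that the missing cases are not random.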
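The check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the variable names (income, age) and the data are hypothetical, missing values are coded as None, and the pooled-variance form of the t-test is used. The function returns the t statistic and degrees of freedom, which would then be compared against a critical value from a t table.

```python
import math
import statistics

def missingness_t_test(target, other):
    """Split `other` by whether `target` is missing (None) and return the
    pooled-variance independent-samples t statistic and its degrees of freedom."""
    # Group 1: cases where the target variable was observed.
    observed = [o for t, o in zip(target, other) if t is not None and o is not None]
    # Group 2: cases where the target variable is missing.
    missing = [o for t, o in zip(target, other) if t is None and o is not None]
    n1, n2 = len(observed), len(missing)
    if n1 < 2 or n2 < 2:
        raise ValueError("need at least two cases in each group")
    m1, m2 = statistics.mean(observed), statistics.mean(missing)
    v1, v2 = statistics.variance(observed), statistics.variance(missing)
    df = n1 + n2 - 2
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / df
    t = (m1 - m2) / math.sqrt(pooled * (1 / n1 + 1 / n2))
    return t, df

# Hypothetical data: income has missing values, age is fully observed.
income = [52, None, 61, 48, None, 57, None, 66, 50, 59]
age = [34, 58, 41, 29, 63, 38, 55, 45, 31, 40]

t_stat, df = missingness_t_test(income, age)
print(f"t = {t_stat:.2f}, df = {df}")
```

In this made-up sample the cases with missing income are noticeably older, so the t statistic is large in absolute value, which would point to a non-random pattern.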

Handling missing values:

1. Delete cases with missing data (good idea if there are only a few missing cases)

2. Delete variables containing missing values (good idea if most of the missing values are concentrated in only a couple of variables; still problematic if those variables are important to the ultimate goal of the research)

3. Estimate missing values, using one of the following approaches:

   a. Prior knowledge: replace the missing value with an informed estimate based on what is already known about the variable.

   b. Mean replacement: replace missing values with the mean of the variable (main concern: this lowers the calculated variance as compared to the unknown actual variance). One variation, for analyses involving group comparisons, uses the mean of the group to which the case belongs.

   c. Regression approach: treat the variable that contains the missing values as the DV and use several IVs to explain it, then predict the missing values from the IV values. Concerns include finding proper IVs that explain the DV, and the fact that the predicted estimates are more consistent with the scores used to predict them than the real values would be.

Whichever of the techniques described above is used, the researcher has to ascertain that the solution hasn't changed the results of the analysis (run the tests with and without the treatment).
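The variance concern and the with/without comparison can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the data and variable names (hours, score) are hypothetical, missing values are coded as None, and the regression uses a single IV fitted by ordinary least squares rather than the several IVs mentioned above.

```python
import statistics

def mean_impute(values):
    """Replace each None with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = statistics.mean(observed)
    return [fill if v is None else v for v in values]

def regression_impute(dv, iv):
    """Replace each None in `dv` with the prediction from a one-IV
    least-squares line fitted on the complete cases (a simplification)."""
    xs = [x for x, y in zip(iv, dv) if y is not None]
    ys = [y for y in dv if y is not None]
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return [intercept + slope * x if y is None else y for x, y in zip(iv, dv)]

# Hypothetical data: score (DV) has missing values, hours (IV) is complete.
hours = [2, 4, 5, 7, 8, 10, 11, 13]
score = [51, 58, None, 66, None, 79, 83, None]

observed = [s for s in score if s is not None]
by_mean = mean_impute(score)
by_regression = regression_impute(score, hours)

# Compare a summary statistic with and without each treatment.
print("variance, observed cases only:", round(statistics.variance(observed), 2))
print("variance, mean replacement:   ", round(statistics.variance(by_mean), 2))
print("variance, regression estimate:", round(statistics.variance(by_regression), 2))
```

Mean replacement leaves the mean unchanged but adds cases with zero deviation, so the sample variance necessarily shrinks. The regression estimates fall exactly on the fitted line, which is why they look more consistent with the predictor scores than real values would.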

Posted Date: 3/4/2013 6:07:24 AM | Location : United States






