Missing data - reasons for screening data, Advanced Statistics


If any data are missing, the researcher needs to test whether the pattern of the missing cases is random.

Create a dichotomous variable (non-missing vs. missing) for a specific variable, then run a simple independent-samples t-test on a different variable in the collected sample to see whether the two groups differ significantly. A significant difference suggests that the missing cases are not random.
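The check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the variable names (`income`, `age`) and the data are made up, missing values are coded as `None`, and the pooled-variance (Student) form of the t statistic is used. The sketch returns the t statistic and degrees of freedom only. The p-value would still have to be looked up in a t table or computed with a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def missingness_t_test(target, other):
    """Split `other` by whether `target` is missing (None); return (t, df).

    Uses the pooled-variance (Student) independent-samples t statistic.
    """
    # Group 1: cases where the target variable was observed.
    present = [y for x, y in zip(target, other) if x is not None and y is not None]
    # Group 2: cases where the target variable is missing.
    missing = [y for x, y in zip(target, other) if x is None and y is not None]

    n1, n2 = len(present), len(missing)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two observations")

    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(present) + (n2 - 1) * variance(missing)) / df
    t = (mean(present) - mean(missing)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df


# Hypothetical sample: income has gaps, age is fully observed.
income = [42, None, 55, 61, None, 48, None, 70]
age = [31, 52, 35, 40, 58, 33, 49, 44]

t_stat, dof = missingness_t_test(income, age)
print(f"t = {t_stat:.2f}, df = {dof}")
```

A large absolute t value relative to the critical value for the given degrees of freedom would indicate that respondents with missing `income` differ in `age` from those without, which is evidence against random missingness.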

Handling missing values:

1. Delete the cases with missing data (a good idea if there are only a few missing cases)

2. Delete variables containing missing values (a good idea if most of the missing values are concentrated in only a couple of variables, but still problematic if those variables are important to the ultimate goal of the research)

3. Estimate missing values, using one of the following approaches:

a. Prior knowledge: the researcher substitutes a value based on familiarity with the research area.

b. Replace missing values with the mean (main concern: this lowers the calculated variance compared with the unknown actual variance). One variation, for analyses that compare groups, replaces each missing value with the mean of the group the case belongs to.

c. Regression approach: treat the variable that contains the missing values as the DV and use several other variables as IVs. Fit the regression on the complete cases, then predict the missing values from the IV values. Concerns include finding proper IVs that explain the DV, and the fact that the predicted estimates are more consistent with the scores used to predict them than the real values would be.

4. Whichever technique is used, the researcher has to ascertain that the solution has not changed the results of the analysis (run the tests with and without the treatment).
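The estimation approaches in the list above (mean substitution, group-mean substitution and regression) can be sketched as follows. This is a minimal illustration, not a definitive implementation: the data and the names `score`, `group` and `hours` are hypothetical, missing values are coded as `None`, and the regression uses a single IV for brevity, whereas the text recommends several.

```python
from statistics import mean, variance


def impute_mean(values):
    """Replace None with the mean of the observed values."""
    m = mean(v for v in values if v is not None)
    return [m if v is None else v for v in values]


def impute_group_mean(values, groups):
    """Replace None with the mean of the observed values in the same group."""
    observed = {}
    for v, g in zip(values, groups):
        if v is not None:
            observed.setdefault(g, []).append(v)
    group_means = {g: mean(vs) for g, vs in observed.items()}
    return [group_means[g] if v is None else v for v, g in zip(values, groups)]


def impute_regression(dv, iv):
    """Fit dv = intercept + slope * iv on complete cases, then predict missing dv."""
    pairs = [(x, y) for x, y in zip(iv, dv) if y is not None]
    x_bar = mean(x for x, _ in pairs)
    y_bar = mean(y for _, y in pairs)
    sxx = sum((x - x_bar) ** 2 for x, _ in pairs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return [intercept + slope * x if y is None else y for x, y in zip(iv, dv)]


# Hypothetical data: `score` has two missing values.
score = [10, None, 14, 20, None, 24]
group = ["a", "a", "a", "b", "b", "b"]
hours = [1, 2, 3, 4, 5, 6]

observed = [v for v in score if v is not None]
by_mean = impute_mean(score)
by_group = impute_group_mean(score, group)
by_regression = impute_regression(score, hours)

# Mean substitution shrinks the variance relative to the observed cases.
print("variance, observed only:", variance(observed))
print("variance, mean-imputed: ", variance(by_mean))
```

Comparing a statistic such as the variance, or the result of the planned analysis, on the observed cases and on each imputed version is one simple way to run the "with and without the treatment" check.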

Posted Date: 3/4/2013 6:07:24 AM | Location : United States






