Missing data - reasons for screening data, Advanced Statistics

When data are missing, the researcher needs to test whether the pattern of missing cases is random.

Create a dichotomous variable (missing vs. non-missing) for the variable in question. Then run a simple independent-samples t-test on a different variable in the collected sample, comparing the two groups. A significant difference suggests that the missing cases are not random.
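The check described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the variable names (`income`, `age`), the data, and the use of `None` to mark a missing value are all hypothetical, and the function returns the pooled-variance t statistic and its degrees of freedom, leaving the p-value to a t table or a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def missingness_t_test(target, other):
    """Compare `other` between cases where `target` is missing vs. observed.

    Returns (t, df) for a pooled-variance independent-samples t-test.
    """
    # Dichotomous indicator: True where the target variable is missing.
    is_missing = [value is None for value in target]

    # Split the other variable by the indicator, skipping its own missing values.
    group_missing = [y for y, m in zip(other, is_missing) if m and y is not None]
    group_observed = [y for y, m in zip(other, is_missing) if not m and y is not None]

    n1, n2 = len(group_missing), len(group_observed)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two cases")

    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(group_missing)
                  + (n2 - 1) * variance(group_observed)) / df
    std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean(group_missing) - mean(group_observed)) / std_error
    return t, df


# Hypothetical sample: income has missing values, age is fully observed.
income = [52, None, 61, None, 48, 55, None, 60]
age = [34, 51, 29, 47, 38, 31, 55, 36]

t, df = missingness_t_test(income, age)
print(f"t = {t:.2f}, df = {df}")
```

In this made-up sample the cases with missing income are noticeably older than the rest, so the t statistic is large, which is the kind of result that would argue against a random pattern.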

Handling missing values:

1. Delete cases with missing data (a good idea if there are only a few missing cases).

2. Delete variables containing missing values (a good idea if most of the missing values are concentrated in only a couple of variables, but still problematic if those variables are important to the ultimate goal of the research).

3. Estimate missing values, using one of the following approaches:

a. Prior knowledge: substitute a value based on what the researcher already knows about the case or the variable.

b. Replace missing values with the mean (main concern: this lowers the calculated variance compared with the unknown actual variance). One variation, for analyses that compare groups, replaces a missing value with the mean of the group the case belongs to.

c. Regression approach: treat the variable that has missing values as the DV, use several IVs to explain it, and predict the missing values from the IV values. Concerns include finding proper IVs that explain the DV, and the fact that estimates obtained from prediction are more consistent with the scores used to predict them than the real values would be.

4. Whichever of the techniques described above is used, the researcher has to ascertain that the solution has not changed the results of the analysis (run the tests with and without the treatment).
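The estimation approaches and the final with-and-without check can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the variable names (`score`, `group`, `hours`), the data, and `None` as the missing-value marker are hypothetical, and the regression version uses a single IV for brevity where the notes call for several.

```python
from statistics import mean, variance


def impute_mean(values):
    """Replace each missing value with the mean of the observed values."""
    fill = mean(v for v in values if v is not None)
    return [fill if v is None else v for v in values]


def impute_group_mean(values, groups):
    """Replace each missing value with the mean of its own group."""
    group_means = {
        g: mean(v for v, gg in zip(values, groups) if gg == g and v is not None)
        for g in set(groups)
    }
    return [group_means[g] if v is None else v for v, g in zip(values, groups)]


def impute_regression(dv, iv):
    """Predict missing DV values from one IV with a least-squares line."""
    # Fit only on the cases where the DV is observed.
    pairs = [(x, y) for x, y in zip(iv, dv) if y is not None]
    x_bar = mean(x for x, _ in pairs)
    y_bar = mean(y for _, y in pairs)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in pairs)
             / sum((x - x_bar) ** 2 for x, _ in pairs))
    intercept = y_bar - slope * x_bar
    return [intercept + slope * x if y is None else y for x, y in zip(iv, dv)]


# Hypothetical data: score has two missing values.
score = [10, 12, None, 20, 22, None]
group = ["a", "a", "a", "b", "b", "b"]
hours = [1, 2, 3, 5, 6, 7]

observed = [v for v in score if v is not None]  # analysis without the treatment
treatments = {
    "mean": impute_mean(score),
    "group mean": impute_group_mean(score, group),
    "regression": impute_regression(score, hours),
}

# Compare the summary statistics with and without each treatment.
print(f"observed only: mean={mean(observed):.2f}, variance={variance(observed):.2f}")
for name, filled in treatments.items():
    print(f"{name}: mean={mean(filled):.2f}, variance={variance(filled):.2f}")
```

With overall-mean replacement the sample mean is unchanged but the variance drops, which is the concern noted for that approach. Comparing each treated version against the observed-only figures is a simple form of the with-and-without check.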

Posted Date: 3/4/2013 6:07:24 AM | Location : United States






