Missing data - reasons for screening data, Advanced Statistics


If any data are missing, the researcher needs to test whether the pattern of the missing cases is random.

Create a dichotomous indicator variable (non-missing vs. missing) for the variable in question. Then run a simple independent-samples t-test on a different variable in the collected sample, comparing the two groups defined by the indicator. A significant difference suggests that the missing cases are not random.
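The check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `records` list, the `age` and `income` fields, and the helper name `pooled_t` are all hypothetical, and the test used is the pooled-variance (Student) form, which assumes roughly equal variances in the two groups. The standard library has no t distribution, so the sketch stops at the t statistic and degrees of freedom; a p-value would have to come from a t table or a library routine such as `scipy.stats.ttest_ind`.

```python
import math
import statistics


def pooled_t(group_a, group_b):
    """Student's independent-samples t statistic and its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t_stat = (statistics.fmean(group_a) - statistics.fmean(group_b)) / std_err
    return t_stat, df


# Hypothetical sample: income has gaps (None), age is fully observed.
records = [
    {"age": 23, "income": 31000},
    {"age": 58, "income": None},
    {"age": 35, "income": 42000},
    {"age": 61, "income": None},
    {"age": 29, "income": 38000},
    {"age": 41, "income": 51000},
    {"age": 55, "income": None},
    {"age": 38, "income": 47000},
    {"age": 63, "income": None},
    {"age": 30, "income": 36000},
]

# Dichotomous split: is income missing or not?
age_income_missing = [r["age"] for r in records if r["income"] is None]
age_income_present = [r["age"] for r in records if r["income"] is not None]

# Compare a different variable (age) across the two groups.
t_stat, df = pooled_t(age_income_missing, age_income_present)
print(f"t = {t_stat:.2f}, df = {df}")
```

A large absolute t value here would suggest that respondents with missing income differ in age from those who reported it, so the missing cases would not look random.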

Handling missing values:

1. Delete cases with missing data (a good idea if there are only a few missing cases)

2. Delete variables containing missing values (a good idea if most of the missing values are concentrated in only a couple of variables, but still problematic if those variables are important to the ultimate goal of the research)

3. Estimate missing values, using one of the following:

   a. Prior knowledge

   b. Replace missing values with the mean (main concern: this lowers the calculated variance compared with the unknown actual variance). One variation uses group means for the missing values when the analysis involves group comparisons.

   c. Regression approach: treat the variable with missing values as the DV, use several IVs to explain it, and predict the missing values from the IV values. Concerns include finding suitable IVs that explain the DV, and the fact that the predicted estimates are more consistent with the scores used to predict them than the real values would be.

4. Whichever technique is used, the researcher has to ascertain that it has not changed the results of the analysis (run the tests with and without the treatment).
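The trade-offs among deletion, mean substitution and regression substitution can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the data in `x` and `y` are invented, and the regression uses a single IV fitted by ordinary least squares rather than the several IVs mentioned above. Printing the summary statistics for each treatment side by side is one simple way to check whether the chosen treatment changes the results.

```python
import statistics

# Hypothetical data: x is a fully observed IV, y is the DV with gaps (None).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 3.9, None, 8.2, 9.8, None, 14.1, 15.9]

observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
obs_x = [xi for xi, _ in observed]
obs_y = [yi for _, yi in observed]

# Treatment 1: delete cases with missing data (complete cases only).
complete_cases = obs_y

# Treatment 2: replace each missing value with the mean of the observed values.
mean_y = statistics.fmean(obs_y)
mean_filled = [mean_y if yi is None else yi for yi in y]

# Treatment 3: predict each missing value from x with a least-squares line
# fitted on the complete cases.
mean_x = statistics.fmean(obs_x)
slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in observed) / sum(
    (xi - mean_x) ** 2 for xi in obs_x
)
intercept = mean_y - slope * mean_x
reg_filled = [
    intercept + slope * xi if yi is None else yi for xi, yi in zip(x, y)
]

# Compare the same summary statistics with and without each treatment.
for label, values in [
    ("complete cases", complete_cases),
    ("mean substitution", mean_filled),
    ("regression substitution", reg_filled),
]:
    print(
        f"{label:24s} n={len(values)} "
        f"mean={statistics.fmean(values):.2f} "
        f"variance={statistics.variance(values):.2f}"
    )
```

Mean substitution leaves the mean unchanged but shrinks the sample variance, because the filled-in values add nothing to the sum of squared deviations while the sample size grows. The regression estimates fall exactly on the fitted line, which is the sense in which they are more consistent with the predictors than real values would be.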

Posted Date: 3/4/2013 6:07:24 AM | Location : United States






