Missing data - reasons for screening data, Advanced Statistics

Missing Data - Reasons for screening data

If any data are missing, the researcher needs to test whether the pattern of the missing cases is random.

Create a dichotomous variable (missing vs. non-missing) for the variable in question, then run a simple independent-samples t-test on a different variable in the collected sample to see whether the two groups differ significantly. A significant difference suggests that the missingness is related to that other variable, i.e. that the pattern is not random.
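The check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the variable names (`income`, `age`) and the data are made up, missing values are represented as `None`, and the Welch (unequal-variance) form of the independent-samples t statistic is used. The p-value lookup is left to a statistics package.

```python
import math
import statistics

def missingness_t_test(target, other):
    """Welch t statistic comparing `other` between rows where
    `target` is missing and rows where it is present."""
    # Dichotomous split: missing vs. non-missing on `target`.
    missing_group = [o for t, o in zip(target, other)
                     if t is None and o is not None]
    present_group = [o for t, o in zip(target, other)
                     if t is not None and o is not None]
    if len(missing_group) < 2 or len(present_group) < 2:
        raise ValueError("each group needs at least two observations")

    n1, n2 = len(missing_group), len(present_group)
    m1, m2 = statistics.mean(missing_group), statistics.mean(present_group)
    v1, v2 = statistics.variance(missing_group), statistics.variance(present_group)

    se2 = v1 / n1 + v2 / n2                      # squared standard error
    t_stat = (m1 - m2) / math.sqrt(se2)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t_stat, df

# Hypothetical data: income is missing for some respondents, age is complete.
income = [52, None, 61, 48, None, 57, None, 50, 63, 55]
age = [34, 58, 41, 29, 62, 38, 55, 31, 44, 36]

t_stat, df = missingness_t_test(income, age)
print(f"t = {t_stat:.2f}, df = {df:.2f}")
```

A large absolute t value (judged against the t distribution with the returned degrees of freedom) would indicate that respondents with missing income differ in age from those without, which argues against a random pattern.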

Handling missing values:

1. Delete cases with missing data (a good idea if only a few cases are affected)

2. Delete variables containing missing values (a good idea if most of the missing values are concentrated in only a couple of variables; still problematic if those variables are important to the ultimate goal of the research)

3. Estimate missing values, using one of the following approaches:

   a. Prior knowledge: the researcher supplies an informed estimate for the missing value.

   b. Mean substitution: replace missing values with the mean (main concern: this lowers the calculated variance relative to the unknown actual variance). One variation uses group means for the missing values when the analysis involves group comparisons.

   c. Regression approach: use several IVs to explain the DV (the variable that contains the missing values), then predict the missing values from the IV values. Concerns include finding suitable IVs that explain the DV, and the fact that estimates obtained from prediction are more consistent with the scores used to predict them than the real values would be.

4. Whichever of the techniques above is used, the researcher has to verify that the solution has not changed the results of the analysis (run the tests with and without the treatment).
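The estimation approaches listed above can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the data are invented, missing values are represented as `None`, the function names are hypothetical, and the regression version uses a single predictor for brevity where the text calls for several IVs.

```python
import statistics

def mean_impute(values):
    """Replace each missing value with the mean of the observed values."""
    fill = statistics.mean(v for v in values if v is not None)
    return [fill if v is None else v for v in values]

def group_mean_impute(values, groups):
    """Replace each missing value with the mean of its own group.
    Assumes every group has at least one observed value."""
    by_group = {}
    for v, g in zip(values, groups):
        if v is not None:
            by_group.setdefault(g, []).append(v)
    group_means = {g: statistics.mean(vs) for g, vs in by_group.items()}
    return [group_means[g] if v is None else v for v, g in zip(values, groups)]

def regression_impute(y, x):
    """Fit y = intercept + slope * x on complete cases (least squares),
    then predict the missing y values from x."""
    pairs = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    x_bar = statistics.mean(xi for xi, _ in pairs)
    y_bar = statistics.mean(yi for _, yi in pairs)
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in pairs)
             / sum((xi - x_bar) ** 2 for xi, _ in pairs))
    intercept = y_bar - slope * x_bar
    return [intercept + slope * xi if yi is None else yi for xi, yi in zip(x, y)]

# Hypothetical sample: y has two missing values, x and group are complete.
y = [10.0, None, 14.0, 18.0, None, 22.0]
x = [1, 2, 3, 4, 5, 6]
group = ["a", "a", "a", "b", "b", "b"]

observed = [v for v in y if v is not None]
mean_filled = mean_impute(y)
group_filled = group_mean_impute(y, group)
reg_filled = regression_impute(y, x)

# Compare the variance before and after each treatment.
print("observed only :", statistics.variance(observed))
print("mean imputed  :", statistics.variance(mean_filled))
print("group imputed :", statistics.variance(group_filled))
print("regression    :", statistics.variance(reg_filled))
```

Comparing the printed variances shows the concern raised for mean substitution: the mean-imputed series has a smaller variance than the observed values alone. The same side-by-side comparison, applied to the actual analysis rather than just the variance, is one way to run the tests with and without the treatment.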

Posted Date: 3/4/2013 6:07:24 AM | Location : United States






