Missing data - reasons for screening data, Advanced Statistics

When data are missing, the researcher needs to test whether the pattern of the missing cases is random.

Create a dichotomous variable (missing vs. non-missing) for the variable in question. Then run a simple independent-samples t-test on a different variable in the collected sample, comparing the two groups, to see whether they differ significantly. A significant difference suggests that the missing cases are not random.
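A minimal sketch of this check in Python, using only the standard library. The data and the names `income`, `age`, and `missingness_t` are hypothetical and chosen purely for illustration. The function computes the pooled-variance independent-samples t statistic by hand; a statistics package would normally report the p-value as well.

```python
from math import sqrt
from statistics import mean, variance


def missingness_t(target, other):
    """Compare `other` between cases where `target` is missing (None)
    and cases where it is present.

    Returns (t statistic, degrees of freedom) for a pooled-variance
    independent-samples t-test.
    """
    # The dichotomous variable is implicit here: `t is None` vs. not None.
    missing = [o for t, o in zip(target, other) if t is None and o is not None]
    present = [o for t, o in zip(target, other) if t is not None and o is not None]
    n1, n2 = len(missing), len(present)
    if n1 < 2 or n2 < 2:
        raise ValueError("need at least two cases in each group")

    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(missing) + (n2 - 1) * variance(present)) / df
    std_err = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(missing) - mean(present)) / std_err, df


# Hypothetical sample: income has gaps, age is fully observed.
income = [52, None, 48, None, 61, 55, None, 47, 58, 50]
age = [34, 61, 29, 58, 41, 37, 65, 31, 39, 33]

t_stat, df = missingness_t(income, age)
print(f"t = {t_stat:.2f} on {df} degrees of freedom")
```

In this made-up sample the cases with missing income are clearly older, so the t statistic is large and the missingness would not be treated as random. With 8 degrees of freedom, an absolute t value above about 2.31 is significant at the 5% level (two-tailed).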

Handling missing values:

1. Delete cases with missing data (a good idea if only a few cases are affected).

2. Delete variables containing missing values (a good idea if most of the missing values are concentrated in only a couple of variables, but still problematic if those variables are important to the ultimate goal of the research).

3. Estimate the missing values, using one of the following approaches:

   a. Prior knowledge: use what is already known about the variable to supply an informed estimate.

   b. Mean replacement: replace missing values with the mean of the variable. Main concern: this lowers the calculated variance compared with the unknown actual variance. One variation, for analyses that compare groups, replaces each missing value with the mean of the group the case belongs to.

   c. Regression approach: treat the variable that contains the missing values as the dependent variable (DV) and use several other variables as independent variables (IVs) to explain it. Predict the missing values from the IV values. Concerns: proper IVs that actually explain the DV have to be found, and the estimates obtained from prediction are more consistent with the scores used to predict them than the real values would be.

4. Whichever of these techniques is used, the researcher has to verify that the solution has not changed the results of the analysis (run the tests with and without the treatment).
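The mean-replacement and regression approaches in the list above can be sketched in Python as follows. This is an illustration under stated assumptions, not a recommended implementation: the data and the names `score`, `hours`, `mean_impute`, and `regression_impute` are hypothetical, and the regression uses a single IV for brevity, whereas the text describes using several.

```python
from statistics import mean, variance


def mean_impute(values):
    """Replace each None with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    m = mean(observed)
    return [m if v is None else v for v in values]


def regression_impute(y, x):
    """Fill gaps in y (the DV) from a least-squares line on x (a single IV).

    The line is fitted on complete cases only, then used to predict
    the missing y values from their x values.
    """
    pairs = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    mx = mean(xi for xi, _ in pairs)
    my = mean(yi for _, yi in pairs)
    sxx = sum((xi - mx) ** 2 for xi, _ in pairs)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in pairs)
    slope = sxy / sxx
    intercept = my - slope * mx
    return [intercept + slope * xi if yi is None else yi for xi, yi in zip(x, y)]


# Hypothetical data: two scores are missing, hours is fully observed.
score = [10.0, None, 14.0, 18.0, None, 22.0]
hours = [1.0, 2.0, 3.0, 5.0, 6.0, 7.0]

observed = [v for v in score if v is not None]
mean_filled = mean_impute(score)
reg_filled = regression_impute(score, hours)

# Compare summaries with and without each treatment.
for label, data in [("observed only", observed),
                    ("mean replacement", mean_filled),
                    ("regression", reg_filled)]:
    print(f"{label:17s} mean={mean(data):.2f} variance={variance(data):.2f}")
```

The printed summary shows the concern raised for mean replacement: the filled-in series has a smaller variance than the observed values alone, because every imputed case sits exactly at the mean. The regression estimates fall exactly on the fitted line, which illustrates why they look more consistent with the predictor than real scores would.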

Posted Date: 3/4/2013 6:07:24 AM | Location : United States






