Missing data - reasons for screening data, Advanced Statistics


If any data are missing, the researcher needs to test whether the pattern of the missing cases is random.

Create a dichotomous variable that codes each case as non-missing or missing on a specific variable. Then run a simple independent samples t-test comparing these two groups on a different variable in the collected sample. A significant difference suggests that the missing cases are not random.
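The check described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a prescribed procedure: the function name `missingness_t_test` and the `income` and `age` data are hypothetical, missing values are assumed to be coded as `None`, and the classic pooled-variance form of the t statistic is assumed.

```python
from math import sqrt
from statistics import mean, variance

def missingness_t_test(target, other):
    """Pooled-variance t statistic comparing `other` between cases where
    `target` is missing (None) and cases where it is observed."""
    # Dichotomous indicator: True where the target variable is missing.
    is_missing = [value is None for value in target]
    # Split the other variable by the indicator, skipping its own gaps.
    miss = [y for y, m in zip(other, is_missing) if m and y is not None]
    obs = [y for y, m in zip(other, is_missing) if not m and y is not None]
    n1, n0 = len(miss), len(obs)
    if n1 < 2 or n0 < 2:
        raise ValueError("need at least two cases in each group")
    # Pooled variance, then the usual independent samples t statistic.
    pooled = ((n1 - 1) * variance(miss) + (n0 - 1) * variance(obs)) / (n1 + n0 - 2)
    t_stat = (mean(miss) - mean(obs)) / sqrt(pooled * (1 / n1 + 1 / n0))
    return t_stat, n1 + n0 - 2

# Hypothetical data: income is missing for some respondents; age is complete.
income = [52, None, 61, 48, None, 70, 55, None, 66, 59]
age = [34, 58, 41, 29, 63, 45, 38, 60, 44, 36]

t_stat, df = missingness_t_test(income, age)
print(f"t = {t_stat:.2f}, df = {df}")
```

The sketch returns the t statistic and its degrees of freedom; the p-value would come from a t table or a statistics package. In this made-up sample the respondents with missing income are clearly older, which is the kind of difference that argues against random missingness.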

Handling missing values:

1. Delete cases with missing data (a good idea if there are only a few missing cases)

2. Delete variables containing missing values (a good idea if most of the missing values are concentrated in only a couple of variables, but still problematic if those variables are important to the ultimate goal of the research)

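3. Estimate missing values, using one of the approaches in items 4 to 6

4. Prior knowledge: substitute a value that the researcher can justify from experience with the research area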

5. Replace missing values with the mean (main concern: this lowers the calculated variance compared to the unknown actual variance).
One variation, for analyses that compare groups, replaces each missing value with the mean of the group the case belongs to.

6. Regression approach: treat the variable that has the missing values as the DV and use several other variables as IVs to explain it. Predict the missing values from the IV values.

7. Concerns with the regression approach: finding proper IVs that explain the DV, and the fact that the estimates obtained from prediction are more consistent with the scores used to predict them than the real values would be.

8. Whichever of the techniques above is used, the researcher has to ascertain that the solution has not changed the results of the analysis (run the tests with and without the treatment).
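The estimation techniques in the list can be compared on a small example. The sketch below is only illustrative: all function names and data are hypothetical, missing values are assumed to be coded as `None`, and the regression uses a single IV fitted by ordinary least squares for brevity, whereas the list describes using several IVs.

```python
from statistics import mean, variance

def mean_impute(values):
    """Replace each None with the mean of the observed values."""
    fill = mean(v for v in values if v is not None)
    return [fill if v is None else v for v in values]

def group_mean_impute(values, groups):
    """Replace each None with the mean of the observed values in its own group.
    Assumes every group has at least one observed value."""
    observed = {}
    for v, g in zip(values, groups):
        if v is not None:
            observed.setdefault(g, []).append(v)
    fills = {g: mean(vs) for g, vs in observed.items()}
    return [fills[g] if v is None else v for v, g in zip(values, groups)]

def regression_impute(dv, iv):
    """Fit dv = intercept + slope * iv on complete cases (single IV, a
    simplification), then predict the missing dv values from iv."""
    pairs = [(x, y) for x, y in zip(iv, dv) if y is not None]
    x_bar = mean(x for x, _ in pairs)
    y_bar = mean(y for _, y in pairs)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in pairs)
             / sum((x - x_bar) ** 2 for x, _ in pairs))
    intercept = y_bar - slope * x_bar
    return [intercept + slope * x if y is None else y for x, y in zip(iv, dv)]

# Hypothetical data: two scores are missing; group and hours are complete.
score = [10, 12, None, 14, 20, None, 22, 24]
group = ["a", "a", "a", "a", "b", "b", "b", "b"]
hours = [1, 2, 3, 4, 5, 6, 7, 8]

observed = [v for v in score if v is not None]
by_mean = mean_impute(score)
by_group = group_mean_impute(score, group)
by_regression = regression_impute(score, hours)

# Compare the variance with and without each treatment (item 8).
for label, data in [("observed only", observed), ("overall mean", by_mean),
                    ("group mean", by_group), ("regression", by_regression)]:
    print(f"{label:>13}: variance = {variance(data):.2f}")
```

Printing the variance for each version makes the concern in item 5 visible: filling gaps with the overall mean adds cases that sit exactly at the centre, so the calculated variance drops below that of the observed values alone. The same with-and-without comparison can be repeated for whatever test the analysis actually depends on.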

Posted Date: 3/4/2013 6:07:24 AM | Location : United States






