Missing data - reasons for screening data, Advanced Statistics

When data are missing, the researcher needs to test whether the pattern of the missing cases is random.

Create a dichotomous variable (missing vs. non-missing) for a specific variable. Then run a simple independent-samples t-test on a different variable in the collected sample to see whether the two groups differ significantly. A significant difference suggests that the missingness is not random.
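
The check above can be sketched in a few lines of Python. This is a minimal illustration, not a full analysis: the variable names (`income`, `age`) and the data are hypothetical, `None` is assumed to mark a missing value, and the function returns only the pooled-variance t statistic and its degrees of freedom. The p-value would still have to come from a t table or a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def missingness_t_test(target, other):
    """Split `other` by whether `target` is missing (None) and return the
    pooled-variance t statistic and its degrees of freedom."""
    # Dichotomous split: cases where `target` is missing vs. observed.
    missing = [o for t, o in zip(target, other) if t is None and o is not None]
    present = [o for t, o in zip(target, other) if t is not None and o is not None]
    n1, n2 = len(missing), len(present)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two observed values")
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(missing) + (n2 - 1) * variance(present)) / df
    t_stat = (mean(missing) - mean(present)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t_stat, df


# Hypothetical sample: income has gaps, age is fully observed.
income = [52, None, 61, None, 48, 55, None, 70, 66, None]
age = [34, 58, 41, 62, 29, 37, 55, 45, 43, 60]

t_stat, df = missingness_t_test(income, age)
# Compare |t_stat| with the critical value for `df` degrees of freedom
# (about 2.306 for df = 8 at the two-tailed 5% level).
print(f"t = {t_stat:.2f}, df = {df}")
```

In this made-up sample the respondents with missing income are noticeably older, so the t statistic is large and the missingness would not be treated as random.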

Handling missing values:

1. Delete the cases with missing data (a good idea if only a few cases are affected).

2. Delete the variables that contain missing values (a good idea if most of the missing values are concentrated in only a couple of variables, but still problematic if those variables are important to the ultimate goal of the research).

3. Estimate the missing values, using one of the following:

   a. Prior knowledge.

   b. Mean substitution: replace missing values with the mean of the variable. Main concern: this lowers the calculated variance compared with the unknown actual variance. One variation uses group means for the missing values when the analysis compares groups.

   c. Regression: use several IVs to explain the DV (the variable that contains the missing values), then predict the missing values from the IV values. Concerns include finding IVs that actually explain the DV, and the fact that the predicted estimates are more consistent with the scores used to predict them than the real values would be.

4. Whichever technique is used, the researcher has to verify that it has not changed the results of the analysis (run the tests with and without the treatment).
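
To make the variance concern with mean replacement concrete, here is a small Python sketch. The scores, the group labels and the helper names `impute_mean` and `impute_group_mean` are all hypothetical, and `None` is assumed to mark a missing value.

```python
from statistics import mean, variance


def impute_mean(values):
    """Replace None with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def impute_group_mean(values, groups):
    """Replace None with the mean of the observed values in the same group.
    Assumes every group has at least one observed value."""
    group_means = {}
    for g in set(groups):
        observed = [v for v, gg in zip(values, groups) if gg == g and v is not None]
        group_means[g] = mean(observed)
    return [group_means[g] if v is None else v for v, g in zip(values, groups)]


# Hypothetical scores for two groups.
scores = [10, 12, None, 14, 20, None, 22, 24]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

observed = [s for s in scores if s is not None]
overall = impute_mean(scores)
by_group = impute_group_mean(scores, groups)

var_observed = variance(observed)   # variance of the values actually collected
var_overall = variance(overall)     # variance after overall-mean replacement
print(var_observed, var_overall)
```

The filled-in values sit exactly at the mean, so they add nothing to the sum of squared deviations while increasing the sample size, which is why the variance after replacement comes out smaller.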

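The regression approach, together with the check that a treatment has not changed the results, might look like the following sketch. It is a simplification under stated assumptions: a single IV stands in for the several IVs mentioned above, the data (`hours`, `score`) are hypothetical, `None` marks a missing DV value, and the "result" being compared is just the correlation between IV and DV.

```python
from math import sqrt
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def pearson_r(x, y):
    """Pearson correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


def impute_by_regression(iv, dv):
    """Fit the line on complete cases, then predict the missing DV values."""
    pairs = [(x, y) for x, y in zip(iv, dv) if y is not None]
    slope, intercept = fit_line([p[0] for p in pairs], [p[1] for p in pairs])
    return [intercept + slope * x if y is None else y for x, y in zip(iv, dv)]


# Hypothetical data: study hours (IV) and exam score (DV) with two gaps.
hours = [1, 2, 3, 4, 5, 6, 7, 8]
score = [52, 55, None, 61, 70, None, 74, 83]

filled = impute_by_regression(hours, score)

# Run the same analysis without and with the treatment.
complete_x = [x for x, y in zip(hours, score) if y is not None]
complete_y = [y for y in score if y is not None]
r_without = pearson_r(complete_x, complete_y)
r_with = pearson_r(hours, filled)
print(f"r without imputation: {r_without:.3f}, with imputation: {r_with:.3f}")
```

Because the predicted values lie exactly on the fitted line, the correlation after imputation is higher than on the complete cases alone. This is the "more consistent than the real values" concern in miniature, and it is the kind of shift the with-and-without comparison is meant to catch.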
