DATA ANALYSIS ASSIGNMENT - Analysis of Kaiser Permanente Data Set
This assignment uses a data set that was collected as part of the Kaiser Permanente study of the oldest old.
You should know the difference between Description and Hypothesis.
Describing the study precedes analysis
1. Study background (where, when, who, and how, i.e. study design)
2. Sample characteristics (homogeneous, comparison)
Hypothesis
1. Formalize research question
2. Draw conclusion
Internal and external comparison
1. Go to the SAS On-Demand for Academics website, log in to your account.
2. Create a new program by clicking on the second icon in the left window pane.
3. The first line of your program should be a LIBNAME statement. This will grant you access to the library that contains the Kaiser Permanente dataset (called "old").
4. Write a PROC CONTENTS step to see the names, types, and labels of the variables contained in this dataset. To do this, type:
PROC CONTENTS DATA = datalib.old;
RUN;
Do not forget the semicolons after "old" and "RUN"!
5. After you've highlighted the code for the procedure, click on the icon at the top that looks like a running man. This will run the procedure.
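Steps 3 and 4 together might look like the sketch below. The library path is a placeholder, not the real location: the assignment does not give it, so use the path supplied for your course. ACCESS=READONLY is an optional addition that prevents accidental changes to the shared data.

```sas
/* Hypothetical path: replace it with the library location for your course. */
LIBNAME datalib "/path/to/course/library" ACCESS=READONLY;

/* List the variable names, types, and labels in the "old" dataset. */
PROC CONTENTS DATA = datalib.old;
RUN;
```

If the LIBNAME statement succeeds, the SAS log reports that the libref DATALIB was successfully assigned.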
USING THE BINOMIAL DISTRIBUTION TO CALCULATE PROBABILITY
6. Compute the probability of still being alive at the end of the study period (DTHFLAG = 0) and the 95% confidence interval for this probability.
What you should report:
- Frequency & percent of those dead and those not dead.
- P (DTHFLAG=0) - called "Proportion" in SAS.
- 95% Lower and Upper Confidence Limits - this is the 95% confidence interval.
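One way to obtain these numbers is PROC FREQ with the BINOMIAL option, sketched below. It assumes the datalib library from step 3 is already assigned and that DTHFLAG is coded 0/1 as described above. LEVEL='0' asks SAS to report the proportion for DTHFLAG = 0 rather than for the first level by default.

```sas
/* One-way frequency table for DTHFLAG plus a binomial proportion. */
/* LEVEL='0' requests the proportion of subjects with DTHFLAG = 0 (alive). */
/* ALPHA=0.05 gives 95% confidence limits (this is also the default). */
PROC FREQ DATA = datalib.old;
    TABLES DTHFLAG / BINOMIAL(LEVEL='0') ALPHA=0.05;
RUN;
```

The output lists the frequency table first, then the binomial proportion with asymptotic and exact confidence limits. The assignment does not say which set of limits to report, so check with your instructor.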
COMPARING TWO MEANS USING A 2-INDEPENDENT SAMPLES T-TEST
When to use this test: when you have one numeric variable and one categorical variable that only has two categories. You will be comparing the mean value of the continuous variable for the two independent groups.
7. Researchers want to know whether there is a statistically significant difference in mean age between those who died and those who did not. Conduct an independent-samples t-test to determine whether age (AGE_COMP) differs significantly between people who died and those who did not (DTHFLAG).
What you should report:
- Null hypothesis - H0: μ1 = μ2
- Alpha (α) value: this is the level of significance = P(Reject H0|H0 is true)
- Mean age of those who died (μ_Age for DTHFLAG = 1) and those who did not die (μ_Age for DTHFLAG = 0)
- t-value: this is the test statistic value
- p-value (Pr > |t|): this is the significance value
How to decide what p-value to report (depending on Equality of Variances results):
- The Pooled results are used when the variances of the two groups can be treated as equal.
- The Satterthwaite results are used when the variances of the two groups cannot be treated as equal.
Equality of Variances Test (using Pr > F):
- If p-value < 0.05, then treat the variances as unequal (report the Satterthwaite results)
- If p-value ≥ 0.05, then treat the variances as equal (report the Pooled results)
Your decision on whether to Reject or Fail to Reject the Null Hypothesis using the Decision Rule:
- If p-value < 0.05, then Reject the Null Hypothesis
- If p-value ≥ 0.05, then Fail to Reject the Null Hypothesis
Your answer to the original research question.
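A minimal sketch of the t-test for step 7, assuming the datalib library is assigned and using the variable names given above:

```sas
/* Two-independent-samples t-test of AGE_COMP by DTHFLAG. */
/* CLASS names the two-level grouping variable; VAR names the numeric outcome. */
PROC TTEST DATA = datalib.old ALPHA=0.05;
    CLASS DTHFLAG;
    VAR AGE_COMP;
RUN;
```

The output includes the mean age for each DTHFLAG group, the Pooled and Satterthwaite t-values with Pr > |t|, and the Equality of Variances table with Pr > F, which are the items listed in the reporting checklist.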
8. Compute the odds ratio of death (DTHFLAG) for males versus females (SEX).
What to report:
- The Odds Ratio and 95% Confidence Interval
- Report the odds ratio followed by the confidence interval in parentheses, in this format: 0.83 (0.73, 0.94). Use only two decimal places when reporting odds ratio information.
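The odds ratio can be requested from a two-way table in PROC FREQ with the RELRISK option, as sketched below under the assumption that the datalib library is assigned. How SEX is coded is not stated in the assignment. SAS builds the table from the sorted values of each variable, so check which sex appears in the first row and which DTHFLAG value appears in the first column before interpreting the direction of the ratio.

```sas
/* 2x2 table of SEX by DTHFLAG; RELRISK adds the odds ratio and its */
/* 95% confidence limits to the output. */
/* The odds ratio compares row 1 with row 2 on the odds of column 1. */
PROC FREQ DATA = datalib.old;
    TABLES SEX*DTHFLAG / RELRISK;
RUN;
```

If the table is oriented the opposite way from the comparison you need, the reciprocal of the odds ratio (and of each confidence limit, with the limits swapped) gives the other direction.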
9. Write a one-page summary of your analysis. Explain all your findings for steps 6, 7, and 8 in terms of both the point estimates (for example, the odds ratio) and the confidence limits (95% confidence intervals). Be as descriptive as possible and make sure to follow the 5-step process for hypothesis testing that you learned in Chapter 7. With this summary you will be moving from descriptive analysis to inferential analysis. You can now discuss the relationship between the two variables you are analyzing, whereas before you could only describe the individual variable you were analyzing.
Remember: For every statement you make about a variable in your interpretation, support that statement with evidence from the descriptive statistics you have produced. Also, this paper should be typed and double-spaced in a Word document and submitted in the dropbox for this assignment. As always, do not forget to type your name in the document!