STA60005 Statistical Practice Assignment - Swinburne University of Technology, Australia
Question 1 - Assume that the mean duration of pregnancy is 266 days with a standard deviation of 24 days. Researchers were interested in determining if women who smoke run the risk of a shorter duration of pregnancy. They intend to take a sample of 36 pregnant women who smoke to test if the mean duration of pregnancy differs from 266, using a level of significance of .05. Assume that the standard deviation of the duration of pregnancy for women who smoke in the population is 24 days.
a) State the statistical hypotheses.
b) Assume that the researchers intend to conduct the hypothesis test as defined by the statistical hypotheses in part a). If they choose to use a sample size of 36, will this study have sufficient power to detect a significant difference in mean duration of pregnancy if the actual effect is a decrease of 12 days? Assume that the standard deviation of duration of pregnancy for women who smoke in the population is 24 days and that α = .05.
In responding to this question, you must include the following:
The values of the sample mean, for samples of size 36, that would result in H0 being rejected. You must show all your workings.
A step by step explanation of your calculations for the power of this proposed study.
A response to the question of whether this planned study would have sufficient power, and provide justification for your conclusion.
A diagram illustrating the relationship between power, type 1 error and type 2 error for this question. In the diagram, you must include the following:
- sampling distribution of the mean if H0 is true
- the sampling distribution of the mean if the mean duration of pregnancy decreases by 12 days
- Shaded area with labelling corresponding to Pr(type 1 error), Pr(type 2 error) and Power complete with the probabilities.
- Label any critical values, the sample mean values below or above which would lead to the rejection of H0.
The diagram can be hand-drawn or drawn using any program of your choosing. If it is hand-drawn, take a photo of it with your phone and put the photo into your assignment (or use whatever other method you like to include the hand-drawn diagram in the assignment). The diagram does not need to be perfect; it just needs to illustrate the concepts. Please note that students tend to draw this diagram by hand.
c) If the researchers planned to decrease the sample size, would the power increase or decrease? Justify your answer by explaining what would change in the diagram you drew in part b). Please do not redo the diagram; simply explain how the diagram would change.
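The calculations that parts b) and c) ask for can be checked numerically. The following is a minimal sketch, not a model answer: it assumes a two-sided z test (the population standard deviation is given as known), and the function names and the comparison sample size of 16 are illustrative choices that do not appear in the assignment.

```python
from math import sqrt
from statistics import NormalDist

MU0 = 266.0    # mean duration under H0 (days)
MU1 = 254.0    # mean duration if the true effect is a 12-day decrease
SIGMA = 24.0   # population standard deviation (days)
ALPHA = 0.05   # two-sided level of significance


def rejection_bounds(n):
    """Return the (lower, upper) sample-mean cut-offs for a two-sided z test."""
    se = SIGMA / sqrt(n)                          # standard error of the mean
    z_crit = NormalDist().inv_cdf(1 - ALPHA / 2)  # upper critical z value
    return MU0 - z_crit * se, MU0 + z_crit * se


def power(n):
    """Probability of rejecting H0 when the true mean is MU1."""
    lower, upper = rejection_bounds(n)
    alt = NormalDist(mu=MU1, sigma=SIGMA / sqrt(n))  # sampling distribution under MU1
    return alt.cdf(lower) + (1 - alt.cdf(upper))


lower, upper = rejection_bounds(36)
print(f"Reject H0 if the sample mean is below {lower:.2f} or above {upper:.2f}")
print(f"Power with n = 36: {power(36):.3f}")
# n = 16 is a hypothetical smaller sample, used only to illustrate part c).
print(f"Power with n = 16: {power(16):.3f}")
```

A power of 0.80 is a commonly used benchmark for "sufficient", though the assignment notes may specify a different threshold. Comparing the two printed power values shows the direction of the change asked about in part c).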
Question 2 - In many underdeveloped countries, diarrhoea is a major public health problem, especially for babies. Diarrhoea leads to dehydration, which results in millions of deaths each year worldwide. An antacid medication has been shown to reduce diarrhoea in adults. Researchers in a South American country randomly allocated babies suffering from diarrhoea to two groups, the control and the treatment group. The medical staff administering the treatment and the patients (and their respective families) were unaware of whether the baby was in the control or treatment group.
Researchers were interested in determining if an antacid would reduce diarrhoea in infants suffering from diarrhoea.
In their study, all infants received the standard therapy for diarrhoea: oral rehydration. In addition to the rehydration, babies were randomly allocated to either receive the antacid or receive a placebo.
The total stool volume for each infant over the course of their illness was measured. To adjust for body size, the researchers divided the stool volumes by body weight to obtain their outcome of interest: stool output per kilogram of body weight. A lower value of stool output would be considered a positive outcome in terms of reducing diarrhoea. The data can be found in the file STA60005Ass1_2020.sav.
The following is a description of the variables in the data file:
Variable Name: Description of Variable
Stool: Stool output in grams per kilogram of body weight
Group: 0 = Control, 1 = Treatment
a) Is this an observational study or a randomized experiment? Justify your response. Is it possible to establish a causal link between a reduction in diarrhoea and taking an antacid using this data? Can the finding be generalized to a population? Justify your response.
b) Obtain and paste the descriptive statistics for the stool output for the control and treatment groups, as well as comparative boxplots of the distribution of stool output for both groups. Use Analyze → Descriptive Statistics → Explore.
i. Use the output to discuss descriptively whether the sample data support the researchers' hypothesis that the antacid can reduce diarrhoea in infants.
ii. State the formula for the standard error of the mean. Use the appropriate statistics obtained from the descriptives table from the Explore procedure to calculate the standard errors of the mean stool output of the control and treatment group. Correct use of statistical symbols is important here.
iii. For the Control group, state and give the appropriate symbols for the standard deviation and standard error of the stool output. Interpret these two statistics for this group. In your interpretations, use specifically the values for the control group, not just general definitions for the two terms.
iv. Using the information obtained from the Explore procedure, state the 95% confidence interval for the mean stool output for each group.
v. Which confidence interval in Question 2 b) iv is wider? Explain why.
vi. Based only on the information provided by the two confidence intervals in Question 2 b) iv, what could we conclude about the difference in the mean stool output in the population for those given antacid compared to those not given antacid if an appropriate t-test was conducted? Give an estimate (a range) for the p-value. Explain how you obtained your estimate for the p-value.
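For reference, the quantities named in parts ii to iv are conventionally written as below. This is a sketch in standard notation, which the assignment itself does not fix: $s$ is the sample standard deviation of a group, $n$ its sample size, $\bar{x}$ its sample mean, and $t_{n-1,\,0.975}$ the 97.5th percentile of the t distribution with $n-1$ degrees of freedom. The course notes may use different symbols.

```latex
% Standard error of the mean for one group
SE_{\bar{x}} = \frac{s}{\sqrt{n}}

% 95% confidence interval for that group's population mean
\bar{x} \pm t_{n-1,\,0.975} \cdot \frac{s}{\sqrt{n}}
```

Because the interval's half-width is a multiple of $s/\sqrt{n}$, a group with a larger standard deviation or a smaller sample size has a wider interval.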
c) Explore the distributions of stool output for both groups using appropriate techniques to determine if the distributions are approximately normally distributed. Include both a discussion of the visual and numerical techniques and also a conclusion.
d) Transform the stool variable using the log_e stool transformation. Use appropriate techniques to determine if log_e stool for each group is approximately normally distributed.
e) Conduct an independent samples t-test to determine if there are differences in stool output according to the group. Write a brief report to describe your findings. Use log_e stool as your dependent variable. Include an interpretation of the confidence interval for the difference as well as a conclusion that directly addresses the researchers' hypothesis that the antacid would reduce diarrhoea in infants suffering from diarrhoea.
f) Use an appropriate non-parametric test to determine if there are differences in stool output (not transformed) between the control and treatment group. Include a discussion of the relevant assumptions. Write a brief report summarizing your findings from the non-parametric test. Include a conclusion that directly addresses the researcher's hypothesis that an antacid would reduce diarrhoea in infants suffering from diarrhoea.
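The two analyses in parts e) and f) are meant to be run in SPSS, but the arithmetic behind them can be sketched in a few lines. The example below is illustrative only. It assumes the non-parametric test is the Mann-Whitney U test and that the t-test uses the pooled (equal-variance) form. The data values are invented, not taken from STA60005Ass1_2020.sav, and the function names are hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical stool outputs (g/kg), invented purely for illustration.
control = [260.0, 310.0, 180.0, 420.0, 150.0, 390.0]
treatment = [120.0, 200.0, 90.0, 240.0, 170.0, 130.0]


def pooled_t(a, b):
    """Equal-variance independent-samples t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))
    return t, na + nb - 2


def mann_whitney_u(a, b):
    """U statistic for sample a with a two-sided normal-approximation p-value.

    Tied values share an average rank; no tie or continuity correction is applied.
    """
    combined = sorted(a + b)
    rank = {}
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j] == combined[i]:
            j += 1
        rank[combined[i]] = (i + 1 + j) / 2  # average of 1-based ranks i+1 .. j
        i = j
    na, nb = len(a), len(b)
    u = sum(rank[x] for x in a) - na * (na + 1) / 2
    z = (u - na * nb / 2) / sqrt(na * nb * (na + nb + 1) / 12)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return u, p


# Part e): t statistic on the natural-log-transformed values.
t_stat, df = pooled_t([log(x) for x in control], [log(x) for x in treatment])
# Part f): rank-based comparison on the untransformed values.
u_stat, p_approx = mann_whitney_u(control, treatment)
print(f"t = {t_stat:.2f} on {df} degrees of freedom (log_e scale)")
print(f"U = {u_stat:.1f}, approximate two-sided p = {p_approx:.3f}")
```

The sketch stops short of a p-value for the t statistic because the Python standard library has no t distribution. The normal approximation for U is also crude for samples this small, so the SPSS output remains the figure to report.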
g) Which (if any) of the following tests would be the best choice for determining if there are differences in stool output between the groups? Support your decision by referring to appropriate graphs and/or statistics that you have already produced in the assignment. You must address each option and state why it would be appropriate or inappropriate.
i. Independent samples t-test with stool as the DV
ii. Independent samples t-test with log_e stool as the DV
iii. Non-parametric test with stool as the DV
Note: you must conduct the statistical analysis and write the reports summarizing the results for parts e) and f) irrespective of the results of testing the assumptions. Please follow the examples provided in the notes with respect to writing reports.
Please be specific in your descriptions of graphs: describe the features of the graph that lead you to the conclusion you make. For example, do not simply write that the graph indicates the distribution is normally distributed; point out the specific features of the graph that lead you to this conclusion. Likewise, be careful to quote the statistics that support your conclusion appropriately. It is not sufficient to simply write "...was statistically significant"; you must quote the appropriate test statistic, p-value, etc.