What is the mean number of differences per base-pair

Reference no: EM132281135

Experimental Design and Statistics Assignment

Section 1: Which test?

The Human Microbiome Project analyzed the diversity of the microbial communities that live in and on the human body by taking samples from healthy individuals and sequencing the DNA of the microbes present in different regions of the body (see image in attached file). This allowed the different taxa of bacteria present in each region to be identified, and the relative abundance of each taxon to be quantified.

There are many statistical questions that can be addressed with these data. For each research question below, state the null and alternative hypotheses and the test you would use, including the variables to be tested. Be as specific as possible, including whether the test should be one- or two-tailed where appropriate.

A) Is there a difference in the number of bacterial taxa present in the saliva of men and women? Assume that the number of taxa present follows a normal distribution, with the same standard deviation in men and women.

H0:

HA:

Test:

B) Do people tend to harbor more bacterial taxa on the skin behind their ears or in their elbows? Assume the measurements for the left and right sides for each area are combined for each person.

H0:

HA:

Test:

C) An earlier paper proposed that individuals could be classified as belonging to one of three "enterotypes" based on the types of bacteria present in their gut. Is there a difference in the frequencies of the three enterotypes between meat-eaters and vegetarians?

H0:

HA:

Test:

D) The enterotype hypothesis comes partly from the observation that the frequency distributions of some bacterial groups across individuals are bimodal. For one of these taxa, Prevotella, people have either fairly high frequencies or nearly undetectable levels; few people have moderate frequencies. You want to test whether the frequency of Prevotella in the gut is affected by dietary fat levels, so you talk to a friend who has been doing an unrelated study in which subjects were randomly assigned to either a high-fat or a low-fat diet. You do not know what each individual's Prevotella level was before the study began, but you can measure the current level.

H0:

HA:

Test:

Section 2: Snakes and Snails

A number of snake species in south-east Asia have evolved to prey extensively or exclusively on land snails, and have evolved special morphological features to facilitate extracting snails from their shells, including jaws with many teeth to grip the slippery, slimy beasts. Most snail shells coil to the right, so a snake with a similar asymmetry in its own morphology might have an advantage in predation.

Researchers measured the asymmetry in snake jaws across a number of snake species by counting the number of teeth on the right and left sides of the jaw (R and L, respectively) and calculating an asymmetry index: 100 × (R − L) / (R + L). This index was normally distributed within species.
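As a quick illustration of the index defined above, the following Python sketch computes it from a pair of tooth counts. The function name and the example counts are hypothetical and are not data from the study.

```python
def asymmetry_index(right: int, left: int) -> float:
    """Return 100 * (R - L) / (R + L) for right and left tooth counts."""
    total = right + left
    if total == 0:
        raise ValueError("R + L must be positive")
    return 100 * (right - left) / total


# Hypothetical counts: the same raw difference (R - L = 4) gives a
# different index depending on the total number of teeth.
print(asymmetry_index(24, 20))  # smaller jaw
print(asymmetry_index(44, 40))  # larger jaw
```

The two calls share the same raw difference but differ in total tooth count, which is the kind of comparison the index is built for.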

A) Why did they calculate the asymmetry index rather than just using R − L?

B) One of the snake species, Pareas iwasakii, had a mean asymmetry index in a sample of 28 snakes of 17.5, with a standard deviation of 8.5. Perform an appropriate test to determine if P. iwasakii shows significant asymmetry in tooth number. Be sure to clearly state your conclusions.

C) To test whether an asymmetrical jaw was helpful in predation against coiled snails, the researchers tested snake predation success on a number of different snails with either left- or right-handed shells. The snakes were scored based on the frequency with which they successfully extracted and ate the snails. Each snake was tested only on one type of shell. Using the data below, test whether the snakes are better at extracting snails with coils of one direction or the other. You may assume the extraction frequencies follow a normal distribution in each group.

                    left-handed shells    right-handed shells
Success Rate (%)    80  68  57  79        82  92  91  75


D) To improve the experiment above, a scientist decides to (1) test more snakes and (2) have each snake attempt to open both left- and right-handed shells. To simplify planning, she (3) tests each snake first on left-handed shells, then on right-handed shells. Describe the effects on sampling error and/or bias for each of the three modifications.
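For part B above, a test statistic can be computed directly from the summary statistics. The sketch below assumes a one-sample t-test against a null mean of 0 (no asymmetry); that choice of test and null value is an assumption, and the function name is hypothetical.

```python
import math


def one_sample_t(mean: float, sd: float, n: int, mu0: float = 0.0):
    """Return (t statistic, degrees of freedom) for H0: population mean = mu0."""
    if n < 2:
        raise ValueError("need at least two observations")
    standard_error = sd / math.sqrt(n)
    return (mean - mu0) / standard_error, n - 1


# Summary statistics quoted in part B; mu0 = 0 (no asymmetry) is an assumption.
t_stat, df = one_sample_t(mean=17.5, sd=8.5, n=28)
print(f"t = {t_stat:.2f}, df = {df}")
```

The p-value would then come from a t distribution with the reported degrees of freedom, using a table or a statistics library.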

Section 3: False and False

All of the following statements are false. Please correct the statement or explain the error.

A) According to the Central Limit Theorem, the larger the sample size, the closer a sample's distribution will be to the normal distribution.

B) An experiment with a larger sample size will always be more accurate, with less bias, than one with a smaller sample.

C) A scientist observed grizzly bears fishing for salmon in a stream. After the bear has left, she collects the fish carcasses and measures the jawbones of the fish to estimate their sizes. In a sample of 10 fish, she finds a mean jawbone length of 6.8 cm, with a standard deviation of 1.2 cm. Assuming jaw lengths in the population are normally distributed, her 95% confidence interval for the mean is 6.05 - 7.54 cm.

D) 6.8 cm is an unbiased estimate of the mean jaw length of the salmon in the stream.

E) In a case-control study of rates of smoking and lung cancer in Beijing, 126 of 226 smokers were found to have lung cancer, as compared to 35 lung cancer cases in a sample of 96 non-smokers. This means the odds ratio for lung cancer associated with smoking (in Beijing) is 1.53.

F) The odds ratio for lung cancer associated with smoking is much lower in China than in the United States. This suggests that lung cancer rates in China are lower than in the United States.
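Statement E turns on the difference between an odds ratio and a ratio of proportions. The following sketch computes both from the counts quoted in E; the function and parameter names are hypothetical, and "exposed" here simply means smokers.

```python
def odds_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Odds of being a case among the exposed divided by the odds among the unexposed."""
    exposed_odds = exposed_cases / (exposed_total - exposed_cases)
    unexposed_odds = unexposed_cases / (unexposed_total - unexposed_cases)
    return exposed_odds / unexposed_odds


def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Proportion of cases among the exposed divided by the proportion among the unexposed."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)


# Counts quoted in statement E: 126 of 226 smokers, 35 of 96 non-smokers.
print(round(odds_ratio(126, 226, 35, 96), 2))
print(round(relative_risk(126, 226, 35, 96), 2))
```

Comparing the two printed values with the figure given in E shows which quantity the statement actually reports.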

Section 4: Oh, reporters!

Article - Putting a Value to 'Real' in Medical Research By NICHOLAS BAKALAR.

The first paragraph and the last one are mostly okay, though I might have some quibbles. The real trouble is in that middle paragraph.

A) Rewrite the first sentence of the second paragraph to make it accurate.

B) The last sentence of the second paragraph implies that a p-value of 0.06 indicates that a study's results "were probably due only to chance." Why is this incorrect?

C) If we actually wanted to quantify the probability that the results of an experiment were due to chance, what probability would we need to know in addition to the p-value?

Section 5: Elephant Evolution

Recently, it was discovered that African elephants, previously classified as one species, are actually two distinct species: the African forest elephant and the African savannah elephant. You want to get a sense of the rate at which DNA differences have accumulated between the forest and savannah elephants, so you sequence a 1000-base-pair segment of DNA from each of 100 genetic regions in the two species and count the number of differences between the species in each region. The results appear below.

Differences    Number of regions    Expected number of regions
0              34                   24.91
1              35                   34.62
2              17                   24.06
3              6                    11.15
4              3                    3.87
5              2                    1.08
8              1                    0.00861
9              1                    0.00133
13             1                    0.000000289

A) What is the mean number of differences per base-pair between the two species?

B) I have pre-calculated the expected values for the number of regions with a given number of differences. What distribution did I use? What is the null hypothesis associated with this set of expected values?

C) Perform the appropriate test of the null hypothesis and report your results.

D) By sequencing the Asian elephant and woolly mammoth, it is sometimes possible to identify whether a mutation that separates the two African species occurred in the ancestors of the forest elephants or savannah elephants. Looking at those mutations, we can then classify them by whether the mutation occurred in an amino acid coding region or between genes (intergenic). If 8 of 56 mutations that occurred in the forest elephant are from coding regions and 9 of 48 mutations in the savannah elephant are from coding regions, do the two species differ in the proportion of their mutations that occur in coding regions? Perform the appropriate test and report your results.
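The expected counts in the Section 5 table can be checked numerically. The sketch below assumes they come from a Poisson model whose mean equals the observed mean number of differences per region; that is an inference from the numbers, not something the assignment states, and the helper name is hypothetical.

```python
import math

# Observed table from Section 5: differences per region -> number of regions.
observed = {0: 34, 1: 35, 2: 17, 3: 6, 4: 3, 5: 2, 8: 1, 9: 1, 13: 1}

n_regions = sum(observed.values())
mean_per_region = sum(k * count for k, count in observed.items()) / n_regions


def poisson_expected(k: int, mean: float, n: int) -> float:
    """Expected number of regions with exactly k differences under Poisson(mean)."""
    return n * math.exp(-mean) * mean**k / math.factorial(k)


# Print differences, observed count, and expected count side by side.
for k in sorted(observed):
    print(k, observed[k], round(poisson_expected(k, mean_per_region, n_regions), 2))
```

Because several expected counts are far below 1, a goodness-of-fit test on these data would normally pool the sparse upper categories before computing the test statistic. Dividing the mean per region by the 1000 base pairs in each segment gives a per-base-pair figure.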

Attachment:- Assignment Files.rar


Instructions: Circle your final answers, and be sure they are on the front of the page (near the question they are the answer to), so it is clear which part they correspond to. For statistical tests, always be sure to show test statistics, degrees of freedom (when appropriate), and p values. Include units where appropriate. There are 100 points on the exam. Just as before, if you guess within 10 points of your actual score, you will receive a 5 point bonus.


Show your work! I can’t give partial credit if the only thing I see is an incorrect final answer. You may use all available space, including the back of the page, but make sure I am able to follow your work and where it was done, and, to repeat, put your final answer on the front. You will need to use a calculator, but you should still show all the steps of your calculations, in case you mistype something along the way (and so I can tell if your answer differs because of rounding errors). Good luck!
