What is the mean number of differences per base-pair

Assignment Help Applied Statistics
Reference no: EM132281135

Experimental Design and Statistics Assignment

Section 1: Which test?

The Human Microbiome Project analyzed the diversity of the microbial communities that live in and on the human body by taking samples from healthy individuals and sequencing the DNA of the microbes that were present in different regions of the body (see image in attached file). This allowed the identification of the different taxa of bacteria present in each region, as well as quantification of the relative number of each taxon.

There are many statistical questions that can be addressed with these data. For each research question below, state the null and alternative hypotheses and the test you would use, including the variables to be tested. Be as specific as possible, including whether the test should be one- or two-tailed where appropriate.

A) Is there a difference in the number of bacterial taxa present in the saliva of men and women? Assume that the number of taxa present follows a normal distribution, with the same standard deviation in men and women.

H0:

HA:

Test:

B) Do people tend to harbor more bacterial taxa on the skin behind their ears or in their elbows? Assume the measurements for the left and right sides for each area are combined for each person.

H0:

HA:

Test:

C) An earlier paper proposed that individuals could be classified as belonging to one of three "enterotypes" based on the types of bacteria present in their gut. Is there a difference in the frequencies of the three enterotypes among meat-eaters and vegetarians?

H0:

HA:

Test:

D) The enterotype hypothesis comes partly from the observation that the distribution of frequencies of some bacterial groups across individuals is bimodal; for one of these taxa, Prevotella, people have either fairly high frequencies of Prevotella or nearly undetectable levels. Few people have moderate frequencies. You want to test whether the frequency of Prevotella in the gut is affected by dietary fat levels, so you talk to a friend who has been doing an unrelated study where subjects were randomly assigned to either a high- or low-fat diet. You do not know what each individual's Prevotella level was before the study began, but you can measure the current level.

H0:

HA:

Test:

Section 2: Snakes and Snails

A number of snake species in south-east Asia have evolved to prey extensively or exclusively on land snails, and have evolved special morphological features to facilitate extracting snails from their shells, including jaws with many teeth to grip the slippery, slimy beasts. Most snail shells coil to the right, so a snake with a similar asymmetry in its own morphology might have an advantage in predation.

Researchers measured the asymmetry in snake jaws across a number of snake species by counting the number of teeth on the right and left side of the jaw (R and L, respectively) and calculating an asymmetry index: 100 × (R - L)/(R + L). This index was normally distributed within species.
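As a quick illustration of the index formula, the sketch below computes 100 × (R - L)/(R + L) for two made-up snakes. The function name `asymmetry_index` and the tooth counts are hypothetical and are not taken from the study; they only show how the index behaves when the raw difference R - L is the same but the total number of teeth differs.

```python
def asymmetry_index(right: int, left: int) -> float:
    """Return 100 * (R - L) / (R + L) for right and left tooth counts."""
    total = right + left
    if total == 0:
        raise ValueError("R + L must be positive")
    return 100 * (right - left) / total


# Hypothetical tooth counts (not from the study): both snakes have R - L = 4,
# but the snake with fewer teeth overall gets a larger index.
small_jaw = asymmetry_index(12, 8)
large_jaw = asymmetry_index(27, 23)
print(small_jaw, large_jaw)
```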

A) Why did they calculate the asymmetry index rather than just using R-L?

B) One of the snake species, Pareas iwasakii, had a mean asymmetry index in a sample of 28 snakes of 17.5, with a standard deviation of 8.5. Perform an appropriate test to determine if P. iwasakii shows significant asymmetry in tooth number. Be sure to clearly state your conclusions.

C) To test whether an asymmetrical jaw was helpful in predation against coiled snails, the researchers tested snake predation success on a number of different snails with either left- or right-handed shells. The snakes were scored based on the frequency with which they successfully extracted and ate the snails. Each snake was tested only on one type of shell. Using the data below, test whether the snakes are better at extracting snails with coils of one direction or the other. You may assume the extraction frequencies follow a normal distribution in each group.

 

                    left-handed shells    right-handed shells
Success Rate (%)    80, 68, 57, 79        82, 92, 91, 75











D) To improve the experiment above, a scientist decides to (1) test more snakes, (2) have each snake attempt to open both left- and right-handed shells. To simplify planning, she tests each snake (3) first on left-handed shells, then on right-handed shells. Describe the effects on sampling error and/or bias for each of the three modifications.
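The arithmetic behind parts B and C can be sketched as follows. This is one possible approach, not a model answer: part B is treated as a one-sample t-test against a null mean index of 0, and part C as a pooled two-sample t-test, which assumes equal variances in the two groups (the question states normality only, so a Welch test would be a reasonable alternative). The grouping of the success rates is also an assumption: the first four listed values are read as left-handed shells and the last four as right-handed shells. Critical values are copied from a standard t table rather than computed.

```python
import math
from statistics import mean, variance

# Part B: one-sample t statistic, null hypothesis mean index = 0 (no asymmetry).
n, sample_mean, sample_sd = 28, 17.5, 8.5
t_one = (sample_mean - 0) / (sample_sd / math.sqrt(n))
df_one = n - 1
T_CRIT_27 = 2.052  # two-tailed 5% critical value for 27 df (standard t table)

# Part C: pooled two-sample t statistic.
# ASSUMPTION: first four listed values = left-handed, last four = right-handed.
left = [80, 68, 57, 79]
right = [82, 92, 91, 75]
n1, n2 = len(left), len(right)
# statistics.variance is the sample variance (divides by n - 1).
pooled_var = ((n1 - 1) * variance(left) + (n2 - 1) * variance(right)) / (n1 + n2 - 2)
t_two = (mean(left) - mean(right)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
df_two = n1 + n2 - 2
T_CRIT_6 = 2.447  # two-tailed 5% critical value for 6 df (standard t table)

print(f"B: t = {t_one:.2f}, df = {df_one}, reject H0: {abs(t_one) > T_CRIT_27}")
print(f"C: t = {t_two:.2f}, df = {df_two}, reject H0: {abs(t_two) > T_CRIT_6}")
```

Under these assumptions the statistic in B is far beyond the critical value, while the one in C falls short of it; an exact p-value would need a t-distribution table or a statistics package.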

Section 3: False and False

All of the following statements are false. Please correct the statement or explain the error.

A) According to the Central Limit Theorem, the larger the sample size, the closer a sample's distribution will be to the normal distribution.

B) An experiment with a larger sample size will always be more accurate, with less bias, than one with a smaller sample.

C) A scientist observed grizzly bears fishing for salmon in a stream. After the bear has left, she collects the fish carcasses and measures the jawbones of the fish to estimate their sizes. In a sample of 10 fish, she finds a mean jawbone length of 6.8 cm, with a standard deviation of 1.2 cm. Assuming jaw lengths in the population are normally distributed, her 95% confidence interval for the mean is 6.05 - 7.54 cm.

D) 6.8 cm is an unbiased estimate of the mean jaw length of the salmon in the stream.

E) In a case-control study of rates of smoking and lung cancer in Beijing, 126 of 226 smokers were found to have lung cancer, as compared to 35 lung cancer cases in a sample of 96 non-smokers. This means the odds ratio for lung cancer associated with smoking (in Beijing) is 1.53.

F) The odds ratio for lung cancer associated with smoking is much lower in China than in the United States. This suggests that lung cancer rates in China are lower than in the United States.
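For statements C and E, it can help to redo the arithmetic. The sketch below is a check under stated assumptions rather than an official answer: for C it compares an interval built with the t critical value for 9 degrees of freedom against one built with the normal value 1.96 (both critical values are copied from standard tables), and for E it computes both the odds ratio and the ratio of proportions from the counts given in the statement. All variable names are illustrative.

```python
import math

# Statement C: 95% confidence interval for the mean, n = 10, sample SD known.
n, xbar, sd = 10, 6.8, 1.2
se = sd / math.sqrt(n)
T_CRIT_9 = 2.262  # two-tailed 5% critical value of t with 9 df (standard table)
Z_CRIT = 1.96     # two-tailed 5% critical value of the standard normal
ci_t = (xbar - T_CRIT_9 * se, xbar + T_CRIT_9 * se)
ci_z = (xbar - Z_CRIT * se, xbar + Z_CRIT * se)

# Statement E: counts as given in the statement.
smokers_cancer, smokers_total = 126, 226
nonsmokers_cancer, nonsmokers_total = 35, 96
odds_smokers = smokers_cancer / (smokers_total - smokers_cancer)
odds_nonsmokers = nonsmokers_cancer / (nonsmokers_total - nonsmokers_cancer)
odds_ratio = odds_smokers / odds_nonsmokers
risk_ratio = (smokers_cancer / smokers_total) / (nonsmokers_cancer / nonsmokers_total)

print(f"C: t-based CI = {ci_t[0]:.2f} to {ci_t[1]:.2f} cm")
print(f"C: z-based CI = {ci_z[0]:.2f} to {ci_z[1]:.2f} cm")
print(f"E: odds ratio = {odds_ratio:.2f}, ratio of proportions = {risk_ratio:.2f}")
```

The z-based interval is close to the one quoted in statement C, and the ratio of proportions is close to the 1.53 quoted in statement E, which suggests where each figure may have come from.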

Section 4: Oh, reporters!

Article - Putting a Value to 'Real' in Medical Research By NICHOLAS BAKALAR.

The first paragraph and the last one are mostly okay, though I might have some quibbles. The real trouble is in that middle paragraph.

A) Rewrite the first sentence of the second paragraph to make it accurate.

B) The last sentence of the second paragraph implies that a p-value of 0.06 indicates that a study's results "were probably due only to chance." Why is this incorrect?

C) If we actually wanted to quantify the probability that the results of an experiment were due to chance, what probability would we need to know in addition to the p-value?

Section 5: Elephant Evolution

Recently, it was discovered that African elephants, previously classified as one species, are actually two distinct species: the African forest elephant and the African savannah elephant. You want to get a sense of the rate at which differences have accumulated in DNA between the forest and savannah elephants, so you sequence 1000 base pair segments of DNA from each of 100 genetic regions in the two species, and count the number of differences between the species. The results appear below.

Differences    Number of regions    Expected number of regions
0              34                   24.91
1              35                   34.62
2              17                   24.06
3              6                    11.15
4              3                    3.87
5              2                    1.08
8              1                    0.00861
9              1                    0.00133
13             1                    0.000000289

A) What is the mean number of differences per base-pair between the two species?

B) I have pre-calculated the expected values for the number of regions with a given number of differences. What distribution did I use? What is the null hypothesis associated with this set of expected values?

C) Perform the appropriate test of the null hypothesis and report your results.

D) By sequencing the Asian elephant and woolly mammoth, it is sometimes possible to identify whether a mutation that separates the two African species occurred in the ancestors of the forest elephants or savannah elephants. Looking at those mutations, we can then classify them by whether the mutation occurred in an amino acid coding region or between genes (intergenic). If 8 of 56 mutations that occurred in the forest elephant are from coding regions and 9 of 48 mutations in the savannah elephant are from coding regions, do the two species differ in the proportion of their mutations that occur in coding regions? Perform the appropriate test and report your results.
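The calculations for this section can be sketched in a few lines. This is a hedged outline under several assumptions, not a definitive solution: the expected counts are taken to come from a Poisson distribution with the sample mean as its rate (the sketch reproduces the tabulated 24.91 for zero differences on that basis), categories with four or more differences are pooled so that every expected count is at least 5, and part D is treated as a chi-square contingency test without a continuity correction. The critical values are copied from a standard chi-square table.

```python
import math

# Observed table: number of differences -> number of regions.
observed = {0: 34, 1: 35, 2: 17, 3: 6, 4: 3, 5: 2, 8: 1, 9: 1, 13: 1}
BASES_PER_REGION = 1000

# Part A: mean differences per region and per base pair.
n_regions = sum(observed.values())
total_diffs = sum(k * count for k, count in observed.items())
mean_per_region = total_diffs / n_regions
mean_per_bp = mean_per_region / BASES_PER_REGION


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of exactly k events with rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


# Parts B and C: expected counts for 0, 1, 2, 3 and a pooled "4 or more" class.
# ASSUMPTION: pooling rule chosen so that all expected counts are >= 5.
expected = [n_regions * poisson_pmf(k, mean_per_region) for k in range(4)]
expected.append(n_regions - sum(expected))
observed_pooled = [observed[k] for k in range(4)]
observed_pooled.append(n_regions - sum(observed_pooled))
chi2_gof = sum((o - e) ** 2 / e for o, e in zip(observed_pooled, expected))
df_gof = len(expected) - 1 - 1  # one extra df lost for estimating the mean
CHI2_CRIT_3 = 7.815  # 5% critical value, 3 df (standard table)

# Part D: rows = forest, savannah; columns = coding, intergenic.
table = [[8, 56 - 8], [9, 48 - 9]]
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)
chi2_2x2 = sum(
    (table[i][j] - row_totals[i] * col_totals[j] / grand_total) ** 2
    / (row_totals[i] * col_totals[j] / grand_total)
    for i in range(2)
    for j in range(2)
)
CHI2_CRIT_1 = 3.841  # 5% critical value, 1 df (standard table)

print(f"A: {mean_per_region:.2f} per region, {mean_per_bp:.5f} per base pair")
print(f"C: chi2 = {chi2_gof:.2f}, df = {df_gof}, reject H0: {chi2_gof > CHI2_CRIT_3}")
print(f"D: chi2 = {chi2_2x2:.3f}, df = 1, reject H0: {chi2_2x2 > CHI2_CRIT_1}")
```

A different pooling rule in part C, or Fisher's exact test in part D, would change the numbers somewhat, so the choices should be stated alongside the results.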

Attachment:- Assignment Files.rar


Instructions: Circle your final answers, and be sure they are on the front of the page (near the question they are the answer to), so it is clear which part they correspond to. For statistical tests, always be sure to show test statistics, degrees of freedom (when appropriate), and p values. Include units where appropriate. There are 100 points on the exam. Just as before, if you guess within 10 points of your actual score, you will receive a 5 point bonus.


Show your work! I can’t give partial credit if the only thing I see is an incorrect final answer. You may use all available space, including the back of the page, but make sure I am able to follow your work and where it was done, and, to repeat, put your final answer on the front. You will need to use a calculator, but you should still show all the steps of your calculations, in case you mistype something along the way (and so I can tell if your answer differs because of rounding errors). Good luck!


All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd