Define zero-order Markov model for sequence

Reference no: EM131008526

Question 1. Profile HMMs for sequence families

a) Define the match (M), insert (I), and delete (D) states for the multiple sequence alignment (MSA) shown in Figure 1.

b) Derive the parameters of the profile HMM for the MSA given in Figure 1:
I. Emission counts for match states
II. Emission counts for insert states
III. Counts of transitions between states
IV. Emission probabilities for match, insert, and hidden states

Figure 1. Multiple sequence alignment of five DNA sequences

T--CT-
-AA-TA
T--CTA
TC-G-A
C-CGAC

Feel free to use Durbin's Figure 5.7c format
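
The counting steps in Question 1 can be sketched in code. The following is a minimal Python sketch (not the R script requested later in the assignment) and rests on two assumptions that the question itself does not state: a column is treated as a match column when fewer than half of its rows are gaps, and emission probabilities use an add-one pseudocount. The function names (`match_columns`, `state_path`, `profile_counts`, `to_probabilities`) and the state labels B and E for begin and end are illustrative choices.

```python
from collections import Counter

# Alignment copied from Figure 1; '-' marks a gap.
MSA = ["T--CT-", "-AA-TA", "T--CTA", "TC-G-A", "C-CGAC"]


def match_columns(msa):
    """Flag a column as a match column when fewer than half its rows are gaps
    (assumed rule, not given in the assignment)."""
    n_rows = len(msa)
    return [sum(seq[j] == "-" for seq in msa) < n_rows / 2
            for j in range(len(msa[0]))]


def state_path(seq, is_match):
    """Translate one aligned row into its state path, e.g. B, M1, I1, D2, ..., E."""
    path, k = ["B"], 0
    for ch, match in zip(seq, is_match):
        if match:
            k += 1
            path.append(("D" if ch == "-" else "M") + str(k))
        elif ch != "-":
            path.append("I" + str(k))  # gaps in insert columns add no state
    path.append("E")
    return path


def profile_counts(msa):
    """Return match-column flags, emission counts per state, and transition counts."""
    is_match = match_columns(msa)
    emissions, transitions = {}, Counter()
    for seq in msa:
        path = state_path(seq, is_match)
        residues = [ch for ch in seq if ch != "-"]
        emitting = [s for s in path if s[0] in "MI"]  # only M and I states emit
        for state, ch in zip(emitting, residues):
            emissions.setdefault(state, Counter())[ch] += 1
        transitions.update(zip(path, path[1:]))
    return is_match, emissions, transitions


def to_probabilities(counts, alphabet="ACGT", pseudocount=1):
    """Convert counts to probabilities with an add-one pseudocount (assumption)."""
    total = sum(counts[a] for a in alphabet) + pseudocount * len(alphabet)
    return {a: (counts[a] + pseudocount) / total for a in alphabet}


is_match, emissions, transitions = profile_counts(MSA)
print(is_match)
print(emissions)
print(dict(transitions))
print(to_probabilities(emissions["M1"]))
```

Under the assumed gap rule, columns 1, 4, 5, and 6 of Figure 1 become match states M1 to M4, and columns 2 and 3 are both absorbed into insert state I1. A different rule for choosing match columns would change every count downstream, so the rule used should be stated in the answer.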

Question 2. Provide a 1-1.5 page review of the paper "Genome-wide genetic marker discovery and genotyping using next-generation sequencing", available under this week's course content.

Some guidelines:
- Underline the main points of the paper.
- Keep your work structured.
- While focusing on the big picture, keep in mind that our class is about statistical processes.

Question 3. Use the <HW3_solution_reviewN.pptx> file available in this week's course content to write and submit an R script which will:

a. Define the HMM for Q4 in Homework 3
b. Parse the Homework 3 Q4 sequence to show the sequence of hidden states using the Viterbi algorithm

Homework 3 Solution, Question 4:

a) Define a zero-order Markov model for sequence2_A2, which represents a portion of the non-coding sequence of Mycobacterium tuberculosis (refer to course content).

Zero-order model for sequence2_A2:
Symbol  Count  Probability
P(A)    107    0.195255474
P(C)    156    0.284671533
P(G)    183    0.333941606
P(T)    102    0.186131387
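
A zero-order model is simply the relative frequency of each base, so the probabilities listed for sequence2_A2 can be checked from the counts. A minimal Python sketch, where `COUNTS` and `zero_order_model` are illustrative names and the counts are the ones given in the solution:

```python
# Base counts for sequence2_A2 as given in the Homework 3 solution.
COUNTS = {"A": 107, "C": 156, "G": 183, "T": 102}


def zero_order_model(counts):
    """Return P(base) = count(base) / total count for each base."""
    total = sum(counts.values())
    return {base: n / total for base, n in counts.items()}


model = zero_order_model(COUNTS)
for base, p in model.items():
    print(f"P({base}) = {p:.9f}")
```

The four counts sum to 548, and dividing each count by that total reproduces the listed probabilities.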

b) Use the zero-order Markov models defined for sequence1_A2 and sequence2_A2, and apply the Viterbi algorithm to find the most likely path for the sequence CGCGTTACTTCAATG without taking frame into consideration.

Assume:
Initial transition probabilities
a0c = a0n = 0.5
State transition probabilities
acc = 0.55
acn = 0.45
ann = 0.5
anc = 0.5

where aij is the transition probability from state i to state j, c is the coding state, and n is the non-coding state.

Sequence:              CGCGTTACTTCAATG
Path of hidden states: CCCCNNCCNNCCCCC
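
The Viterbi step can be sketched as follows. This is a minimal Python sketch in log space, not the R script the assignment asks for. The transition and initial probabilities and the non-coding emissions are taken from the text above. The coding emissions for sequence1_A2 are not given here, so `EMIT_CODING_PLACEHOLDER` is a hypothetical uniform stand-in that must be replaced with the real values before the result can be compared with the path shown above.

```python
import math


def viterbi(seq, states, start, trans, emit):
    """Return the most likely hidden-state path for seq, computed in log space."""
    scores = {s: math.log(start[s]) + math.log(emit[s][seq[0]]) for s in states}
    backpointers = []
    for ch in seq[1:]:
        new_scores, ptr = {}, {}
        for s in states:
            # Best predecessor r for state s at this position.
            best = max(states, key=lambda r: scores[r] + math.log(trans[r][s]))
            new_scores[s] = (scores[best] + math.log(trans[best][s])
                             + math.log(emit[s][ch]))
            ptr[s] = best
        scores = new_scores
        backpointers.append(ptr)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: scores[s])]
    for ptr in reversed(backpointers):
        path.append(ptr[path[-1]])
    return "".join(reversed(path))


STATES = ("C", "N")                      # C = coding, N = non-coding
START = {"C": 0.5, "N": 0.5}             # a0c, a0n
TRANS = {"C": {"C": 0.55, "N": 0.45},    # acc, acn
         "N": {"N": 0.5, "C": 0.5}}      # ann, anc
EMIT_NONCODING = {"A": 0.195255474, "C": 0.284671533,
                  "G": 0.333941606, "T": 0.186131387}
# Hypothetical placeholder: the sequence1_A2 model is not listed in this text.
EMIT_CODING_PLACEHOLDER = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
EMIT = {"C": EMIT_CODING_PLACEHOLDER, "N": EMIT_NONCODING}

print(viterbi("CGCGTTACTTCAATG", STATES, START, TRANS, EMIT))
```

Working with log probabilities avoids numerical underflow on longer sequences; for a 15-base sequence, multiplying raw probabilities would also work.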

Attachment:- post.xlsx




