Characteristic of the transition and transversion mutations

Reference no: EM13981457

PART A: Term Papers # A.1 & A.2

NEEDLEMAN-WUNSCH (NW) & SMITH-WATERMAN (SW) ALGORITHMS

Each student is assigned a sequence pair (1 & 2) as in Table TP-NW/SW Exercises. Develop a computer algorithm and build a code (MatLab or C/C++) to perform the NW/SW algorithm based computations indicated for each student, based on the comparison between the two given sequences. Elucidate the optimal pathway (that is, the optimal global/local alignment).

If you are not familiar with programming, you may do the calculations by hand and give your step-by-step results. Due credit will be given.

Table TP-NW/SW Exercises

Pairs   Sequence pairs for SW exercise    NW/SW (A.1 & A.2)
1       F F T E E Q S D I E D N C Q T     NW
2       D F T Q E T E D I E D N C Q Q
1       F R F Q N T I L D G G A E E G     SW
2       F Q F Q N T I S Y Y G G E L D

General Format of TPs:

You are required to supplement your answers (term papers) with appropriate and relevant (state-of-the-art) details plus the particulars as needed. Each term paper should include the following:

• One page Executive Summary

• An elaborate description of the topic assigned with relevant references. You may supplement your answers and augment your concepts with appropriate cross-references as necessary. All such references should be clearly identified and listed in a standard format such as the IEEE journal publication format. Any Web page reference can be shown by its title and web-site address. You are encouraged to append hard copies of such references to your solutions.

Term Paper # B.1: Descriptive Study Projects

EXERCISE B.1.D

Use the following link to find the sequence of yeast clone #71020.

https://genome-www4.stanford.edu/cgi-bin/SGD/getSeq?map=a3map&seq=71020&flankl=&flankr=&rev=

Use Genscan to find the ORFs in this sequence using "Vertebrate" as your organism.

How many complete genes are there?
How many of the complete genes have introns?
How many amino acids are there in ORF #3?

Copy the predicted protein sequence from ORF #3 and use that sequence to perform an appropriate search to determine the identity of the protein and the gene that encodes it.

What is the name of the gene that encodes this protein?

Based on the gene acronym (and other information that you might have already found), in what molecular process do you suppose this gene is involved?

Locate the DNA sequence of the gene. There are many ways to do this, but all of them should get you to the same answer. As a hint to be sure you're on the right track, the first few bases of the ORF are ATGGCAAAAACG.

PAIRWISE SEQUENCE COMPARISON: Dot-plot, Needleman-Wunsch (NW) and Smith-Waterman (SW) Algorithms

Using Dotlet Program

The reason why we wrote dotlet is that we needed a diagonal plot tool for the December 1998 practical sessions in bioinformatics at the Institute of Biochemistry. Since we had decided to base all the practical sessions on the World-Wide Web, we needed a program that would run in a web browser. To our knowledge, there was none, so we wrote it.

Reference: T. Junier and M. Pagni, "Dotlet: diagonal plots in a Web browser," Bioinformatics (Applications Note), vol. 16, no. 2, 2000, pp. 178-179.


Problem B.1 mutational changes

Construct a matrix of the set {A, C, T, G} to illustrate the characteristic of the transition and transversion mutations.

(Hint: You may use a score of 100% to depict the matrix elements that correspond to no mutation and use prorated percentages for the other elements to illustrate the characteristic above. The ratio of spontaneous transitions to transversions is approximately 2:1. Therefore, each transition should have a probability of 2/3 and each transversion 1/3.)
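As a rough illustration of the hint, the following Python sketch builds a 4 x 4 matrix over {A, C, T, G}. It assumes the hint's figures are applied literally (100% on the diagonal, 2/3 for each transition, 1/3 for each transversion); the function names `classify` and `substitution_matrix` are hypothetical, and a different prorating scheme may be intended.

```python
# Transitions stay within a chemical class (purine <-> purine, pyrimidine <-> pyrimidine);
# transversions swap classes.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = "ACTG"

def classify(x, y):
    """Label a base change as 'none', 'transition' or 'transversion'."""
    if x == y:
        return "none"
    same_class = {x, y} <= PURINES or {x, y} <= PYRIMIDINES
    return "transition" if same_class else "transversion"

def substitution_matrix(transition=2 / 3, transversion=1 / 3):
    """Assumed scoring: 100% for no mutation, prorated percentages otherwise."""
    score = {
        "none": 100.0,
        "transition": 100 * transition,
        "transversion": 100 * transversion,
    }
    return {x: {y: round(score[classify(x, y)], 1) for y in BASES} for x in BASES}

matrix = substitution_matrix()
print("    " + "  ".join(f"{b:>5}" for b in BASES))
for x in BASES:
    print(f"{x:>3} " + "  ".join(f"{matrix[x][y]:5.1f}" for y in BASES))
```

Each row then contains one transition entry (A with G, C with T) and two transversion entries, which is the pattern the matrix is meant to show.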

Problem # C.1

For the two binary sequences X and Y indicated above in Problem C.13, plot the Kullback-Leibler (KL) measure between the strings. Hence confirm the most common substring locations between them, as decided via the HD measure in the previous problem.

(Hint: Again, select a window of size 4. For a given sequence in each window, calculate the KL measure. Plot window # versus KL = KL1 + KL2 for each string.

$KL_1 = \left[p(0)\,\log_e\frac{p(0)}{q(1)}\right]_{\text{window \#1}} + \cdots$

$KL_2 = \left[q(1)\,\log_e\frac{q(1)}{p(0)}\right]_{\text{window \#1}} + \cdots$

p(0): probability of 0 in that window; q(1): probability of 1 in that window.)
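The windowed calculation in the hint might be sketched in Python as follows. Several things here are assumptions rather than part of the problem statement: the windows are taken as non-overlapping, probabilities of exactly zero are clamped to a small `eps` so the logarithm stays defined, and the input string is a made-up example because sequences X and Y are not reproduced in this section.

```python
import math

def windowed_kl(bits, window=4, eps=1e-9):
    """Return KL1 + KL2 for each non-overlapping window of a binary string."""
    values = []
    for start in range(0, len(bits) - window + 1, window):
        chunk = bits[start:start + window]
        # p(0) and q(1) as defined in the hint; clamped away from zero (assumption)
        p0 = max(chunk.count("0") / window, eps)
        q1 = max(chunk.count("1") / window, eps)
        kl1 = p0 * math.log(p0 / q1)
        kl2 = q1 * math.log(q1 / p0)
        values.append(kl1 + kl2)
    return values

# Hypothetical binary string, for illustration only
example = "00110001"
for number, kl in enumerate(windowed_kl(example), start=1):
    print(f"window {number}: KL = {kl:.4f}")
```

A balanced window such as 0011 gives KL = 0, while a skewed window such as 0001 gives a positive value, so plotting the values against the window number highlights where the composition of the string changes.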

Problem D.1

Construct a dot-plot for the following pair of sequences using the matrix methods described in the example:

x:         G T G A C C G C T A A C C T C

y:         G T T G C G A C T G C G G C G T
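The example referred to in the problem is not reproduced here, so the Python sketch below assumes the simplest matrix method: an identity matrix with a window of one residue, where a cell is marked whenever the two bases are equal. The names `dot_matrix` and `render` are hypothetical.

```python
def dot_matrix(x, y):
    """Rows follow x, columns follow y; 1 marks identical residues."""
    return [[1 if a == b else 0 for b in y] for a in x]

def render(x, y):
    """Draw the matrix as text, with '*' for a dot and '.' for an empty cell."""
    rows = ["  " + " ".join(y)]
    for a, row in zip(x, dot_matrix(x, y)):
        rows.append(a + " " + " ".join("*" if cell else "." for cell in row))
    return "\n".join(rows)

x = "GTGACCGCTAACCTC"
y = "GTTGCGACTGCGGCGT"
print(render(x, y))
```

Runs of dots along a diagonal indicate stretches that the two sequences share; a larger window or a match threshold could be added to suppress isolated dots.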

Problem D.1(A):

Construct a dot-plot for the following pair of sequences using dotlet or any other compatible open-source program.

x:         G T G A C C G C T A A C C T C A C G T T A C

y:         T T T G C G A C T G C G G C G T C C C T A A G C

-----------------------------------------------------------------------------------------------------------------------

Problem D.2

Assigned is a pair of nucleotide (RNA) sequences (S and T). Determine the best global alignment.

S: C U U A C G C A

T: A U G A G A A C U U  

Problem D.3

Given a sequence pair, X and Y, as indicated below, determine the best global alignment via trace-back using the NW algorithm.

X:        G A G C A

Y:        G A T T C A
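A minimal Python sketch of NW scoring and trace-back is given below for the pair X and Y of this problem. The scoring scheme (match +1, mismatch -1, linear gap -1) is an assumption, since the problem does not state one; with a different scheme the score and the alignment may change.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):          # first column: only gaps in b
        score[i][0] = i * gap
    for j in range(1, m + 1):          # first row: only gaps in a
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to the origin
    i, j, top, bottom = n, m, [], []
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))

score, top, bottom = needleman_wunsch("GAGCA", "GATTCA")
print(score)
print(top)
print(bottom)
```

When several paths tie, this trace-back prefers the diagonal move, so it reports one optimal alignment out of possibly several.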

Problem D.4

Given a sequence pair, U and V, as indicated below, determine the best global alignment via trace-back using the NW algorithm.

U:        C T C G T

V:        C T A A G T

Problem D.5

Via hand calculations, perform an NW-algorithm based comparison between the two given sequences indicated below and elucidate the maximum pathway:

M A V R K L S L E G

M S T A L P G L G S

Problem D.5(A):

Via hand calculations, perform NW-algorithm based comparison between the two given sequences and elucidate the maximum path-way:

Sequence Pair

W F G Q E T S A I S

S F T Q F S E D A I

Problem D.6

Given a sequence pair, X and Y, as shown below, determine the best local alignment via trace-back using the SW algorithm.

X:        W R N D C Q E G S A

Y:        W G Q E G S I E A
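The local-alignment procedure can be sketched in Python as below for the pair X and Y of this problem. As an assumption, a simple identity scheme is used (match +1, mismatch -1, linear gap -1); protein alignments normally use a substitution matrix such as BLOSUM or PAM, which the problem does not specify.

```python
def smith_waterman(a, b, match=1, mismatch=-1, gap=-1):
    """Local alignment; returns (best_score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    h = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = h[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Scores are floored at zero so an alignment can start anywhere
            h[i][j] = max(0, diag, h[i - 1][j] + gap, h[i][j - 1] + gap)
            if h[i][j] > best:
                best, best_pos = h[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero is reached
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and h[i][j] > 0:
        diag = h[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if h[i][j] == diag:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif h[i][j] == h[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))

best, top, bottom = smith_waterman("WRNDCQEGSA", "WGQEGSIEA")
print(best, top, bottom)
```

Under these assumed scores the trace-back stops at the first zero cell, so only the shared region is reported rather than an end-to-end alignment.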

Problem D.6(A) :

Given a sequence pair, U and V, as shown below, determine the best local alignment via trace-back using the SW algorithm.

U:        A A S T H E C W C T W H

V:        A A S R N P S C W T T W H T

Problem D.6(B):

Via hand calculation, perform an SW-algorithm based comparison between the two given sequences and elucidate the common regions of similarity.

Sequence Pair

W Y G Q E Q S Y I Q

W Y T Q E T S D I Q

Problem E.1

Translate the following regular expressions:

(a) [GA]-T-{C, G}(2)-X-[TGC]-G(3)-[TC]

(b) [TCG]-{A, C}(3)-P-x-[ATG]-x-[VIL]-[IVT]-x-[GS]-G-Y-S-[QL]-A

(c) [TAG]-XXAG-V-X(4)-{AEGD}-[AC]-x-V-x(4)-{ED}

(d) Write regular expression to match each string in the C terminus:

V or L, any (two to four times), A, T, any but D or E
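The patterns above follow a PROSITE-like notation: square brackets list allowed residues, curly braces list excluded residues, x stands for any residue, and a number in parentheses gives a repeat count. As a way of checking a hand translation, the Python sketch below converts such a pattern into an ordinary regular expression. The function name `prosite_to_regex` is hypothetical, and the pattern written for part (d) is one possible reading of the verbal description, not a given answer; anchoring to the C terminus is not handled.

```python
import re

def prosite_to_regex(pattern):
    """Convert a PROSITE-like pattern into Python regular-expression syntax."""
    parts = []
    for element in pattern.split("-"):
        element = element.strip()
        # Separate the residue part from an optional repeat such as (3) or (2,4)
        core, low, high = re.fullmatch(
            r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element
        ).groups()
        if core.startswith("["):            # allowed residues
            body = core
        elif core.startswith("{"):          # excluded residues
            body = "[^" + re.sub(r"[^A-Za-z]", "", core) + "]"
        else:                               # literal residues; x or X means any
            body = "".join("." if ch in "xX" else ch for ch in core)
            if low and len(body) > 1:
                body = "(?:" + body + ")"
        if low:
            body += "{" + low + ("," + high if high else "") + "}"
        parts.append(body)
    return "".join(parts)

print(prosite_to_regex("[GA]-T-{C, G}(2)-X-[TGC]-G(3)-[TC]"))
print(prosite_to_regex("[VL]-x(2,4)-A-T-{DE}"))
```

The resulting expressions can be tried against test strings with `re.fullmatch` to confirm that a hand translation accepts and rejects the sequences one expects.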

Problem E.2

(a) For the following multiple sequence alignment, construct the regular expression and expand it in terms of the 3-letter code for amino acids:

T E C V L A R T I
N G P V L A R T I
N G P T I T R T I
N G A V M M R T I
A E C V I C R T I
K E C V I C R T I
A E C T I C R T S
N P C V I A R T T
K E E V M M R T I

(b) For the following multiple sequence alignment, construct the regular expression and expand it in terms of the relevant nucleotide bases:

T C C T G A C A G
T G C G G A T A G
C C G T C T C T C
A G C G G A C T G
G T G T G A T G A
A C C T G A C T G
C G C T A A C T G
A G C G G A C T G
A C C G G G T T G

Problem F.1

Using the UPGMA concept, construct an evolutionary tree for the data on pairwise species differences indicated in the following table:

OTU    A    B    C    D    E
A      0    5   30   45   35
B           0   28   42   32
C                0   10   15
D                     0   20
E                          0
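The UPGMA clustering steps can be sketched in Python as follows, using the pairwise differences of Problem F.1. The sketch computes each cluster-to-cluster distance as the plain average over all leaf pairs and places each node at half the merge distance; the function name `upgma` and the nested-parenthesis labels are choices made for this illustration.

```python
from itertools import combinations

def upgma(leaf_dist):
    """leaf_dist maps frozenset({a, b}) to the distance between leaves a and b."""
    leaves = sorted({leaf for pair in leaf_dist for leaf in pair})
    clusters = [(leaf, (leaf,)) for leaf in leaves]   # (label, member leaves)
    merges = []

    def distance(c1, c2):
        # UPGMA: unweighted average over every leaf pair across the two clusters
        pairs = [(a, b) for a in c1[1] for b in c2[1]]
        return sum(leaf_dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: distance(*p))
        height = distance(c1, c2) / 2                  # node height above the leaves
        merged = (f"({c1[0]},{c2[0]})", c1[1] + c2[1])
        clusters = [c for c in clusters if c is not c1 and c is not c2] + [merged]
        merges.append((merged[0], height))
    return clusters[0][0], merges

upper = {
    "A": {"B": 5, "C": 30, "D": 45, "E": 35},
    "B": {"C": 28, "D": 42, "E": 32},
    "C": {"D": 10, "E": 15},
    "D": {"E": 20},
}
leaf_dist = {frozenset((a, b)): v for a, row in upper.items() for b, v in row.items()}
tree, merges = upgma(leaf_dist)
print(tree)
for label, height in merges:
    print(f"{label}: height {height:.2f}")
```

The list of merges gives the order in which clusters join and the height of each internal node, which is the information needed to draw the rooted tree by hand.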

Problem F.2

Using the UPGMA concept, construct an evolutionary tree for the data on pairwise species differences indicated in the following table:

OTU    A    B    C
B      4
C      4    2
D      8    8    6

Problem F.3

OTU    H    C    G    O    A
H      0   95  110  185  205
C           0  118  195  220
G                0  190  215
O                     0  215
A                          0

Using the data above, construct an un-rooted tree, formulating the lengths of the branches from the common ancestral node.

Problem F.4

Neighbor-Joining Method

Given the following state of evolutionary distances, create a distance matrix on the resulting taxa.

OTU    A    B    C    D    E
B      5
C     10   20
D     15   25   35
E     45   55   60   65
F     70   75   80   85   90

Hint:

Calculate the new distance matrix (m) for each pair of nodes.

$m(i, j) = d(i, j) - \frac{r(i) + r(j)}{N - 2}$, where N is the number of taxa and r(i) is the sum of the distances from taxon i to all other taxa.
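The hint can be illustrated with the short Python sketch below. It assumes the standard neighbor-joining form m(i, j) = d(i, j) - [r(i) + r(j)]/(N - 2), with r(i) taken as the sum of distances from taxon i to every other taxon, and it uses the lower-triangular distances of Problem F.4; the function name `nj_matrix` is hypothetical.

```python
def nj_matrix(names, d):
    """Return (r, m): net divergences and the adjusted distance for each pair."""
    n = len(names)
    r = {i: sum(d[frozenset((i, k))] for k in names if k != i) for i in names}
    m = {}
    for a in range(n):
        for b in range(a + 1, n):
            i, j = names[a], names[b]
            m[(i, j)] = d[frozenset((i, j))] - (r[i] + r[j]) / (n - 2)
    return r, m

names = list("ABCDEF")
rows = {                      # row taxon -> distances to A, B, C, ... in order
    "B": [5],
    "C": [10, 20],
    "D": [15, 25, 35],
    "E": [45, 55, 60, 65],
    "F": [70, 75, 80, 85, 90],
}
d = {frozenset((row, names[k])): v for row, vals in rows.items() for k, v in enumerate(vals)}

r, m = nj_matrix(names, d)
pair = min(m, key=m.get)      # the pair with the smallest m is joined first
print(r)
print(pair, m[pair])
```

After the first pair is joined, the same calculation would be repeated on the reduced matrix until only two nodes remain.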

Problem F.5

Trace the path for each sequence in HMM for the given MSA

A B - C D E

A B G C D E

A B - C - E

Hint:

HMMs and their variants have been used in gene prediction, pairwise and multiple sequence alignment, base-calling, modeling DNA sequencing errors, protein secondary structure prediction, ncRNA identification, RNA structural alignment, acceleration of RNA folding and alignment, fast noncoding RNA annotation, and many others.

Simulates a multiple sequence alignment of specified length. Deals with base-substitution only, not indels.


All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd