Characteristic of the transition and transversion mutations

Reference no: EM13981457

PART A: Term Papers # A.1 & A.2

NEEDLEMAN and WUNSCH (NW) & SMITH-WATERMAN (SW) ALGORITHMS

Each student is assigned a sequence pair (1 & 2) as in Table TP-NW/SW Exercises. Develop a computer algorithm and build a code (MatLab or C/C++) to perform the NW/SW algorithm based computations indicated for each student, based on the comparison between the two given sequences. Elucidate the optimal path-way (that is, the optimal global/local alignment).

If you are not familiar with any programming, you may do hand calculation and give your step-by-step results. Due credit will be given.

Table TP-NW/SW Exercises

Sequence | Sequence pairs for NW/SW exercise | NW/SW (A.1 & A.2)
1        | F F T E E Q S D I E D N C Q T     | NW
2        | D F T Q E T E D I E D N C Q Q     | NW
1        | F R F Q N T I L D G G A E E G     | SW
2        | F Q F Q N T I S Y Y G G E L D     | SW

General Format of TPs:

You are required to supplement your answers (term papers) with appropriate and relevant (state-of-the-art) details plus any particulars as needed. Each term paper should include the following:

• One page Executive Summary

• An elaborate description of the topic assigned, with relevant references. You may supplement your answers and augment your concepts with appropriate cross-references as necessary. All such references should be clearly identified and listed in a standard format such as the IEEE journal publication format. Any Web page reference can be shown by its title and web-site address. You are encouraged to append hard copies of such references to your solutions.

Term Paper # B.1: Descriptive Study Projects

EXERCISE B.1.D

Use the following link to find the sequence of yeast clone #71020.

https://genome-www4.stanford.edu/cgi-bin/SGD/getSeq?map=a3map&seq=71020&flankl=&flankr=&rev=

Use Genscan to find the ORFs in this sequence using "Vertebrate" as your organism.

How many complete genes are there?
How many of the complete genes have introns?
How many amino acids are there in ORF #3?

Copy the predicted protein sequence from ORF #3 and use that sequence to perform an appropriate search to determine the identity of the protein and the gene that encodes it.

What is the name of the gene that encodes this protein?

Based on the gene acronym (and other information that you might have already found), in what molecular process do you suppose this gene is involved?

Locate the DNA sequence of the gene. There are many ways to do this, but all of them should get you to the same answer. As a hint to be sure you're on the right track, the first few bases of the ORF are ATGGCAAAAACG.

PAIRWISE SEQUENCE COMPARISON Dot-plot, Needleman-Wunsch (NW) and Smith-Waterman (SW) Algorithms

Using Dotlet Program

The reason why we wrote dotlet is that we needed a diagonal plot tool for the December 1998 practical sessions in bioinformatics at the Institute of Biochemistry. Since we had decided to base all the practical sessions on the World-Wide Web, we needed a program that would run in a web browser. To our knowledge, there was none, so we wrote it.

Reference: T. Junier and M. Pagni, "Dotlet: diagonal plots in a Web browser," Bioinformatics (Applications Note), vol. 16, no. 2, 2000, pp. 178-179.

Problem B.1 mutational changes

Construct a matrix of the set {A, C, T, G} to illustrate the characteristic of the transition and transversion mutations.

(Hint: You may use a score of 100% to depict the matrix elements corresponding to no mutation, and prorated percentages to represent the other elements illustrating the characteristic as above. The ratio of spontaneous transitions to transversions is approximately 2:1. Therefore each transition should have a probability of 2/3 and each transversion 1/3.)
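One possible reading of the hint can be sketched in Python as follows. The sketch assumes that "prorated percentages" means scaling the 100% identity score by 2/3 for a transition (purine to purine, or pyrimidine to pyrimidine) and by 1/3 for a transversion; the function names `mutation_type` and `score_matrix` are illustrative only.

```python
# Sketch under stated assumptions: identity = 100 %, transition = 2/3 of that,
# transversion = 1/3 of that (one reading of the hint, not the only one).
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = ["A", "C", "T", "G"]  # order used in the problem statement


def mutation_type(x, y):
    """Classify the substitution x -> y."""
    if x == y:
        return "none"
    same_class = {x, y} <= PURINES or {x, y} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def score_matrix(identity=100.0):
    """Return a nested dict: matrix[x][y] = prorated percentage score."""
    weights = {"none": 1.0, "transition": 2.0 / 3.0, "transversion": 1.0 / 3.0}
    return {
        x: {y: identity * weights[mutation_type(x, y)] for y in BASES}
        for x in BASES
    }


if __name__ == "__main__":
    matrix = score_matrix()
    print("    " + "".join(f"{b:>8}" for b in BASES))
    for x in BASES:
        print(f"{x:>4}" + "".join(f"{matrix[x][y]:8.1f}" for y in BASES))
```

Each row then contains one identity entry, one transition entry and two transversion entries, which makes the purine/pyrimidine block structure of the matrix visible.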

Problem # C.1

For the two binary sequences X and Y indicated above in Problem C.13, plot the Kullback-Leibler (KL) measure between the strings. Hence confirm the most common substring locations between them, as decided via the HD measure in the previous problem.

(Hint: Again, select a window of size 4. For a given sequence in each window, calculate the KL measure. Plot window # versus KL = KL1 + KL2 for each string, where

$$KL_1 = \left[\, p(0)\,\log_e \frac{p(0)}{q(1)} \,\right]_{\text{window \#1}} + \cdots$$

$$KL_2 = \left[\, q(1)\,\log_e \frac{q(1)}{p(0)} \,\right]_{\text{window \#1}} + \cdots$$

p(0): probability of 0 in that window; q(1): probability of 1 in that window.)
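The windowed calculation in the hint can be sketched as below. Several points are assumptions, because the source does not fix them: the windows are taken as non-overlapping, the per-window value KL1 + KL2 is reported (rather than a running sum), and probabilities are clamped to a small `eps` so that windows of all zeros or all ones do not produce a division by zero. The example string is a placeholder, not the X or Y of Problem C.13.

```python
import math


def windowed_kl(bits, window=4, eps=1e-6):
    """Per-window KL = KL1 + KL2 for a binary string such as '00110001'.

    KL1 = p(0) * ln(p(0) / q(1)),  KL2 = q(1) * ln(q(1) / p(0)),
    where p(0) and q(1) are the fractions of 0s and 1s in the window.
    """
    values = []
    for start in range(0, len(bits) - window + 1, window):
        chunk = bits[start:start + window]
        p0 = chunk.count("0") / window
        q1 = chunk.count("1") / window
        # Assumption: clamp to avoid log(0) or division by zero.
        p0 = min(max(p0, eps), 1.0 - eps)
        q1 = min(max(q1, eps), 1.0 - eps)
        kl1 = p0 * math.log(p0 / q1)
        kl2 = q1 * math.log(q1 / p0)
        values.append(kl1 + kl2)
    return values


if __name__ == "__main__":
    example = "00110001"  # placeholder string, two windows of size 4
    for number, value in enumerate(windowed_kl(example), start=1):
        print(f"window {number}: KL = {value:.4f}")
```

Since KL1 + KL2 equals (p(0) - q(1)) ln(p(0)/q(1)), a balanced window gives zero and the value grows as the window becomes more one-sided.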

Problem D.1

Construct a dot-plot for the following pair of sequences using the matrix methods described in the example:

x:         G T G A C C G C T A A C C T C

y:         G T T G C G A C T G C G G C G T
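A dot matrix for this pair can be sketched in a few lines of Python. The "matrix methods described in the example" are not reproduced here, so the sketch assumes the simplest variant: a cell is marked whenever the two residues are identical, with no window or threshold filtering. The names `dot_matrix` and `render` are illustrative.

```python
def dot_matrix(x, y):
    """matrix[i][j] is 1 when x[i] == y[j], else 0 (no window filtering)."""
    return [[1 if a == b else 0 for b in y] for a in x]


def render(x, y):
    """Return a text dot-plot with x down the side and y across the top."""
    matrix = dot_matrix(x, y)
    lines = ["  " + " ".join(y)]
    for residue, row in zip(x, matrix):
        lines.append(residue + " " + " ".join("*" if cell else "." for cell in row))
    return "\n".join(lines)


if __name__ == "__main__":
    x = "GTGACCGCTAACCTC"
    y = "GTTGCGACTGCGGCGT"
    print(render(x, y))
```

Diagonal runs of `*` in the printed grid correspond to shared substrings; a window-and-threshold filter could be added if single-base matches make the plot too noisy.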

Problem D.1(A):

Construct a dot-plot for the following pair of sequences using Dotlet or any other compatible program available as open source.

x:         G T G A C C G C T A A C C T C A C G T T A C

y:         T T T G C G A C T G C G G C G T C C C T A A G C

-----------------------------------------------------------------------------------------------------------------------

Problem  D. 2

Assigned is a pair of sequences S and T (written in the RNA alphabet A, C, G, U). Determine the best global alignment.

S: C U U A C G C A

T: A U G A G A A C U U  

Problem D.3

Given a sequence pair, X and Y, as indicated below, determine the best global alignment via trace-back using the NW algorithm.

X:        G A G C A

Y:        G A T T C A
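A minimal Needleman-Wunsch sketch for this pair is shown below. The problem does not state a scoring scheme, so the values match = +1, mismatch = -1 and a linear gap penalty of -1 are assumptions; a different scheme can change both the score and the alignment. When several alignments are equally good, the trace-back here prefers the diagonal move and returns only one of them.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # F[i][j] = best score for aligning a[:i] with b[:j]
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = i * gap
    for j in range(1, m + 1):
        F[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            F[i][j] = max(F[i - 1][j - 1] + s,   # align a[i-1] with b[j-1]
                          F[i - 1][j] + gap,     # gap in b
                          F[i][j - 1] + gap)     # gap in a

    # Trace back from the bottom-right corner to the origin.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            s = match if a[i - 1] == b[j - 1] else mismatch
            if F[i][j] == F[i - 1][j - 1] + s:
                top.append(a[i - 1])
                bottom.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and F[i][j] == F[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return F[n][m], "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    score, row_x, row_y = needleman_wunsch("GAGCA", "GATTCA")
    print(score)
    print(row_x)
    print(row_y)
```

The same function can be reused for the other global-alignment problems with nucleotide sequences; for amino acid pairs a substitution matrix would normally replace the match/mismatch constants.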

Problem D.4

Given a sequence pair, U and V, as indicated below, determine the best global alignment via trace-back using the NW algorithm.

U:        C T C G T

V:        C T A A G T

Problem D.5

Via hand calculations, perform an NW-algorithm based comparison between the two given sequences indicated below and elucidate the maximum path-way:

M A V R K L S L E G

M S T A L P G L G S

Problem D.5(A):

Via hand calculations, perform NW-algorithm based comparison between the two given sequences and elucidate the maximum path-way:

Sequence Pair

W F G Q E T S A I S

S F T Q F S E D A I

Problem D.6

Given a sequence pair, X and Y, as shown below, determine the best local alignment via trace-back using the SW algorithm.

X:        W R N D C Q E G S A          Y:         W G Q E G S I E A
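A minimal Smith-Waterman sketch for this pair follows. For amino acid sequences a substitution matrix such as BLOSUM or PAM is the usual choice, but none is specified here, so the sketch assumes a simple identity scheme (match = +1, mismatch = -1, linear gap = -1). These values and the function name are assumptions for illustration.

```python
def smith_waterman(a, b, match=1, mismatch=-1, gap=-1):
    """Local alignment; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + s,
                          H[i - 1][j] + gap,
                          H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero is reached.
    top, bottom = [], []
    i, j = best_pos
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    score, row_x, row_y = smith_waterman("WRNDCQEGSA", "WGQEGSIEA")
    print(score)
    print(row_x)
    print(row_y)
```

Under this identity scheme the best local region is the shared run Q E G S; with a real substitution matrix and different gap costs the reported region may extend further.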

Problem D.6(A) :

Given a sequence pair, U and V, as shown below, determine the best local alignment via trace-back using the SW algorithm.

U:        AASTHECWCTWH              V:        AASRNPSCWTTWHT

Problem D.6(B):

Via hand calculation, perform an SW-algorithm based comparison between the two given sequences and elucidate the common regions of similarity.

Sequence Pair

W Y G Q E Q S Y I Q

W Y T Q E T S D I Q

Problem E.1

Translate the following regular expressions:

(a) [GA]-T-{C, G}(2)-X-[TGC]-G(3)-[TC]

(b) [TCG]-{A, C}(3)-P-x-[ATG]-x-[VIL]-[IVT]-x-[GS]-G-Y-S-[QL]-A

(c) [TAG]-XXAG-V-X(4)-{AEGD}-[AC]-x-V-x(4)-{ED}

(d) Write a regular expression to match each string at the C terminus:

V or L, any (two to four times), A, T, any but D or E
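The patterns above resemble PROSITE-style notation, and one way to check a translation is to convert them into ordinary regular expressions. The sketch below assumes the usual PROSITE conventions, which the problem itself does not spell out: `[..]` lists allowed residues, `{..}` lists excluded residues, `x` or `X` matches any residue, `(n)` or `(n,m)` is a repeat count, and `-` separates positions. The function name `prosite_to_regex` is hypothetical.

```python
import re


def prosite_to_regex(pattern):
    """Convert a PROSITE-style pattern into a Python regular expression."""
    out = []
    for element in pattern.replace(" ", "").split("-"):
        # Split "core(n)" or "core(n,m)" into the core and its repeat count.
        parsed = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = parsed.groups()
        if core.startswith("["):        # allowed residues
            unit = "[" + core[1:-1].replace(",", "") + "]"
        elif core.startswith("{"):      # excluded residues
            unit = "[^" + core[1:-1].replace(",", "") + "]"
        else:                           # literal residues and wildcards
            unit = "".join("." if c in "xX" else re.escape(c) for c in core)
            if low and len(core) > 1:
                unit = "(?:" + unit + ")"
        if low:
            unit += "{" + low + ("," + high if high else "") + "}"
        out.append(unit)
    return "".join(out)


if __name__ == "__main__":
    pattern_a = "[GA]-T-{C, G}(2)-X-[TGC]-G(3)-[TC]"
    regex_a = prosite_to_regex(pattern_a)
    print(regex_a)
    print(bool(re.fullmatch(regex_a, "GTAATTGGGC")))
```

Reading the generated expression position by position gives the verbal translation asked for: for pattern (a), "G or A, then T, then two residues that are neither C nor G, then any residue", and so on.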

Problem E.2

(a) For the following multiple sequence alignment, construct the regular expression and expand it in terms of the 3-letter code for amino acids:

T E C V L A R T I
N G P V L A R T I
N G P T I T R T I
N G A V M M R T I
A E C V I C R T I
K E C V I C R T I
A E C T I C R T S
N P C V I A R T T
K E E V M M R T I

(b) For the following multiple sequence alignment, construct the regular expression and expand it in terms of the relevant nucleotide bases:

T C C T G A C A G
T G C G G A T A G
C C G T C T C T C
A G C G G A C T G
G T G T G A T G A
A C C T G A C T G
C G C T A A C T G
A G C G G A C T G
A C C G G G T T G
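Deriving a pattern from an alignment can be sketched column by column. The sketch below reads the alignment of part (a) as nine sequences of nine residues each, which is an assumption about the original table layout, and writes a conserved column as a single letter and a variable column as a bracketed set. The helper names are illustrative, and the three-letter table covers only the residues that occur in this alignment.

```python
# Assumed layout of the part (a) alignment: nine rows, nine columns.
ROWS_A = [
    "TECVLARTI", "NGPVLARTI", "NGPTITRTI",
    "NGAVMMRTI", "AECVICRTI", "KECVICRTI",
    "AECTICRTS", "NPCVIARTT", "KEEVMMRTI",
]

# Standard three-letter codes for the residues used above.
THREE_LETTER = {
    "A": "Ala", "C": "Cys", "E": "Glu", "G": "Gly", "I": "Ile",
    "K": "Lys", "L": "Leu", "M": "Met", "N": "Asn", "P": "Pro",
    "R": "Arg", "S": "Ser", "T": "Thr", "V": "Val",
}


def alignment_to_pattern(rows):
    """One element per column: 'R' if conserved, '[ILM]' if variable."""
    elements = []
    for column in zip(*rows):
        residues = sorted(set(column))
        if len(residues) == 1:
            elements.append(residues[0])
        else:
            elements.append("[" + "".join(residues) + "]")
    return "-".join(elements)


def expand_three_letter(rows):
    """Same pattern, written with three-letter amino acid codes."""
    positions = []
    for column in zip(*rows):
        names = [THREE_LETTER[r] for r in sorted(set(column))]
        positions.append("/".join(names))
    return " - ".join(positions)


if __name__ == "__main__":
    print(alignment_to_pattern(ROWS_A))
    print(expand_three_letter(ROWS_A))
```

The same `alignment_to_pattern` function applies unchanged to the nucleotide alignment in part (b), since it only inspects which symbols occur in each column.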

Problem F.1

Using the UPGMA concept, construct an evolutionary tree for the data on pairwise species differences indicated in the following table:

OTU | A | B  | C  | D  | E
A   | 0 | 5  | 30 | 45 | 35
B   |   | 0  | 28 | 42 | 32
C   |   |    | 0  | 10 | 15
D   |   |    |    | 0  | 20
E   |   |    |    |    | 0
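The UPGMA procedure can be sketched as repeated merging of the closest pair of clusters, with the new distances computed as size-weighted averages. The sketch below uses the pairwise differences from the table above; the function name, the Newick-like labels and the convention that a node's height is half the distance between the merged clusters are choices made for illustration.

```python
from itertools import combinations


def upgma(labels, pairwise):
    """Return a list of (merged_cluster_label, node_height) in merge order.

    pairwise maps a 2-tuple of labels to their distance.
    """
    d = {frozenset(pair): float(value) for pair, value in pairwise.items()}
    size = {label: 1 for label in labels}  # leaves contained in each cluster
    merges = []
    while len(size) > 1:
        # Closest pair of current clusters.
        a, b = min(combinations(size, 2), key=lambda p: d[frozenset(p)])
        height = d[frozenset((a, b))] / 2.0
        new = "(" + a + "," + b + ")"
        # Size-weighted average distance from the new cluster to the others.
        for c in size:
            if c not in (a, b):
                d[frozenset((new, c))] = (
                    size[a] * d[frozenset((a, c))] + size[b] * d[frozenset((b, c))]
                ) / (size[a] + size[b])
        size[new] = size.pop(a) + size.pop(b)
        merges.append((new, height))
    return merges


if __name__ == "__main__":
    distances = {
        ("A", "B"): 5, ("A", "C"): 30, ("A", "D"): 45, ("A", "E"): 35,
        ("B", "C"): 28, ("B", "D"): 42, ("B", "E"): 32,
        ("C", "D"): 10, ("C", "E"): 15, ("D", "E"): 20,
    }
    for label, height in upgma("ABCDE", distances):
        print(f"{label}  height = {height:.3f}")
```

With this data the first two merges join A with B and C with D, then E joins the C-D cluster, and the two remaining clusters meet at the root. Ties, which can occur in other tables, are broken here by taking the first minimal pair found.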

Problem F.2

Using the UPGMA concept, construct an evolutionary tree for the data on pairwise species differences indicated in the following table:

OTU | A | B | C
B   | 4 |   |
C   | 4 | 2 |
D   | 8 | 8 | 6

Problem  F.3

OTU | H | C  | G   | O   | A
H   | 0 | 95 | 110 | 185 | 205
C   |   | 0  | 118 | 195 | 220
G   |   |    | 0   | 190 | 215
O   |   |    |     | 0   | 215
A   |   |    |     |     | 0

Using the data above, construct an un-rooted tree, formulating the lengths of the branches from the common ancestral node.

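Problem F.4

Neighbor-Joining Method

Given the following set of evolutionary distances, create a distance matrix on the resulting taxa.

OTU | A  | B  | C  | D  | E
B   | 5  |    |    |    |
C   | 10 | 20 |    |    |
D   | 15 | 25 | 35 |    |
E   | 45 | 55 | 60 | 65 |
F   | 70 | 75 | 80 | 85 | 90

Hint:

Calculate the new distance matrix (m) for each pair of nodes:

$$m(i, j) = d(i, j) - \frac{r(i) + r(j)}{N - 2}$$

where N is the number of taxa, d(i, j) is the distance between taxa i and j, and r(i) is the sum of the distances from taxon i to all other taxa.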
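The first neighbor-joining step for this table can be sketched as follows. It assumes the standard form of the criterion, m(i, j) = d(i, j) - [r(i) + r(j)] / (N - 2), where r(i) is the sum of the distances from taxon i to every other taxon; the pair with the smallest m value is joined first. The function and variable names are illustrative.

```python
from itertools import combinations

TAXA = ["A", "B", "C", "D", "E", "F"]

# Lower-triangular distances from the table, keyed by taxon pair.
DIST = {
    ("A", "B"): 5,
    ("A", "C"): 10, ("B", "C"): 20,
    ("A", "D"): 15, ("B", "D"): 25, ("C", "D"): 35,
    ("A", "E"): 45, ("B", "E"): 55, ("C", "E"): 60, ("D", "E"): 65,
    ("A", "F"): 70, ("B", "F"): 75, ("C", "F"): 80, ("D", "F"): 85, ("E", "F"): 90,
}


def nj_matrix(taxa, dist):
    """Return (r, m): net divergences r[i] and the adjusted matrix m[(i, j)]."""
    d = {frozenset(pair): value for pair, value in dist.items()}
    n = len(taxa)
    # r(i): sum of distances from taxon i to all other taxa
    r = {i: sum(d[frozenset((i, k))] for k in taxa if k != i) for i in taxa}
    m = {
        (i, j): d[frozenset((i, j))] - (r[i] + r[j]) / (n - 2)
        for i, j in combinations(taxa, 2)
    }
    return r, m


if __name__ == "__main__":
    r, m = nj_matrix(TAXA, DIST)
    for pair, value in sorted(m.items(), key=lambda item: item[1]):
        print(pair, value)
```

After the closest pair is joined into a new node, the distances to that node are recomputed and the calculation is repeated with N reduced by one, until only two nodes remain.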

Problem F.5

Trace the path for each sequence in HMM for the given MSA

A B - C D E

A B G C D E

A B - C - E

Hint:

HMMs and their variants have been used in gene prediction, pairwise and multiple sequence alignment, base-calling, modeling DNA sequencing errors, protein secondary structure prediction, ncRNA identification, RNA structural alignment, acceleration of RNA folding and alignment, fast noncoding RNA annotation, and many others.

Simulates a multiple sequence alignment of specified length. Deals with base-substitution only, not indels.
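The path of each sequence through a profile HMM can be sketched from the alignment itself. The sketch assumes a common rule of thumb that is not stated in the problem: a column counts as a match column when fewer than half of its rows are gaps, otherwise residues in it are emitted by the insert state that follows the previous match position. Begin and end states are omitted, and the state labels (M, I, D with a position index) are a notational choice.

```python
def profile_hmm_paths(msa, gap="-", threshold=0.5):
    """Return one state path (list of labels such as 'M1', 'I2', 'D4') per row."""
    n_rows, n_cols = len(msa), len(msa[0])
    # Assumption: match column = gap fraction below the threshold.
    is_match = [
        sum(row[c] == gap for row in msa) / n_rows < threshold
        for c in range(n_cols)
    ]
    paths = []
    for row in msa:
        path, k = [], 0  # k = index of the most recent match position
        for c in range(n_cols):
            if is_match[c]:
                k += 1
                path.append(("M" if row[c] != gap else "D") + str(k))
            elif row[c] != gap:
                path.append("I" + str(k))
        paths.append(path)
    return paths


if __name__ == "__main__":
    alignment = ["AB-CDE", "ABGCDE", "AB-C-E"]
    for sequence, path in zip(alignment, profile_hmm_paths(alignment)):
        print(sequence, "->", " ".join(path))
```

Under this rule the third column becomes an insert position, so only the second sequence visits an insert state, and only the third sequence visits a delete state.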
