Examine multiple variation parameters for a genomic region, Biology

Assignment Help:
  1. Determine SNP variation among the aligned DNA sequences for a genomic region. See below for how to count SNP variation. The output file (Your_name_snp.txt) should have two columns of numbers. The first column gives the total number of SNP sites per species, and the second gives the percentage of sequences/species having that same number of variant nucleotides.
  2. Determine in-del variation among the aligned DNA sequences for a genomic region. The output file (Your_name_in_del.txt) should have two columns of numbers. The first column gives the total number of in-del sites per species, and the second gives the percentage of sequences/species having that same number of in-dels.
  3. Determine overall variation (SNPs and in-dels) among the aligned DNA sequences for a genomic region. The output file (Your_name_both.txt) should have two columns of numbers. The first column gives the total number of variant sites (SNP and in-del) per species, and the second gives the percentage of sequences/species having that same number of variant sites. This will generate the same data used for the figure on page 3.
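The three tasks differ only in which kind of difference is counted, so one possible approach is to classify every alignment column once and then tally the results. The Python sketch below is a minimal illustration under stated assumptions: a SNP is a column where both sequences have a base and the bases differ, an in-del is a column where exactly one sequence has the gap character "-", counts are taken against one chosen reference sequence, and the two columns are tab-separated. The function names and the output formatting are hypothetical; the assignment does not specify them.

```python
from collections import Counter

GAP = "-"  # assumed gap symbol in the aligned sequences


def count_differences(a, b):
    """Return (snps, indels) for two aligned sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    snps = indels = 0
    for x, y in zip(a, b):
        if x == y:
            continue          # identical column (including gap vs gap)
        if x == GAP or y == GAP:
            indels += 1       # exactly one side is a gap
        else:
            snps += 1         # two different bases
    return snps, indels


def tally(counts):
    """Turn a list of per-sequence counts into sorted (count, percent) rows."""
    total = len(counts)
    freq = Counter(counts)
    return [(n, 100.0 * k / total) for n, k in sorted(freq.items())]


def format_rows(rows):
    """Format (count, percent) rows as two tab-separated columns."""
    return "\n".join(f"{n}\t{pct:.1f}" for n, pct in rows)


# Toy usage: three substitutions and one gap.
print(count_differences("ATGCA-GC", "AAAAATGC"))  # (3, 1)
```

Summing the two values returned by count_differences gives the combined count needed for the third output file.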

Sample Alignment: 48 bases, differences are highlighted

Seq1      ATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGC

Seq2      AAAAATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGC

Seq3      AAAAATGCATGCATGCA-GCATGCATGCATGCATGCATGCATGCATGC

Seq4      AAAAATGCATGCATGCA-GCATGCATGCATTTTTGCATGCATGCATGC

Seq5      AAAAATGCATGCATGCA-GCATGCATGCATTTTTGCAT-CATGCATGC

Computation: Compare Seq1 to Seq2, Seq3, Seq4, and Seq5 to find the differences (SNPs and in-dels).

Seq1:Seq1 = 0 changes

Seq1:Seq2 = 3 changes

Seq1:Seq3 = 4 changes

Seq1:Seq4 = 7 changes

Seq1:Seq5 = 8 changes
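The Seq1 comparisons above can be reproduced with a short Python sketch. It assumes that every column where the two aligned sequences differ counts as one change, whether the difference is a substitution or a gap. The function name count_changes is illustrative and not part of the assignment.

```python
# Sample alignment copied from the text above.
seqs = {
    "Seq1": "ATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGC",
    "Seq2": "AAAAATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGC",
    "Seq3": "AAAAATGCATGCATGCA-GCATGCATGCATGCATGCATGCATGCATGC",
    "Seq4": "AAAAATGCATGCATGCA-GCATGCATGCATTTTTGCATGCATGCATGC",
    "Seq5": "AAAAATGCATGCATGCA-GCATGCATGCATTTTTGCAT-CATGCATGC",
}


def count_changes(a, b):
    """Count alignment columns where two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for x, y in zip(a, b) if x != y)


# Compare Seq1 against every sequence, itself included.
for name, seq in seqs.items():
    print(f"Seq1:{name} = {count_changes(seqs['Seq1'], seq)} changes")
```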

Repeat using each of the other sequences as the basis for comparison.

Seq2:Seq1 = 3 changes                  Seq3:Seq1 = 4 changes

Seq2:Seq2 = 0 changes                  Seq3:Seq2 = 1 change

Seq2:Seq3 = 1 change                   Seq3:Seq3 = 0 changes

Seq2:Seq4 = 4 changes                  Seq3:Seq4 = 3 changes

Seq2:Seq5 = 5 changes                  Seq3:Seq5 = 4 changes

 

Seq4:Seq1 = 7 changes                  Seq5:Seq1 = 8 changes

Seq4:Seq2 = 4 changes                  Seq5:Seq2 = 5 changes

Seq4:Seq3 = 3 changes                  Seq5:Seq3 = 4 changes

Seq4:Seq4 = 0 changes                  Seq5:Seq4 = 1 change

Seq4:Seq5 = 1 change                   Seq5:Seq5 = 0 changes
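The full set of comparisons forms a symmetric 5 x 5 table with zeros on the diagonal. As a rough sketch, the table can be built in Python by looping over every ordered pair of sequences. The names matrix and count_changes are illustrative assumptions, and each differing column (substitution or gap) is counted as one change, as in the listing above.

```python
from itertools import product

# Sample alignment copied from the text above.
seqs = {
    "Seq1": "ATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGC",
    "Seq2": "AAAAATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGC",
    "Seq3": "AAAAATGCATGCATGCA-GCATGCATGCATGCATGCATGCATGCATGC",
    "Seq4": "AAAAATGCATGCATGCA-GCATGCATGCATTTTTGCATGCATGCATGC",
    "Seq5": "AAAAATGCATGCATGCA-GCATGCATGCATTTTTGCAT-CATGCATGC",
}


def count_changes(a, b):
    """Count alignment columns where two equal-length sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


names = list(seqs)

# matrix[(reference, other)] holds the number of differing columns.
matrix = {
    (ref, other): count_changes(seqs[ref], seqs[other])
    for ref, other in product(names, repeat=2)
}

# Print one row per reference sequence.
print("      " + " ".join(f"{n:>5}" for n in names))
for ref in names:
    print(f"{ref:<6}" + " ".join(f"{matrix[ref, other]:>5}" for other in names))
```

Because the count does not depend on which sequence is treated as the reference, only the pairs above the diagonal actually need to be computed; the loop over all ordered pairs is kept here to mirror the listing.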

 

Our input file is a FASTA-format file containing all sequences/species, already aligned and trimmed. The file contains some odd characters, so these will have to be handled before counting.
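The text does not say what the odd characters are, so the following Python sketch is only one possible way to handle them. It assumes the valid alphabet is A, C, G, T, N, and the gap "-"; it drops whitespace and non-printable characters, upper-cases the rest, and replaces any other unexpected symbol with "N" so that alignment columns stay in place. The function names read_fasta and clean and the replacement policy are assumptions, not requirements of the assignment.

```python
import re

# Assumed valid alphabet: bases, N for unknown, and "-" for gaps.
UNEXPECTED = re.compile(r"[^ACGTN-]")


def read_fasta(lines):
    """Parse FASTA text lines into {header: raw sequence string}."""
    records = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(chunks) for n, chunks in records.items()}


def clean(seq):
    """Drop invisible characters, then mask unexpected symbols with N."""
    visible = "".join(ch for ch in seq if ch.isprintable() and not ch.isspace())
    return UNEXPECTED.sub("N", visible.upper())


# Usage with in-memory lines; a real run would pass an open file instead.
example = [">Seq1\r\n", "ATGC\r\n", "atg?\n", ">Seq2\n", "A-G*\n"]
cleaned = {name: clean(seq) for name, seq in read_fasta(example).items()}
print(cleaned)
```

After cleaning, it is worth checking that all sequences still have the same length, since the column-by-column comparison depends on it.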


All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd