Bioinformatics for representing sequence annotation

Reference no: EM1389771

QUESTION 1

For each of the following tasks, use the 'xxxxxxxx' database. You need only provide the query or command you used within MySQL for each.

a. How many columns are in the cv table? (The command here doesn't need to return a number, just show the columns so you can count them.)

b. Which ID (cv_id) corresponds to the GO ontology stored in the database?

c. How many controlled vocabulary terms (cvterm table) are linked to the GO ontology?

d. How many entries in the feature table are linked to any of the GO terms? (See the feature_cvterm linking table.)
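The lookups in parts a to d can be sketched against a toy schema. This is only an illustration under several assumptions: Python's built-in sqlite3 module stands in for MySQL, the table and column names follow the Chado-style layout the questions hint at (cv.cv_id, cvterm.cv_id, feature_cvterm.cvterm_id), the vocabulary is assumed to be stored under the name 'GO', and every row is made up. In MySQL itself, part a would use a command such as DESCRIBE cv; rather than the SQLite PRAGMA shown here.

```python
import sqlite3

# Toy stand-in for the course database. Column names are assumed to be
# Chado-style; the real schema and its contents may differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cv (cv_id INTEGER PRIMARY KEY, name TEXT, definition TEXT);
CREATE TABLE cvterm (cvterm_id INTEGER PRIMARY KEY, cv_id INTEGER, name TEXT);
CREATE TABLE feature_cvterm (feature_cvterm_id INTEGER PRIMARY KEY,
                             feature_id INTEGER, cvterm_id INTEGER);
INSERT INTO cv VALUES (1, 'SO', NULL), (2, 'GO', NULL);
INSERT INTO cvterm VALUES (10, 1, 'gene'), (20, 2, 'kinase activity'),
                          (21, 2, 'nucleus');
INSERT INTO feature_cvterm VALUES (1, 100, 20), (2, 100, 21), (3, 101, 21),
                                  (4, 102, 10);
""")

# (a) List the columns of cv so they can be counted (SQLite equivalent of
#     MySQL's DESCRIBE).
cv_columns = [row[1] for row in conn.execute("PRAGMA table_info(cv)")]

# (b) Look up the cv_id by vocabulary name (the name 'GO' is an assumption).
go_cv_id = conn.execute("SELECT cv_id FROM cv WHERE name = 'GO'").fetchone()[0]

# (c) Count the terms that belong to that vocabulary.
go_term_count = conn.execute(
    "SELECT COUNT(*) FROM cvterm WHERE cv_id = ?", (go_cv_id,)
).fetchone()[0]

# (d) Count distinct features linked to any GO term via feature_cvterm.
go_feature_count = conn.execute(
    """SELECT COUNT(DISTINCT fc.feature_id)
       FROM feature_cvterm fc
       JOIN cvterm t ON t.cvterm_id = fc.cvterm_id
       WHERE t.cv_id = ?""",
    (go_cv_id,),
).fetchone()[0]

print(cv_columns, go_cv_id, go_term_count, go_feature_count)
```

Counting DISTINCT feature_id in part d is a design choice: it counts each feature once even when it carries several GO terms. Dropping DISTINCT would count links rather than features.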

QUESTION 2

The GFF3 format is a commonly-used one in bioinformatics for representing sequence annotation. You can find the specification here:

sequenceontology.org/gff3.shtml

Viewed in the vi editor, the top of an example GFF3 file looks like this:
______________________________________________________________________________
##gff-version 3
#date Tue Feb 8 19:50:12 2011
#
# Saccharomyces cerevisiae S288C genome
#
# Features from the 16 nuclear chromosomes labeled chrI to chrXVI,
# plus the mitochondrial genome labeled chrMito and the 2-micron plasmid.
#
# Created by Saccharomyces Genome Database
#
# Weekly updates of this file are available via Anonymous FTP from:
# ftp.yeastgenome.org/yeast/data_download/chromosomal_feature/saccharomyces_cerevisiae.gff
#

#

____________________________________________________________________________

Within the feature table, a column of particular note is the 9th, which stores any key=value pairs relevant to that row's feature, such as ID, Ontology_term or Note.
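As a rough illustration of how column 9 breaks down, the sketch below splits an attribute string into keys and values. It is written in Python purely for illustration, and the function name parse_attributes and the example string are hypothetical. It relies on the GFF3 conventions that pairs are separated by semicolons, multiple values by commas, and reserved characters are percent-encoded.

```python
from urllib.parse import unquote


def parse_attributes(column9):
    """Split a GFF3 column 9 string into a dict of key -> list of values."""
    attributes = {}
    for pair in column9.strip().split(";"):
        if not pair:
            continue  # tolerate a trailing semicolon
        key, _, value = pair.partition("=")
        # A single key may carry several comma-separated values.
        attributes[unquote(key)] = [unquote(v) for v in value.split(",")]
    return attributes


# Made-up attribute string for illustration only.
example = ("ID=YAR003W;Name=YAR003W;"
           "Ontology_term=GO:0005634,GO:0006348;"
           "Note=Subunit%20of%20a%20complex")
parsed = parse_attributes(example)
print(parsed["ID"], parsed["Ontology_term"], parsed["Note"])
```

Keeping every value as a list, even for single-valued keys such as ID, avoids special cases when a key like Ontology_term carries more than one term.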

Your task is to write a GFF3 feature exporter. A user should be able to run your script like this:

$ export_gff3_feature.pl /path/to/some.gff3 gene ID YAR003W

There are 4 arguments here: the path to the GFF3 file, followed by three values that correspond to GFF3 columns. In this case, your script should read the GFF3 file at the given path and find any gene (column 3) which has ID=YAR003W (column 9). When it finds one, it should use the coordinates for that feature (columns 4, 5 and 7: start, end and strand) and the FASTA sequence at the end of the document to return the feature's sequence in FASTA format.
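The coordinate step can be sketched on its own. The snippet below is a minimal Python illustration, not the Perl solution the task asks for; the function name and the toy sequence are hypothetical. It assumes that a feature on the minus strand should be returned as the reverse complement, which the task text does not state explicitly.

```python
def extract_feature_sequence(chromosome_seq, start, end, strand):
    """Return the feature's bases; start and end are 1-based and inclusive."""
    # GFF3 coordinates are 1-based inclusive; Python slices are 0-based
    # and half-open, hence start - 1.
    subseq = chromosome_seq[start - 1:end]
    if strand == "-":
        # Assumption: minus-strand features are reported as the
        # reverse complement of the reference bases.
        complement = str.maketrans("ACGTNacgtn", "TGCANtgcan")
        subseq = subseq.translate(complement)[::-1]
    return subseq


toy_chromosome = "ATGCCGTAAGGC"  # made-up sequence for illustration
plus = extract_feature_sequence(toy_chromosome, 4, 9, "+")
minus = extract_feature_sequence(toy_chromosome, 4, 9, "-")
print(plus)   # CCGTAA
print(minus)  # TTACGG
```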

Your script should work regardless of the parameters passed, warning the user if no features were found that matched their query. (It should also check and warn if more than one feature matches the query.)

The output should just be printed to STDOUT (no writing to a file is necessary).
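Putting the requirements together, one possible shape for the exporter is sketched below. It is written in Python for illustration rather than the Perl the task specifies, and the script name export_gff3_feature.py, the function names and the warning wording are all hypothetical. It assumes the sequences follow a ##FASTA directive, that warnings belong on STDERR so STDOUT carries only the FASTA record, and that the first match is used when several features match.

```python
import sys
from urllib.parse import unquote

COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def read_gff3(lines):
    """Return (feature rows, {seqid: sequence}) from the lines of a GFF3 file."""
    features, sequences, seqid, in_fasta = [], {}, None, False
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            in_fasta = True
        elif in_fasta and line.startswith(">"):
            seqid = line[1:].split()[0]
            sequences[seqid] = []
        elif in_fasta and seqid is not None:
            sequences[seqid].append(line.strip())
        elif not in_fasta and line and not line.startswith("#"):
            columns = line.split("\t")
            if len(columns) == 9:
                features.append(columns)
    return features, {name: "".join(parts) for name, parts in sequences.items()}


def has_attribute(column9, key, value):
    """True if column 9 contains key=value (values may be comma-separated)."""
    for pair in column9.split(";"):
        k, _, v = pair.partition("=")
        if unquote(k) == key and value in [unquote(x) for x in v.split(",")]:
            return True
    return False


def export_feature(lines, feature_type, key, value):
    """Return a FASTA record for the matching feature, or None if none match."""
    features, sequences = read_gff3(lines)
    matches = [f for f in features
               if f[2] == feature_type and has_attribute(f[8], key, value)]
    if not matches:
        print(f"Warning: no {feature_type} with {key}={value} found", file=sys.stderr)
        return None
    if len(matches) > 1:
        print(f"Warning: {len(matches)} features match; using the first", file=sys.stderr)
    first = matches[0]
    seqid, start, end, strand = first[0], int(first[3]), int(first[4]), first[6]
    if seqid not in sequences:
        print(f"Warning: no FASTA sequence for {seqid}", file=sys.stderr)
        return None
    bases = sequences[seqid][start - 1:end]  # 1-based inclusive coordinates
    if strand == "-":
        bases = bases.translate(COMPLEMENT)[::-1]  # assumed: reverse complement
    return f">{value}\n{bases}"


def main(argv):
    if len(argv) != 4:
        print("Usage: export_gff3_feature.py <gff3> <type> <key> <value>", file=sys.stderr)
        return 1
    path, feature_type, key, value = argv
    with open(path) as handle:
        record = export_feature(handle, feature_type, key, value)
    if record is None:
        return 1
    print(record)
    return 0


# Run only when arguments were supplied, so the functions can also be
# imported or pasted into an interactive session.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv[1:]))
```

Under these assumptions it would be invoked in the same way as the Perl script in the task:

$ python export_gff3_feature.py /path/to/some.gff3 gene ID YAR003W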






All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd