Write Perl code to process and analyze the sequence data

Reference no: EM131469714

1. Download and decompress the sequence data of chromosome 22.

2. Write Perl code to process and analyze the downloaded sequence data file.
 a. Read in the data from the file.
 b. Use a regular expression to extract the sequence from the file.
 c. Remove non-ATGC characters from the sequence.
 d. Extract all the open reading frames (ORFs) from the whole sequence.
     - An ORF is a part of a DNA sequence that has the potential to be translated.
     - Its length should be a multiple of 3.
     - ORFs are defined as those subsequences that begin with the start codon 'ATG' and end with any of the three stop codons 'TAA', 'TAG' and 'TGA'. Each codon consists of three nucleotides.
     - In addition to the start and stop codons, ORFs extracted here should have 30-90 nucleotides.
     - Only one stop codon is allowed in each ORF.
 e. Print a message on the screen showing how many open reading frames were found.
 f. Translate each ORF into an amino acid sequence using the subroutines provided.
 g. Write all found ORFs to a new data file.
 h. Write all translated amino acid sequences into another new data file.
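
One possible way to express the ORF rules in step 2.d as a single regular expression is sketched below. It is not the official solution, and it rests on assumptions the assignment does not spell out: that "30-90 nucleotides" means 10 to 30 in-frame codons between the start and stop codons, that "only one stop codon" refers to in-frame stop codons, and that overlapping ORFs should all be reported. The name find_orfs is hypothetical.

```perl
use strict;
use warnings;

# Assumptions (not stated explicitly in the assignment):
#  - "30-90 nucleotides" means 10 to 30 in-frame codons between ATG and the stop codon;
#  - "only one stop codon" means no in-frame stop codon before the final one;
#  - ORFs may overlap, so every ATG is tried as a possible start.
sub find_orfs {
    my ($seq) = @_;
    my $stop = qr/TAA|TAG|TGA/;
    my @orfs;
    # The lookahead (?=...) consumes no characters, so the search resumes one
    # position further along and overlapping ORFs are not skipped.
    while ($seq =~ /(?=(ATG(?:(?!$stop)[ACGT]{3}){10,30}$stop))/g) {
        push @orfs, $1;
    }
    return @orfs;
}

# Tiny hand-made sequence (not real chromosome data): ATG + 10 codons + TAA.
my $demo  = 'CC' . 'ATG' . ('GCT' x 10) . 'TAA' . 'GG';
my @found = find_orfs($demo);
printf "%d open reading frame(s) found\n", scalar @found;
```

Because each interior codon is checked with the negative lookahead (?!$stop), a candidate that hits an in-frame stop codon too early is rejected instead of being extended to a later stop codon.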

Important notes:
     Steps 2.a - 2.i should be done in one .pl file.
     Please use the subroutines provided to perform the translation.
     When I run your program, I expect to type the data file name on the screen. An example is shown below.

[Figure 2073_Figure.jpg: example of entering the data file name on the screen]
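
Steps 2.a to 2.c, together with the file-name prompt, might look roughly like the sketch below. The subroutine names (ask_file_name, slurp, extract_sequence) are hypothetical, the prompt wording is a guess, and the code assumes the file is in FASTA format with header lines starting with '>'. It also uppercases the text before filtering, which is an assumption about how lowercase bases should be treated.

```perl
use strict;
use warnings;

# Ask for the file name on the screen. Reads from STDIN unless another
# handle is passed (the optional argument only exists to ease testing).
sub ask_file_name {
    my ($in) = @_;
    $in //= \*STDIN;
    print "Enter the data file name: ";
    my $name = <$in>;
    die "No file name given\n" unless defined $name;
    chomp $name;
    return $name;
}

# Read the whole file into one string (step 2.a).
sub slurp {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!\n";
    local $/;                  # undefined record separator: read everything at once
    my $text = <$fh>;
    close $fh;
    return $text;
}

# Steps 2.b and 2.c. Assumption: FASTA input, header lines start with '>'.
sub extract_sequence {
    my ($text) = @_;
    $text =~ s/^>.*$//mg;      # drop header lines, keep sequence lines
    my $seq = uc $text;        # assumption: lowercase bases count as bases
    $seq =~ s/[^ATGC]//g;      # remove newlines, N and any other character
    return $seq;
}
```

A script could then chain the three calls as extract_sequence(slurp(ask_file_name())). If the file holds several FASTA records, this sketch simply concatenates them.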

Your output files in steps 2.h and 2.i should be created automatically. Below is an example of the 2.h output file.

[Figure 1417_Figure1.jpg: example of the 2.h output file]

Support information:

1. To write an array into a file, with each entry on its own line, use the following command:

print MFILE "$_\n" for @ORFs;

MFILE is the handle for the output file.
@ORFs is the array.
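
The print statement above only works once MFILE has been opened for writing. A minimal sketch of the surrounding open and close calls follows. The file name orfs_demo.txt and the two array entries are placeholders for illustration, not real results.

```perl
use strict;
use warnings;

my @ORFs     = ('ATGAAATAA', 'ATGCCCTAG');   # placeholder entries, not real ORFs
my $out_name = 'orfs_demo.txt';              # hypothetical output file name

# '>' creates the file, or truncates it if it already exists.
open(MFILE, '>', $out_name) or die "Cannot create $out_name: $!\n";
print MFILE "$_\n" for @ORFs;
close(MFILE) or die "Cannot close $out_name: $!\n";
```

The same pattern with a second handle and file name would cover the amino acid output file.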

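2. If you have groups in your pattern:
my $string = "TTTATGTGCTGCTAAAAA";
my @matches = $string =~ m/(ATG).*(TAA)/g;

=> With parentheses () surrounding the subpatterns "ATG" and "TAA", a match in list context returns the substrings captured by the groups, not the whole match.
=> The values in @matches will be ("ATG", "TAA").
=> Add ?: at the front of each group, as in m/(?:ATG).*(?:TAA)/g, and the substrings matching "ATG" and "TAA" will not be returned separately. With no capturing groups left, the whole match is returned instead.
=> In this case the output will be ("ATGTGCTGCTAA").
=> Note that the pattern is not anchored with ^ and $: the sample string starts with "TTT" and ends with "AAA", so an anchored pattern would not match at all.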
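
A quick way to check what a match returns in list context is to print the result. The sketch below uses the sample string without the ^ and $ anchors, since that string begins with TTT rather than ATG, and adds a third variant with an outer capturing group for cases where both the whole match and its parts are wanted. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $string = "TTTATGTGCTGCTAAAAA";

# Capturing groups: list context returns only the captured substrings.
my @captures = $string =~ m/(ATG).*(TAA)/g;

# Non-capturing groups: with no captures, the whole match is returned.
my @whole = $string =~ m/(?:ATG).*(?:TAA)/g;

# An outer group captures the whole match in addition to the inner parts.
my @both = $string =~ m/((ATG).*(TAA))/g;

print "captures: @captures\n";
print "whole:    @whole\n";
print "both:     @both\n";
```

For ORF extraction the non-capturing form is usually the convenient one, because the array then holds complete subsequences.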

Download Sequence data of chromosome 22

https://www.dropbox.com/s/6v5whj22boa3kur/hs_ref_GRCh38.p7_chr22.fa.gz?dl=0
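
Once the .gz file has been downloaded, it can be decompressed from Perl itself with the core module IO::Uncompress::Gunzip. The sketch below assumes the archive is already on disk. The subroutine name decompress and the command-line usage are hypothetical conveniences, not part of the assignment.

```perl
use strict;
use warnings;
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Decompress $gz_path into $out_path and return the output path.
sub decompress {
    my ($gz_path, $out_path) = @_;
    gunzip($gz_path => $out_path)
        or die "gunzip of $gz_path failed: $GunzipError\n";
    return $out_path;
}

# Hypothetical usage: perl decompress.pl input.fa.gz output.fa
decompress($ARGV[0], $ARGV[1]) if @ARGV == 2;
```

A command-line tool such as gunzip does the same job. The Perl version is shown only to keep everything in one language.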


All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd