Doing analysis on bioinformatics data

Reference no: EM132557944


I need to convert a BED file to GTF format.
It must then be converted to a GRanges object before going ahead with the analysis.

I need to run community network analysis using Infomap.

The GRanges object is supposed to look like this:

GRanges object with 2280612 ranges and 14 metadata columns:

seqnames ranges strand | source type score phase exon_id
<Rle> <IRanges> <Rle> | <factor> <factor> <numeric> <integer> <character>
[1] 1 11869-12227 + | processed_transcript exon <NA> <NA> ENSE00002234944
[2] 1 11872-12227 + | unprocessed_pseudogene exon <NA> <NA> ENSE00002234632
[3] 1 11874-12227 + | unprocessed_pseudogene exon <NA> <NA> ENSE00002269724
[4] 1 12010-12057 + | transcribed_unprocessed_pseudogene exon <NA> <NA> ENSE00001948541
[5] 1 12179-12227 + | transcribed_unprocessed_pseudogene exon <NA> <NA> ENSE00001671638
... ... ... ... . ... ... ... ... ...
[2280608] Y 28774418-28774584 - | unprocessed_pseudogene exon <NA> <NA> ENSE00001741452
[2280609] Y 28776794-28776896 - | unprocessed_pseudogene exon <NA> <NA> ENSE00001681574
[2280610] Y 28779492-28779578 - | unprocessed_pseudogene exon <NA> <NA> ENSE00001638296
[2280611] Y 28780670-28780799 - | unprocessed_pseudogene exon <NA> <NA> ENSE00001797328
[2280612] Y 59001391-59001635 + | processed_pseudogene exon <NA> <NA> ENSE00001794473
exon_number gene_biotype gene_id gene_name transcript_id transcript_name tss_id p_id protein_id
<character> <character> <character> <character> <character> <character> <character> <character> <character>
[1] 1 pseudogene ENSG00000223972 DDX11L1 ENST00000456328 DDX11L1-002 TSS15000 <NA> <NA>
[2] 1 pseudogene ENSG00000223972 DDX11L1 ENST00000515242 DDX11L1-201 TSS190873 <NA> <NA>
[3] 1 pseudogene ENSG00000223972 DDX11L1 ENST00000518655 DDX11L1-202 TSS190874 <NA> <NA>
[4] 1 pseudogene ENSG00000223972 DDX11L1 ENST00000450305 DDX11L1-001 TSS137040 <NA> <NA>
[5] 2 pseudogene ENSG00000223972 DDX11L1 ENST00000450305 DDX11L1-001 TSS137040 <NA> <NA>
... ... ... ... ... ... ... ... ... ...
[2280608] 4 pseudogene ENSG00000237917 PARP4P1 ENST00000435945 PARP4P1-001 TSS46893 <NA> <NA>
[2280609] 3 pseudogene ENSG00000237917 PARP4P1 ENST00000435945 PARP4P1-001 TSS46893 <NA> <NA>
[2280610] 2 pseudogene ENSG00000237917 PARP4P1 ENST00000435945 PARP4P1-001 TSS46893 <NA> <NA>
[2280611] 1 pseudogene ENSG00000237917 PARP4P1 ENST00000435945 PARP4P1-001 TSS46893 <NA> <NA>
[2280612] 1 pseudogene ENSG00000235857 CTBP2P1 ENST00000431853 CTBP2P1-001 TSS99299 <NA> <NA>
This was produced using
library(rtracklayer)
granges <- import(input_file)
where input_file is the path to a GTF file.

You need to convert my file into a GTF file.

Then, when you run granges <- import(input_file), the output of the GRanges object should look like the above.

If this is possible, I will then need help running multi-level community detection with the Infomap and Louvain methods, and you will provide me with the code.

I need both the code used to run the analysis and the analysis itself.

The issue I am having is converting my BED file to a GTF file that gives a GRanges object like the above when imported with library(rtracklayer) and granges <- import('path to gtf file').
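One likely source of mismatch in a BED-to-GTF conversion is the coordinate convention: BED starts are 0-based and half-open, while GTF coordinates are 1-based and inclusive, and GTF keeps its metadata in a ninth column of `key "value";` pairs. The sketch below (Python, standard library only) illustrates that conversion. It assumes BED columns 4 to 6 (name, score, strand) are optional; the `source` and `feature` defaults and the reuse of the BED name as `gene_id` and `transcript_id` are placeholders, not values taken from the attached file. It cannot create Ensembl attributes such as exon_id or gene_biotype that are absent from the input.

```python
def bed_line_to_gtf(line, source="bed2gtf", feature="exon"):
    """Convert one tab-separated BED record to one GTF record.

    `source` and `feature` are hypothetical defaults for illustration.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise ValueError("BED needs at least chrom, start, end")
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    # Optional BED columns: name, score, strand.
    name = fields[3] if len(fields) > 3 else f"{chrom}:{start}-{end}"
    score = fields[4] if len(fields) > 4 else "."
    strand = fields[5] if len(fields) > 5 and fields[5] in ("+", "-") else "."
    # Assumption: the BED name is reused for both mandatory GTF attributes.
    attributes = f'gene_id "{name}"; transcript_id "{name}";'
    # BED start is 0-based, GTF start is 1-based; the end value is unchanged.
    gtf = [chrom, source, feature, str(start + 1), str(end),
           score, strand, ".", attributes]
    return "\t".join(gtf)


def bed_to_gtf(bed_lines):
    """Convert an iterable of BED lines, skipping blanks and header lines."""
    records = []
    for line in bed_lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        records.append(bed_line_to_gtf(line))
    return records


if __name__ == "__main__":
    example = ["1\t11868\t12227\tDDX11L1\t0\t+"]
    for record in bed_to_gtf(example):
        print(record)
```

With this shift, a BED interval starting at 11868 and ending at 12227 becomes the GTF range 11869-12227, which is the form of the first range in the printout above. Any extra metadata columns would have to be written into the ninth column as additional `key "value";` pairs for the import to expose them.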

1) Perform multi-level community detection.

2) Make sure that the networks are scale-free to get informative sub-networks.

3) Generate at least 100 random networks from the data to see the difference between the real networks and the random ones. Here, you need to compare the assortativity of the original networks with that of the random ones.
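Point 3 can be illustrated with a small, hedged sketch in Python (standard library only). It assumes an undirected simple graph given as a list of node pairs, builds each random network by degree-preserving edge swaps (one common null model, not necessarily the one intended by the assignment), and computes degree assortativity as the Pearson correlation between the degrees at the two ends of each edge. The toy edge list, the number of swaps, and the seed are illustrative choices.

```python
import random


def degrees(edges):
    """Return a dict mapping each node to its degree."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


def degree_assortativity(edges):
    """Pearson correlation of end-point degrees over all edges."""
    deg = degrees(edges)
    xs, ys = [], []
    for u, v in edges:
        # Count each undirected edge in both directions.
        xs += [deg[u], deg[v]]
        ys += [deg[v], deg[u]]
    n = len(xs)
    mean = sum(xs) / n  # xs and ys share the same mean by symmetry
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return cov / var if var > 0 else float("nan")


def rewire(edges, n_swaps, rng):
    """Randomise a simple graph by swaps that keep every node's degree."""
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    done, attempts = 0, 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if rng.random() < 0.5:
            c, d = d, c
        # Proposed replacement edges: a-d and c-b.
        if a == d or c == b:
            continue  # would create a self-loop
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in present or new2 in present or new1 == new2:
            continue  # would create a duplicate edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {new1, new2}
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy network for illustration only; replace with the real edge list.
    real = [(1, 2), (1, 3), (1, 4), (2, 3), (4, 5), (5, 6), (6, 7), (5, 7)]
    r_real = degree_assortativity(real)
    null = [degree_assortativity(rewire(real, 10 * len(real), rng))
            for _ in range(100)]
    print("real assortativity:", r_real)
    print("mean of 100 random networks:", sum(null) / len(null))
```

Because the swaps keep every node's degree fixed, any gap between the real value and the distribution from the 100 random networks reflects how the edges are arranged rather than the degree sequence itself.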

These are the points to consider for the multi-level community detection with the Infomap and Louvain methods.

I just want to be convinced that the BED-to-GTF conversion will work and that its GRanges object will look like the one above.

For the BED-to-GTF conversion, I have prepared the file already; all it needs is a line of code, nothing much. When I did my own conversion and ran the GRanges line of code, I could not get all the parameters to look like what I posted above.

Have you seen the file for the multi-level community detection through the Infomap and Louvain methods?

For the second task, the expert will need to follow these instructions:

1) Perform multi-level community detection.

2) Make sure that the networks are scale-free to get informative sub-networks.

3) Generate at least 100 random networks from the data to see the difference between the real networks and the random ones. Here, you need to compare the assortativity of the original networks with that of the random ones.
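For instruction 2, one rough way to judge whether a network looks scale-free is to estimate the exponent of a power law fitted to the tail of its degree distribution. The sketch below (Python, standard library only) uses the standard approximate maximum-likelihood estimator for discrete data, alpha = 1 + n / sum(ln(k_i / (k_min - 0.5))). The cut-off `k_min` and the sample degrees are assumptions for illustration, and the estimate alone is not a goodness-of-fit test.

```python
import math


def powerlaw_alpha(degree_values, k_min=1):
    """Approximate MLE of the power-law exponent for degrees >= k_min."""
    if k_min < 1:
        raise ValueError("k_min must be at least 1")
    tail = [k for k in degree_values if k >= k_min]
    if len(tail) < 2:
        raise ValueError("need at least two degrees in the tail")
    # Discrete approximation: shift the lower cut-off by one half.
    log_sum = sum(math.log(k / (k_min - 0.5)) for k in tail)
    return 1.0 + len(tail) / log_sum


def ccdf(degree_values):
    """Return (k, P(K >= k)) pairs, useful for a log-log plot."""
    n = len(degree_values)
    return [(k, sum(1 for d in degree_values if d >= k) / n)
            for k in sorted(set(degree_values))]


if __name__ == "__main__":
    # Hypothetical degree sequence; replace with the real network's degrees.
    sample = [1] * 60 + [2] * 20 + [3] * 10 + [5] * 5 + [8] * 3 + [20] * 2
    print("estimated alpha:", powerlaw_alpha(sample, k_min=1))
    for k, p in ccdf(sample):
        print(k, p)
```

Many networks described as scale-free have exponents reported between about 2 and 3, and a roughly straight tail of the cumulative distribution on log-log axes is the usual visual check. A proper assessment would also compare the fit against alternatives such as a log-normal distribution.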

Secondly, the next task is to do multi-level community detection with Infomap and Louvain using the attached file infomap.net; the BED file is realenhancer.bed.
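Before running any community detection, the network file has to be read into an edge list. The sketch below (Python, standard library only) assumes that infomap.net is in Pajek format, the usual meaning of the .net extension for Infomap input: a `*Vertices` section of `id "label"` lines followed by an `*Edges` or `*Arcs` section of `source target [weight]` lines. That is an assumption about the attachment, which has not been inspected here, and the sample content is made up.

```python
import shlex


def parse_pajek(lines):
    """Parse Pajek-style lines into (labels, edges).

    labels maps node id -> label; edges is a list of
    (source, target, weight) tuples with weight defaulting to 1.0.
    """
    labels, edges = {}, []
    section = None
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(("%", "#")):
            continue  # skip blanks and comment lines
        if line.startswith("*"):
            section = line.split()[0].lower()
            continue
        # shlex keeps quoted labels containing spaces together.
        parts = shlex.split(line)
        if section == "*vertices":
            node_id = int(parts[0])
            labels[node_id] = parts[1] if len(parts) > 1 else parts[0]
        elif section in ("*edges", "*arcs"):
            weight = float(parts[2]) if len(parts) > 2 else 1.0
            edges.append((int(parts[0]), int(parts[1]), weight))
    return labels, edges


if __name__ == "__main__":
    # Made-up content for illustration, not taken from infomap.net.
    sample = [
        "*Vertices 3",
        '1 "enhancer A"',
        '2 "enhancer B"',
        '3 "enhancer C"',
        "*Edges",
        "1 2 2.5",
        "2 3",
    ]
    labels, edges = parse_pajek(sample)
    print(labels)
    print(edges)
```

The resulting edge list can then be handed to whichever Infomap or Louvain implementation is used for the analysis. Labels containing unbalanced quote characters would need extra handling, since `shlex.split` raises an error on them.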

Attachment:- Bioinformatics data.rar
