Reference no: EM132557944
Analysis of bioinformatics data
The data need to be converted to GTF format
Convert the GTF file to a GRanges object before going ahead with the analysis
Need to run community network analysis using Infomap
The GRanges object is supposed to look like this
GRanges object with 2280612 ranges and 14 metadata columns:
seqnames ranges strand | source type score phase exon_id
<Rle> <IRanges> <Rle> | <factor> <factor> <numeric> <integer> <character>
[1] 1 11869-12227 + | processed_transcript exon <NA> <NA> ENSE00002234944
[2] 1 11872-12227 + | unprocessed_pseudogene exon <NA> <NA> ENSE00002234632
[3] 1 11874-12227 + | unprocessed_pseudogene exon <NA> <NA> ENSE00002269724
[4] 1 12010-12057 + | transcribed_unprocessed_pseudogene exon <NA> <NA> ENSE00001948541
[5] 1 12179-12227 + | transcribed_unprocessed_pseudogene exon <NA> <NA> ENSE00001671638
... ... ... ... . ... ... ... ... ...
[2280608] Y 28774418-28774584 - | unprocessed_pseudogene exon <NA> <NA> ENSE00001741452
[2280609] Y 28776794-28776896 - | unprocessed_pseudogene exon <NA> <NA> ENSE00001681574
[2280610] Y 28779492-28779578 - | unprocessed_pseudogene exon <NA> <NA> ENSE00001638296
[2280611] Y 28780670-28780799 - | unprocessed_pseudogene exon <NA> <NA> ENSE00001797328
[2280612] Y 59001391-59001635 + | processed_pseudogene exon <NA> <NA> ENSE00001794473
exon_number gene_biotype gene_id gene_name transcript_id transcript_name tss_id p_id protein_id
<character> <character> <character> <character> <character> <character> <character> <character> <character>
[1] 1 pseudogene ENSG00000223972 DDX11L1 ENST00000456328 DDX11L1-002 TSS15000 <NA> <NA>
[2] 1 pseudogene ENSG00000223972 DDX11L1 ENST00000515242 DDX11L1-201 TSS190873 <NA> <NA>
[3] 1 pseudogene ENSG00000223972 DDX11L1 ENST00000518655 DDX11L1-202 TSS190874 <NA> <NA>
[4] 1 pseudogene ENSG00000223972 DDX11L1 ENST00000450305 DDX11L1-001 TSS137040 <NA> <NA>
[5] 2 pseudogene ENSG00000223972 DDX11L1 ENST00000450305 DDX11L1-001 TSS137040 <NA> <NA>
... ... ... ... ... ... ... ... ... ...
[2280608] 4 pseudogene ENSG00000237917 PARP4P1 ENST00000435945 PARP4P1-001 TSS46893 <NA> <NA>
[2280609] 3 pseudogene ENSG00000237917 PARP4P1 ENST00000435945 PARP4P1-001 TSS46893 <NA> <NA>
[2280610] 2 pseudogene ENSG00000237917 PARP4P1 ENST00000435945 PARP4P1-001 TSS46893 <NA> <NA>
[2280611] 1 pseudogene ENSG00000237917 PARP4P1 ENST00000435945 PARP4P1-001 TSS46893 <NA> <NA>
[2280612] 1 pseudogene ENSG00000235857 CTBP2P1 ENST00000431853 CTBP2P1-001 TSS99299 <NA> <NA>
using library(rtracklayer)
granges <- import("path to gtf file")
The input file is a GTF file
You need to convert the BED file into a GTF file
then, when you run granges <- import("path to gtf file"),
the output of the GRanges object should look like the above
If this is possible, I will then need help running multi-level community detection with the Infomap and Louvain methods, and you will provide me with the code
Both the code used to run the analysis and the analysis itself are required
The issue I am having is converting my BED file to a GTF file that gives a GRanges object like the one above when imported:
library(rtracklayer)
granges <- import("path to gtf file")
1) Perform multi-level community detection.
2) Make sure that the networks are scale-free, to get informative sub-networks.
3) Generate at least 100 random networks from the data to see the difference between the real networks and the random ones. Here, you need to compare the assortativity of the original networks with that of the random ones.
These are the points to consider for the multi-level community detection through the Infomap and Louvain methods
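Point 3 can be sketched as follows. This is a minimal illustration, not the required method: it assumes the network is a simple undirected graph given as a list of node pairs, and it assumes that "random networks" means degree-preserving rewiring (double edge swaps), which is only one possible null model. The function names `degree_assortativity`, `rewired_copy` and `compare_to_random` are hypothetical, and the toy edge list is made up for the demonstration.

```python
import random

def degree_assortativity(edges):
    """Pearson correlation of the degrees at the two ends of each edge."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    # Count every undirected edge in both directions so the measure is symmetric.
    xs = [deg[a] for u, v in edges for a in (u, v)]
    ys = [deg[b] for u, v in edges for b in (v, u)]
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs)
    if var == 0:
        return float("nan")  # undefined when every edge end has the same degree
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys))
    return cov / var

def rewired_copy(edges, n_swaps, rng):
    """Return a rewired copy of the edge list that keeps every node's degree.

    Assumption: the input is a simple graph (no self-loops, no duplicate edges).
    """
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    done, attempts = 0, 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        a, b = edges[i]
        c, d = edges[j]
        if rng.random() < 0.5:
            c, d = d, c  # try both orientations of the second edge
        if len({a, b, c, d}) < 4:
            continue  # the swap would create a self-loop
        if frozenset((a, d)) in present or frozenset((c, b)) in present:
            continue  # the swap would create a duplicate edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {frozenset((a, d)), frozenset((c, b))}
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges

def compare_to_random(edges, n_random=100, seed=0):
    """Observed assortativity plus the values from n_random rewired networks."""
    rng = random.Random(seed)
    observed = degree_assortativity(edges)
    null = [degree_assortativity(rewired_copy(edges, 10 * len(edges), rng))
            for _ in range(n_random)]
    return observed, null

# Made-up toy network, only to show the call pattern.
toy_edges = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (3, 4),
             (4, 5), (5, 6), (6, 7), (7, 8), (8, 9), (9, 5)]
observed, null_values = compare_to_random(toy_edges)
```

The observed value can then be placed against the distribution of `null_values`, for example by reporting how many of the 100 random networks are more assortative than the real one.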
I just want to be convinced that the BED-to-GTF conversion will work and that its GRanges object will look like the one above
For the BED-to-GTF conversion, I have prepared the file already; all it needs is a line of code, nothing much. When I did my own conversion and ran the GRanges line of code, I could not get all the parameters to look like what I posted above
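One likely reason the imported object does not match: when a GTF file is imported, each key in its ninth (attributes) field becomes a metadata column. Columns such as gene_id or transcript_id therefore appear only if the conversion writes them. A plain BED file carries at most a name, score and strand per region, so columns like gene_biotype or exon_id cannot be recovered from it alone. BED starts are also 0-based while GTF starts are 1-based. The sketch below shows one way to do the conversion under those constraints. The helper name `bed_to_gtf_lines`, the source label `bed2gtf`, the feature type `exon` and the reuse of the BED name as both gene_id and transcript_id are all assumptions made for illustration.

```python
def bed_to_gtf_lines(bed_text, source="bed2gtf", feature="exon"):
    """Convert BED3-BED6 text into a list of GTF lines (hypothetical helper)."""
    gtf = []
    for n, line in enumerate(bed_text.splitlines(), start=1):
        # Skip blank lines, comments and UCSC track/browser header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        f = line.split()
        chrom, start, end = f[0], int(f[1]), int(f[2])
        name = f[3] if len(f) > 3 else f"region_{n}"
        score = f[4] if len(f) > 4 else "."
        strand = f[5] if len(f) > 5 and f[5] in ("+", "-") else "."
        # Assumption: the BED name is reused for both identifiers.
        attrs = f'gene_id "{name}"; transcript_id "{name}";'
        gtf.append("\t".join([
            chrom, source, feature,
            str(start + 1),  # BED start is 0-based, GTF start is 1-based
            str(end),        # BED end is exclusive, GTF end is inclusive
            score, strand, ".", attrs,
        ]))
    return gtf

# Made-up example line, only to show the call pattern.
example_bed = "chr1\t11868\t12227\tenh1\t0\t+\n"
example_gtf = bed_to_gtf_lines(example_bed)
```

Writing the returned lines to a file, one per line, gives a GTF that can be passed to the import call above. Any further attribute, such as gene_name, has to be added to the `attrs` string from another annotation source.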
Have you seen the file for the multi-level community detection through the Infomap and Louvain methods?
For the second task, the expert will need to follow these instructions:
1) Perform multi-level community detection.
2) Make sure that the networks are scale-free, to get informative sub-networks.
3) Generate at least 100 random networks from the data to see the difference between the real networks and the random ones. Here, you need to compare the assortativity of the original networks with that of the random ones.
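For instruction 2, a first look at whether a network is plausibly scale-free is to tabulate its degree distribution and estimate a power-law exponent for the tail. The sketch below uses the standard approximate maximum-likelihood estimate for discrete data, alpha = 1 + n / sum(ln(k_i / (k_min - 0.5))). It is only a starting point: the cut-off `kmin` is an assumption the analyst has to choose, and a proper assessment would add a goodness-of-fit test. The function names are hypothetical.

```python
import math
from collections import Counter

def degree_sequence(edges):
    """Degrees of all nodes that appear in an undirected edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return list(deg.values())

def degree_distribution(degrees):
    """Map each degree k to the fraction of nodes having that degree."""
    counts = Counter(degrees)
    n = len(degrees)
    return {k: c / n for k, c in sorted(counts.items())}

def powerlaw_alpha(degrees, kmin=1):
    """Approximate MLE of the power-law exponent for degrees >= kmin."""
    tail = [k for k in degrees if k >= kmin]
    if not tail:
        raise ValueError("no degrees at or above kmin")
    log_sum = sum(math.log(k / (kmin - 0.5)) for k in tail)
    return 1.0 + len(tail) / log_sum
```

Plotting the output of `degree_distribution` on log-log axes gives a quick visual check; a roughly straight line over the tail is consistent with, but does not prove, a scale-free network.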
Secondly, the next task is multi-level community detection through Infomap and Louvain using the attached file infomap.net; the BED file is realenhancer.bed
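Before any community detection, the network file has to be read into an edge list. The sketch below assumes that infomap.net is in Pajek format, meaning a `*Vertices` section followed by `*Edges` or `*Arcs` lines of the form "source target [weight]". This has not been checked against the attached file. The parser name `parse_pajek` and the sample text are hypothetical.

```python
import re

# Vertex line: an integer id, optionally followed by a quoted or bare label.
_VERTEX = re.compile(r'^(\d+)(?:\s+(?:"([^"]*)"|(\S+)))?')

def parse_pajek(text):
    """Parse Pajek-style text into ({id: label}, [(source, target, weight)])."""
    labels, edges = {}, []
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("%", "#")):
            continue  # skip blank lines and comments
        if line.startswith("*"):
            section = line.split()[0].lower()  # e.g. "*vertices", "*edges"
            continue
        if section == "*vertices":
            m = _VERTEX.match(line)
            if m:
                labels[int(m.group(1))] = m.group(2) or m.group(3) or m.group(1)
        elif section in ("*edges", "*arcs"):
            parts = line.split()
            weight = float(parts[2]) if len(parts) > 2 else 1.0  # default weight
            edges.append((int(parts[0]), int(parts[1]), weight))
    return labels, edges

# Made-up sample, only to show the expected layout.
sample_net = '*Vertices 3\n1 "enh A"\n2 "enh B"\n3 enhC\n*Edges\n1 2 0.5\n2 3\n'
sample_labels, sample_edges = parse_pajek(sample_net)
```

The resulting edge list can then be handed to whichever Infomap or Louvain implementation is used for the analysis.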
Attachment:- Bioinformatics data.rar