Title
A genome analysis based on repeat sharing gene networks
Abstract
Motivated by an interest in understanding how information is organized within genomes and how genes communicate with each other in the transcription process, we propose a novel network-based methodology for genomic sequence analysis, applied to three organisms: Nanoarchaeum equitans, Escherichia coli, and Saccharomyces cerevisiae. A previously introduced dictionary-based approach is extended here through a repeat analysis of genic and intergenic regions. The key results come from a biological and computational analysis of novel parametrized gene networks, defined by means of motifs of fixed length occurring inside multiple genes. Cliques emerge as groups of genes sharing a long repeat with a clear biological interpretation, while a (complete, paralog) cluster analysis has revealed some unexpected regularity. Repeat sharing gene networks may be applied in comparative genomics as a methodology for investigating the evolutionary and functional properties of genes.
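The network definition in the abstract (genes linked when a motif of fixed length occurs inside more than one of them) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `repeat_sharing_network`, the toy sequences, and the choice to ignore strand orientation and intergenic regions are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def repeat_sharing_network(genes, k):
    """Return {frozenset({gene1, gene2}): set of k-mers shared by both genes}.

    Hypothetical helper: `genes` maps a gene name to its sequence string and
    `k` is the fixed motif length (the network parameter).
    """
    # Map every k-mer to the set of genes in which it occurs.
    kmer_to_genes = defaultdict(set)
    for name, seq in genes.items():
        for i in range(len(seq) - k + 1):
            kmer_to_genes[seq[i:i + k]].add(name)

    # Connect every pair of genes that contain the same k-mer.
    edges = defaultdict(set)
    for kmer, names in kmer_to_genes.items():
        for a, b in combinations(sorted(names), 2):
            edges[frozenset((a, b))].add(kmer)
    return dict(edges)


# Toy sequences, invented for this example only.
toy_genes = {
    "geneA": "ATGCGTACGTTAG",
    "geneB": "CCGTACGTTAGGA",
    "geneC": "TTTTTTTTTTTTT",
}
network = repeat_sharing_network(toy_genes, k=8)
for pair, kmers in network.items():
    print(sorted(pair), sorted(kmers))
```

In this sketch, all genes containing a given k-mer are pairwise connected, so each shared repeat induces a clique by construction. Increasing `k` can only remove edges, which is one way to read "parametrized" in the abstract.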
Year
2015
DOI
10.1007/s11047-014-9437-6
Venue
Natural Computing: an international journal
Keywords
Genome analysis,Computational genomics,Comparative genomics,Infogenomics,Dictionaries,Word frequency,Repeat,k-mer,Gene networks,Repeat-sharing gene networks,Paralog analysis,Cluster information theory,Text mining,Metagenomics,Nanoarchaeum equitans,Escherichia coli,Saccharomyces cerevisiae
Field
Genome,Gene,Comparative genomics,Artificial intelligence,Computational biology,Nanoarchaeum equitans,Computational genomics,Gene regulatory network,Genetics,k-mer,Mathematics,Machine learning,Sequence analysis
DocType
Journal
Volume
14
Issue
3
ISSN
1567-7818
Citations
1
PageRank
0.38
References
13
Authors
3
Name                Order  Citations  PageRank
Alberto Castellini  1      60         14.16
Giuditta Franco     2      136        18.34
Alessio Milanese    3      1          0.38