Whole-Genome Phylogeny by Virtue of Unic Subwords

Paper Info

Title
Whole-Genome Phylogeny by Virtue of Unic Subwords

Abstract
With the progress of modern sequencing technologies, a number of complete genomes are now available. Traditional motif discovery tools cannot handle this massive amount of data; therefore, the comparison of complete genomes can be carried out only with ad hoc methods. In this work we propose a distance function based on subword compositions, which extends the Average Common Subword (ACS) approach of Ulitsky et al. ACS is closely related to the cross entropy estimated between two entire genome sequences, and thus to some set of "independent" subwords, namely the irredundant common subwords. We then filter the irredundant common subwords by means of underlying-paired motifs, which relate regions of the two genome sequences to each other. This set of motifs is, by construction, linear in the size of the input and without overlap; we call the selected motifs underlying-paired irredundant common subwords, or simply unic subwords. Preliminary results show the validity of our method and suggest novel computational approaches for analyzing the evolution of genomes.
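
To make the ACS baseline mentioned in the abstract concrete, the following is a minimal sketch, assuming the definition commonly attributed to Ulitsky et al.: for each position of one sequence, take the length of the longest subword starting there that also occurs in the other sequence, average these lengths, and turn the average into a length-normalized, symmetrized distance. The function names (`longest_match_lengths`, `acs_directed`, `acs_distance`) are illustrative and do not come from the paper, the naive substring search stands in for the suffix-tree machinery a real implementation would use, and the unic-subword filtering proposed in this work is not implemented here.

```python
import math


def longest_match_lengths(a: str, b: str) -> list[int]:
    """For each position i of a, return the length of the longest
    prefix of a[i:] that occurs somewhere in b (naive search)."""
    lengths = []
    for i in range(len(a)):
        k = 0
        # Extend the candidate subword while it still occurs in b.
        while i + k < len(a) and a[i:i + k + 1] in b:
            k += 1
        lengths.append(k)
    return lengths


def acs_directed(a: str, b: str) -> float:
    """Directed ACS-style dissimilarity from a to b (assumed form:
    log|b| / L(a, b) minus the correction term 2 log|a| / |a|)."""
    avg = sum(longest_match_lengths(a, b)) / len(a)
    if avg == 0:
        # No shared subword at all: treat the sequences as maximally distant.
        return math.inf
    return math.log(len(b)) / avg - 2 * math.log(len(a)) / len(a)


def acs_distance(a: str, b: str) -> float:
    """Symmetrized ACS-style distance: mean of the two directed values."""
    return 0.5 * (acs_directed(a, b) + acs_directed(b, a))


if __name__ == "__main__":
    similar = acs_distance("ACGTACGTAC", "ACGTACGTAA")
    dissimilar = acs_distance("ACGTACGTAC", "TTTTGGGGCC")
    print(similar < dissimilar)
```

Longer shared subwords raise the average match length and so lower the distance, which is why the two nearly identical toy sequences end up closer than the unrelated pair. Because every position contributes its own match length, overlapping occurrences of the same shared region are counted repeatedly; the abstract's filtering step is aimed at selecting a non-overlapping set of subwords instead.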

Year	2012
DOI	10.1109/DEXA.2012.10
Venue	DEXA Workshops
Keywords	genome sequence, underlying-paired motif, entire genome sequence, unic subwords, underlying-paired irredundant common subwords, whole-genome phylogeny, complete genomes, cross entropy, massive amount, average common subword approach, redundant common subwords, bioinformatics, genetics, text analysis, sequences, vegetation, phylogeny, genomics
Field	Genome, Cross entropy, Data mining, Computer science, Metric (mathematics), Genomics, Computational biology, Bioinformatics, Phylogenetics
DocType	Conference
ISSN	1529-4188
Citations	9
PageRank	0.48
References	8
Authors	2

Authors (2 rows)

Name	Order	Citations	PageRank
Matteo Comin	1	191	20.94
Davide Verzotto	2	63	3.96
