Title
Classification of Tandem Repeats in the Human Genome
Abstract
Tandem repeats in DNA sequences are highly relevant to biological phenomena and to diagnostic tools. Computational programs that discover tandem repeats generate a huge volume of data, which is often difficult to interpret without further organization. In this paper, the authors describe a new method for post-processing tandem repeats through clustering and classification. Their work presents multiple ways of representing tandem repeats using the n-gram model, combined with different clustering distance measures. Analysis of the clusters for tandem repeats in the human genome shows that the method yields well-defined groupings in which similarity among repeats is apparent. The authors' new, alignment-free method facilitates analysis of the myriad tandem repeats that occur in the human genome, and they believe this work will lead to new discoveries about the roles, origins, and significance of tandem repeats.
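The abstract does not specify the exact n-gram representations or distance measures used, so the following is only a minimal sketch of the general idea of alignment-free grouping, not the authors' method. It assumes overlapping 3-gram counts as the representation, cosine distance as the measure, and a simple greedy threshold clustering; the function names, the value of `n`, and the `threshold` are all hypothetical choices made for illustration.

```python
from collections import Counter
from math import sqrt


def ngram_profile(seq, n=3):
    """Count overlapping n-grams in a DNA string (assumed representation)."""
    seq = seq.upper()
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def cosine_distance(p, q):
    """Return 1 - cosine similarity of two n-gram count profiles."""
    dot = sum(p[g] * q[g] for g in p.keys() & q.keys())
    norm_p = sqrt(sum(v * v for v in p.values()))
    norm_q = sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 1.0  # treat an empty profile as maximally distant
    return 1.0 - dot / (norm_p * norm_q)


def cluster(seqs, n=3, threshold=0.5):
    """Greedy one-pass clustering (illustrative, not the paper's algorithm).

    Each sequence joins the first cluster whose representative profile
    (the profile of the cluster's first member) lies within `threshold`;
    otherwise it starts a new cluster.
    """
    clusters = []  # list of (representative profile, member list)
    for s in seqs:
        profile = ngram_profile(s, n)
        for representative, members in clusters:
            if cosine_distance(profile, representative) <= threshold:
                members.append(s)
                break
        else:
            clusters.append((profile, [s]))
    return [members for _, members in clusters]


repeats = ["ACACACACAC", "CACACACACA", "GGTGGTGGTGGT", "GTGGTGGTGGTG"]
groups = cluster(repeats)
```

Because the profiles ignore position, rotated copies of the same repeat unit (for example `ACACACACAC` and `CACACACACA`) end up with nearly identical n-gram counts and should fall into the same group without any alignment step.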
Year
2012
DOI
10.4018/jkdb.2012070101
Venue
IJKDB
Keywords
computational program, dna sequence, biological phenomenon, alignment-free method, human genome, different clustering distance measure, tandem repeats, new method, method yield, tandem repeat, new discovery, n grams, classification, clustering
Field
Tandem repeat, Hybrid genome assembly, DECIPHER, Computer science, Direct repeat, Variable number tandem repeat, DNA sequencing, Bioinformatics, Human genome, Cluster analysis
DocType
Journal
Volume
3
Issue
3
Citations
0
PageRank
0.34
References
10
Authors
4
Name                Order  Citations  PageRank
Yupu Liang          1      11         1.43
Dina Sokol          2      144        12.84
Sarah Zelikovitz    3      181        16.42
Sarah Ita Levitan   4      0          0.34