Title
Metagenomic reads binning with spaced seeds.
Abstract
The growing number of sequencing projects in medicine and environmental sciences is creating new computational demands for analyzing and processing the very large datasets they produce. We recently proposed MetaProb, an algorithm that clusters metagenomic reads with a precision that is currently unmatched. MetaProb's competitive advantage depends on sequence signatures based on contiguous k-mers. In this work we instead explore spaced seeds, in which mismatches are allowed at carefully predetermined positions. The experimental results show that allowing mismatches can further improve accuracy and decrease memory requirements.
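To illustrate the difference between contiguous k-mers and spaced seeds, the following is a minimal sketch and not MetaProb's implementation. The function names `spaced_kmers` and `count_spaced_kmers` and the mask `1101` are assumptions chosen for illustration; the abstract does not specify which seed patterns the paper uses. A `1` in the mask marks a position that must match, and a `0` marks a position where a mismatch is allowed.

```python
from collections import Counter


def spaced_kmers(read, mask):
    """Yield the spaced k-mers of `read` under a binary `mask`.

    '1' keeps the base at that offset; '0' is a don't-care
    position whose base is skipped, so a mismatch there is ignored.
    """
    if not mask or set(mask) - {"0", "1"}:
        raise ValueError("mask must be a non-empty string of 0s and 1s")
    keep = [i for i, bit in enumerate(mask) if bit == "1"]
    span = len(mask)
    # Slide a window of len(mask) over the read, one base at a time.
    for start in range(len(read) - span + 1):
        yield "".join(read[start + i] for i in keep)


def count_spaced_kmers(read, mask):
    """Count spaced k-mer occurrences, a simple per-read signature."""
    return Counter(spaced_kmers(read, mask))


if __name__ == "__main__":
    # An all-ones mask reduces to ordinary contiguous k-mers.
    print(list(spaced_kmers("ACGT", "111")))
    # These reads differ only at the don't-care position of 1101.
    print(count_spaced_kmers("ACGT", "1101") == count_spaced_kmers("ACTT", "1101"))
```

Under the mask `1101`, the reads `ACGT` and `ACTT` both map to the seed `ACT`, so a single mismatch at the third position does not change the signature. A contiguous 4-mer would treat the two reads as different.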
Year
2017
DOI
10.1016/j.tcs.2017.05.023
Venue
Theoretical Computer Science
Keywords
Spaced seeds, Clustering, Metagenomics
Field
Data mining, Competitive advantage, Metagenomics, Cluster analysis, Mathematics
DocType
Journal
Volume
698
ISSN
0304-3975
Citations
2
PageRank
0.37
References
16
Authors
3
Name             Order  Citations  PageRank
Samuele Girotto  1      10         1.87
Matteo Comin     2      191        20.94
Cinzia Pizzi     3      139        15.73