Title
Leveraging Clustering Techniques To Facilitate Metagenomic Analysis
Abstract
Machine learning clustering algorithms provide efficient methods for conducting metagenomic analysis. This study uses two such algorithms, the self-organizing map and K-means, to cluster data from an environmental sample collected from a hot springs habitat and to provide a visual analysis of that data. A data processing pipeline is described that uses the clustering algorithms to identify which reference genomes should be included in further analysis to determine which organisms may be present in a metagenomic sample. The clustering revealed probable candidates for additional analysis, including a thermophilic, anaerobic bacterium, which is likely to be found in a hot springs environment and thus serves to validate the functionality of these tools. The machine learning techniques discussed here can serve as a launching point for identifying protein sequences that could be used as reference comparisons for a specific metagenomic sample and lead to further study.
Year: 2016
DOI: 10.1080/10798587.2015.1073887
Venue: INTELLIGENT AUTOMATION AND SOFT COMPUTING
Keywords: Metagenomics, Clustering, K-means, Machine learning, Self-organizing map
Field: Data mining, k-means clustering, Data processing, Computer science, Self-organizing map, Metagenomics, Artificial intelligence, Anaerobic bacterium, Cluster analysis, Machine learning
DocType: Journal
Volume: 22
Issue: 1
ISSN: 1079-8587
Citations: 0
PageRank: 0.34
References: 10
Authors (3)
Name | Order | Citations | PageRank
Damien Ennis | 1 | 0 | 0.34
Sergiu Dascalu | 2 | 362 | 79.10
Frederick C. Harris Jr. | 3 | 547 | 78.86