Title
ChronoClust: Density-based clustering and cluster tracking in high-dimensional time-series data.
Abstract
In many scientific disciplines, the advent of new high-throughput technologies is giving rise to vast quantities of high-dimensional time-series data. A common requirement is to identify clusters of data points with similar characteristics in this experimental data and to track their development over time. In this article we present ChronoClust, a novel density-based clustering algorithm that processes a time series of discrete datasets, generates arbitrarily shaped clusters, and explicitly tracks their temporal evolution. We provide a conceptualisation of ChronoClust’s parameters and guidelines for selecting their values. The development of ChronoClust was motivated by the need to characterise the immune response to disease. As such, we demonstrate and evaluate ChronoClust’s operation on two immune-related datasets: (1) a synthetic dataset exhibiting the temporal evolution qualities of the immune response as they would be observed through mass cytometry, a cutting-edge high-throughput technology, and (2) a flow cytometry dataset capturing the immune response in West Nile Virus (WNV)-infected mice. Our qualitative and quantitative analyses confirm ChronoClust’s suitability for this type of problem: the temporal relationships engineered into the synthetic dataset are successfully recovered, and the cell populations and dynamics unveiled in the WNV dataset match those identified by a domain expert. ChronoClust is applicable beyond immunology, and we provide an open-source Python implementation to support its wider adoption. We additionally make our two datasets publicly available to promote reproducible research and third-party work on temporal clustering and cluster tracking.
Year
2019
DOI
10.1016/j.knosys.2019.02.018
Venue
Knowledge-Based Systems
Keywords
Density based clustering, Data mining, Temporal cluster tracking, Cytometry, Immunology, West Nile virus, Bioinformatics, Exploratory data analysis
Field
Time series, Data mining, Cluster (physics), Experimental data, Subject-matter expert, Computer science, Artificial intelligence, Cluster analysis, Machine learning, West Nile virus, Python (programming language)
DocType
Journal
Volume
174
ISSN
0950-7051
Citations
1
PageRank
0.36
References
15
Authors
7
Name                    Order  Citations  PageRank
Givanna H. Putri        1      1          1.03
Mark Read               2      12         3.93
Irena Koprinska         3      10         3.93
Deeksha Singh           4      1          0.70
Uwe Röhm                5      1          1.03
Thomas M. Ashhurst      6      1          0.70
Nicholas J. C. King     7      1          0.70