Title
A Study on Classification in Imbalanced and Partially-Labelled Data Streams
Abstract
The domain of radio astronomy is currently facing significant computational challenges, foremost amongst which are those posed by the development of the world's largest radio telescope, the Square Kilometre Array (SKA). Preliminary specifications for this instrument suggest that the final design will incorporate between 2000 and 3000 individual 15-metre receiving dishes, which together can be expected to produce a data rate of many TB/s. Given such a high data rate, it becomes crucial to consider how this information will be processed and stored to maximise its scientific utility. In this paper, we consider one possible data processing scenario for the SKA, for the purposes of an all-sky pulsar survey. In particular, we treat the selection of promising signals from the SKA processing pipeline as a data stream classification problem. We consider the feasibility of classifying signals that arrive via an unlabelled and heavily class-imbalanced data stream, using currently available algorithms and frameworks. Our results indicate that existing stream learners exhibit unacceptably low recall on real astronomical data when used in their standard configuration. However, their good false positive performance and accuracy comparable to that of static learners suggest they have definite potential as an on-line solution to this particular big data challenge.
Year
2013
DOI
10.1109/SMC.2013.260
Venue
SMC
Keywords
Big Data, astronomy computing, pattern classification, radioastronomy, radiotelescopes, Big Data challenge, SKA, Square Kilometre Array, all-sky pulsar survey, astronomical data, data processing, imbalanced data stream classification, partially-labelled data stream classification, radio astronomy, radio telescope, signal classification, Astroinformatics, Classification, Data Streams, Imbalanced Learning, Unlabelled Data
DocType
Journal
Volume
abs/1307.8012
ISSN
1062-922X
Citations
1
PageRank
0.36
References
6
Authors
4
Name             Order   Citations   PageRank
R. J. Lyon       1       9           1.13
J. M. Brooke     2       53          9.77
J. D. Knowles    3       1           0.36
B. W. Stappers   4       9           1.46