Title
A new parallelization model for detecting temporal bursts in large-scale document streams on a multi-core CPU
Abstract
Burstiness is the simplest yet most robust criterion for detecting topics and events in online documents. Because online documents have a temporal order, they are referred to as document streams. Kleinberg's temporal burst detection algorithm, the most successful algorithm for detecting bursty periods related to a topic- or event-related keyword, finds the time periods in which a keyword occurs at a high frequency. Large-scale document streams are increasingly common on social media, so speeding up burst detection is one of the most important issues in this era of big data. In this paper, we propose a novel parallelization model, called the hybrid parallelization model with a hidden I/O thread, that enables parallel processing of Kleinberg's temporal burst detection algorithm on a multi-core CPU. In a multi-core CPU environment, I/O latency is a critical obstacle to improving the performance of a parallelization model. The proposed model uses speculative I/Os to hide this latency automatically. Experiments on actual large-scale document streams show that the proposed parallelization model performs well compared with a conventional parallelization model.
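To make the burst-detection step concrete, the following is a minimal sketch of a two-state, batched variant of Kleinberg's model, solved with a Viterbi-style dynamic program. It is not the parallelized implementation described in the abstract. The function name detect_bursts, the input layout (per-period counts of keyword-bearing and total documents), and the default values of s and gamma are assumptions made for illustration, and the sketch assumes the keyword occurs in some but not all documents.

```python
import math


def detect_bursts(relevant, total, s=2.0, gamma=1.0):
    """Label each time period 0 (normal) or 1 (bursty) for one keyword.

    relevant[t] = documents containing the keyword in period t
    total[t]    = all documents in period t
    s, gamma    = burst scaling and transition-cost parameters (assumed defaults)
    """
    n = len(relevant)
    p0 = sum(relevant) / sum(total)      # baseline keyword rate
    p1 = min(s * p0, 0.9999)             # elevated rate in the bursty state
    probs = (p0, p1)
    enter_cost = gamma * math.log(n)     # cost of moving from state 0 to state 1

    def fit_cost(p, r, d):
        # Negative log-likelihood of r hits in d documents under rate p.
        log_binom = math.lgamma(d + 1) - math.lgamma(r + 1) - math.lgamma(d - r + 1)
        return -(log_binom + r * math.log(p) + (d - r) * math.log(1 - p))

    cost = [0.0, math.inf]               # the sequence starts in the normal state
    back = []                            # back[t][j] = best previous state for state j
    for r, d in zip(relevant, total):
        new_cost, pointers = [], []
        for j in (0, 1):
            candidates = [cost[i] + (enter_cost if j > i else 0.0) for i in (0, 1)]
            best_i = 0 if candidates[0] <= candidates[1] else 1
            new_cost.append(candidates[best_i] + fit_cost(probs[j], r, d))
            pointers.append(best_i)
        cost = new_cost
        back.append(pointers)

    # Trace the cheapest state sequence backwards from the final period.
    state = 0 if cost[0] <= cost[1] else 1
    states = []
    for pointers in reversed(back):
        states.append(state)
        state = pointers[state]
    states.reverse()
    return states


if __name__ == "__main__":
    hits = [1, 1, 1, 10, 10, 10, 1, 1, 1]
    docs = [20] * 9
    print(detect_bursts(hits, docs))
```

Each keyword is labeled independently of the others in this sketch, which is the kind of per-keyword work a multi-core parallelization could distribute across threads. How the paper actually partitions the work and overlaps it with I/O is not specified in the abstract.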
Year: 2014
DOI: 10.1109/SMC.2014.6973960
Venue: Systems, Man and Cybernetics
Keywords: Big Data, information retrieval, multiprocessing systems, parallel processing, social networking (online), I/O latency, Kleinberg's temporal burst detection algorithm, burst-detection processing, event-related keyword, hidden I/O thread, hybrid parallelization model, large-scale document streams, multicore CPU environment, online document topic detection, parallelization model, social media, temporal burst detection, topic-related keyword, Burst detection, Document stream, Multi-core CPU
DocType: Conference
ISSN: 1062-922X
Citations: 0
PageRank: 0.34
References: 0
Authors: 2
Name            Order  Citations  PageRank
Keiichi Tamura  1      37         13.86
H. Kitakami     2      94         49.68