Title | ||
---|---|---|
A new parallelization model for detecting temporal bursts in large-scale document streams on a multi-core CPU |
Abstract | ||
---|---|---|
Burstiness is the simplest yet most robust criterion for detecting topics and events in online documents. Because online documents have a temporal order, they are referred to as document streams. Kleinberg's temporal burst detection algorithm, the most successful algorithm for detecting bursty periods of a topic- or event-related keyword, finds the time periods in which the keyword occurs at a high frequency. Large-scale online documents have become increasingly common on social media, so speeding up burst detection is one of the most important issues in the era of big data. In this paper, we propose a novel parallelization model, called the hybrid parallelization model with a hidden I/O thread, that enables parallel processing of Kleinberg's temporal burst detection algorithm on a multi-core CPU. In a multi-core CPU environment, I/O latency is a critical obstacle to improving the performance of a parallelization model. To hide the I/O latency automatically, the proposed parallelization model uses speculative I/Os. The results of experiments using actual large-scale document streams show that the proposed parallelization model performs well compared with a conventional parallelization model. |
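To make the idea of "time periods in which a keyword occurs at a high frequency" concrete, the following is a minimal sequential sketch of a two-state, batched burst detector in the style of Kleinberg's algorithm. It is not the parallelization model proposed in the paper, and it does not model the hidden I/O thread or speculative I/Os. The function name `detect_bursts`, the parameters `s` (how much higher the bursty rate is than the base rate) and `gamma` (the penalty for entering the bursty state), and the sample counts are all assumptions made for illustration.

```python
import math


def detect_bursts(relevant, totals, s=2.0, gamma=1.0):
    """Return (start, end) index pairs of bursty periods (inclusive).

    relevant[t]: documents in batch t that contain the keyword.
    totals[t]:   all documents in batch t.
    Two-state automaton: state 0 = base rate, state 1 = bursty rate.
    """
    n = len(relevant)
    if n == 0 or sum(totals) == 0:
        return []
    p0 = sum(relevant) / sum(totals)      # overall keyword rate
    if p0 <= 0.0 or p0 >= 1.0:
        return []                         # degenerate stream, nothing to detect
    p1 = min(s * p0, 0.9999)              # elevated rate of the bursty state
    probs = (p0, p1)
    up_cost = gamma * math.log(n)         # cost of moving from state 0 to 1

    def fit_cost(state, r, d):
        # Negative log of the binomial probability of r hits in d documents.
        p = probs[state]
        log_binom = (math.lgamma(d + 1) - math.lgamma(r + 1)
                     - math.lgamma(d - r + 1))
        return -(log_binom + r * math.log(p) + (d - r) * math.log(1 - p))

    # Viterbi search for the cheapest state sequence, starting in state 0.
    cost = [0.0, math.inf]
    back = []
    for r, d in zip(relevant, totals):
        new_cost, pointers = [], []
        for state in (0, 1):
            candidates = [
                cost[prev] + (up_cost if state > prev else 0.0)
                for prev in (0, 1)
            ]
            best_prev = 0 if candidates[0] <= candidates[1] else 1
            new_cost.append(candidates[best_prev] + fit_cost(state, r, d))
            pointers.append(best_prev)
        cost = new_cost
        back.append(pointers)

    # Backtrack from the cheaper final state.
    state = 0 if cost[0] <= cost[1] else 1
    states = [0] * n
    for t in range(n - 1, -1, -1):
        states[t] = state
        state = back[t][state]

    # Collapse consecutive bursty batches into intervals.
    bursts, start = [], None
    for t, st in enumerate(states):
        if st == 1 and start is None:
            start = t
        elif st == 0 and start is not None:
            bursts.append((start, t - 1))
            start = None
    if start is not None:
        bursts.append((start, n - 1))
    return bursts


# Hypothetical stream: 7 batches of 100 documents, keyword spikes in batches 3-4.
print(detect_bursts([1, 1, 1, 10, 10, 1, 1], [100] * 7))
```

Each keyword can be processed independently in this formulation, which is one reason the algorithm is a natural candidate for parallel processing; how the paper actually distributes the work and overlaps it with I/O is not shown here.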
Year | DOI | Venue |
---|---|---|
2014 | 10.1109/SMC.2014.6973960 | Systems, Man and Cybernetics |
Keywords | DocType | ISSN |
Big Data, information retrieval, multiprocessing systems, parallel processing, social networking (online), I/O latency, Kleinberg's temporal burst detection algorithm, burst-detection processing, event-related keyword, hidden I/O thread, hybrid parallelization model, large-scale document streams, multicore CPU environment, online document topic detection, parallelization model, social media, temporal burst detection, topic-related keyword, Burst detection, Document stream, Multi-core CPU | Conference | 1062-922X |
Citations | PageRank | References |
0 | 0.34 | 0 |
Authors | ||
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Keiichi Tamura | 1 | 37 | 13.86 |
H. Kitakami | 2 | 94 | 49.68 |