Abstract | ||
---|---|---|
Understanding newly emerging events or topics associated with a particular region on a given day can provide deep insight into critical events occurring in rapidly evolving metropolitan cities. We propose a novel topic modeling approach for text documents with spatio-temporal information (e.g., when and where a document was published), such as location-based social media data, to discover prevalent topics or newly emerging events with respect to an area and a time point. We consider a map view composed of regular grids or tiles, each showing topic keywords from the documents of the corresponding region. To this end, we present a tile-based spatio-temporally exclusive topic modeling approach called STExNMF, based on a novel nonnegative matrix factorization (NMF) technique. STExNMF works in two stages: (1) running a standard NMF on each tile to obtain the general topics of the tile and (2) running a spatio-temporally exclusive NMF on a weighted residual matrix. The topics from the second stage are likely to reveal newly emerging events or topics of interest within a region. We demonstrate the advantages of our approach using geo-tagged Twitter data from New York City. We also provide quantitative comparisons in terms of topic quality, spatio-temporal exclusiveness, and topic variation, as well as qualitative evaluations of our method using several usage scenarios. In addition, we present a fast implementation of our model that leverages parallel computing. |
Year | DOI | Venue |
---|---|---|
2017 | 10.1109/ICDM.2017.53 | 2017 IEEE International Conference on Data Mining (ICDM) |
Keywords | Field | DocType |
Topic modeling, social network analysis, matrix factorization, event detection, anomaly detection | Data mining, Data modeling, Social media, Time point, Matrix (mathematics), Computer science, Qualitative evaluations, Non-negative matrix factorization, Topic model, Method of mean weighted residuals | Conference |
ISSN | ISBN | Citations |
1550-4786 | 978-1-5386-2449-4 | 3 |
PageRank | References | Authors |
0.37 | 16 | 11 |
Name | Order | Citations | PageRank |
---|---|---|---|
Sungbok Shin | 1 | 4 | 0.73 |
Minsuk Choi | 2 | 9 | 2.51 |
Jinho Choi | 3 | 1642 | 206.06 |
Scott Langevin | 4 | 11 | 2.60 |
Christopher Bethune | 5 | 4 | 0.73 |
Philippe Horne | 6 | 4 | 0.73 |
Nathan Kronenfeld | 7 | 4 | 1.07 |
Ramakrishnan Kannan | 8 | 133 | 18.57 |
Barry L. Drake | 9 | 100 | 11.59 |
Haesun Park | 10 | 3546 | 232.42 |
Jaegul Choo | 11 | 556 | 46.81 |
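
The two-stage procedure summarized in the abstract (a standard NMF per tile, followed by a second NMF on a weighted residual matrix) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify how the residual is weighted, so the weighting below (down-weighting terms that are frequent in neighboring tiles or time points) is an assumption, and the names `nmf`, `two_stage_topics`, and `neighbor_term_freq` are hypothetical.

```python
import numpy as np


def nmf(X, k, n_iter=200, seed=0, eps=1e-9):
    """Standard NMF via multiplicative updates: X ~= W @ H, with W, H >= 0."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep both factors nonnegative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def two_stage_topics(X, neighbor_term_freq, k_general=2, k_exclusive=1):
    """Sketch of a two-stage factorization for one tile.

    X: nonnegative term-by-document matrix of the tile.
    neighbor_term_freq: per-term frequency in neighboring tiles/time points
        (hypothetical input used only to build illustrative weights).
    """
    # Stage 1: general topics of the tile.
    W_general, H_general = nmf(X, k_general, seed=0)

    # Residual not explained by the general topics, clipped to stay nonnegative.
    residual = np.maximum(X - W_general @ H_general, 0.0)

    # ASSUMPTION: terms common in neighbors get smaller weights, so the
    # second factorization favors terms specific to this tile and time.
    weights = 1.0 / (1.0 + neighbor_term_freq)
    weighted_residual = weights[:, None] * residual

    # Stage 2: NMF on the weighted residual matrix.
    W_exclusive, _ = nmf(weighted_residual, k_exclusive, seed=1)
    return W_general, W_exclusive


# Toy usage: 6 terms, 8 documents, random counts.
rng = np.random.default_rng(42)
X = rng.integers(0, 5, size=(6, 8)).astype(float)
neighbor_term_freq = rng.random(6) * 10.0
W_general, W_exclusive = two_stage_topics(X, neighbor_term_freq)
print(W_general.shape, W_exclusive.shape)
```

Each column of `W_general` or `W_exclusive` is a topic; its largest entries index the keywords that would be shown on the corresponding map tile.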