Title
Generating Cohesive Semantic Topics from Latent Factors
Abstract
Extracting topics from posts in social networks is a challenging and relevant computational task. Traditionally, topics are extracted by analyzing syntactic properties of the messages, assuming a high correlation between syntax and semantics. This work proposes SToC, a new method for generating more cohesive and meaningful semantic topics within a context. SToC post-processes the output of a Non-Negative Matrix Factorization (NMF) method to determine which latent factors should be further merged to improve cohesion. Based on NMF's output, SToC defines a topic transition graph and uses Markovian theory to merge pairs of topics that are mutually reachable in this graph. Experiments on two real data samples from Twitter demonstrate that SToC is statistically better than fair baselines in supervised scenarios and is able to determine cohesive and semantically valid topics in unsupervised scenarios.
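The merging step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the transition graph is built, so everything concrete here is an assumption made for illustration: the co-occurrence-based transition weights, the `threshold` parameter, the helper names `transition_matrix` and `merge_mutually_reachable`, and the toy matrix `W` standing in for NMF's document-topic output. It is not the construction used by SToC.

```python
# Illustrative sketch only: the transition-graph construction and the
# threshold are assumptions, not the definitions used by SToC.

def transition_matrix(W):
    """Build a row-stochastic topic-to-topic matrix from document-topic weights W.

    Assumption: the weight of i -> j (i != j) is the co-occurrence
    sum_d W[d][i] * W[d][j], normalized so that each row sums to 1.
    """
    k = len(W[0])
    P = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            if i != j:
                P[i][j] = sum(row[i] * row[j] for row in W)
        total = sum(P[i])
        if total > 0:
            P[i] = [p / total for p in P[i]]
    return P


def merge_mutually_reachable(P, threshold=0.5):
    """Group topics that reach each other via edges with P[i][j] >= threshold."""
    k = len(P)
    # reach[i][j] starts as "there is a direct edge i -> j" (or i == j).
    reach = [[i == j or P[i][j] >= threshold for j in range(k)] for i in range(k)]
    # Warshall's algorithm: extend direct edges to full reachability.
    for m in range(k):
        for i in range(k):
            if reach[i][m]:
                for j in range(k):
                    if reach[m][j]:
                        reach[i][j] = True
    # Topics i and j are merged only if each can reach the other.
    groups, assigned = [], set()
    for i in range(k):
        if i in assigned:
            continue
        group = [j for j in range(k) if reach[i][j] and reach[j][i]]
        assigned.update(group)
        groups.append(group)
    return groups


# Hypothetical document-topic weights: 4 documents x 3 latent factors.
W = [
    [0.9, 0.8, 0.0],
    [0.7, 0.9, 0.1],
    [0.0, 0.1, 0.9],
    [0.1, 0.0, 0.8],
]
P = transition_matrix(W)
groups = merge_mutually_reachable(P, threshold=0.5)
print(groups)  # [[0, 1], [2]]
```

In this toy input, factor 2 has a strong edge to factor 1 but not the other way around, so it is left unmerged; only factors 0 and 1, which reach each other, form a single topic.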
Year
2014
DOI
10.1109/BRACIS.2014.56
Venue
BRACIS
Keywords
social networks
Field
Cohesion (chemistry), Sample (statistics), Markov process, Computer science, Matrix decomposition, Probabilistic latent semantic analysis, Natural language processing, Artificial intelligence, Non-negative matrix factorization, Syntax, Machine learning, Semantics
DocType
Conference
Citations
3
PageRank
0.40
References
10
Authors
5