Abstract |
---|
Statistical language models can learn relationships between topics discussed in a document collection and persons, organizations and places mentioned in each document. We present a novel combination of statistical topic models and named-entity recognizers to jointly analyze entities mentioned (persons, organizations and places) and topics discussed in a collection of 330,000 New York Times news articles. We demonstrate an analytic framework which automatically extracts from a large collection: topics; topic trends; and topics that relate entities. |
Year | DOI | Venue |
---|---|---|
2006 | 10.1007/11760146_9 | ISI |
Keywords | Field | DocType |
artificial intelligence, computer security, statistical analysis, modeling | Data mining, Latent semantic indexing, Latent Dirichlet allocation, Computer science, Artificial intelligence, Natural language processing, Topic model, Latent semantic analysis, Language model, Statistical analysis | Conference |
Volume | ISSN | ISBN |
3975 | 0302-9743 | 3-540-34478-0 |
Citations | PageRank | References |
19 | 1.25 | 13 |
Authors |
---|
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
David Newman | 1 | 1319 | 73.72 |
Chaitanya Chemudugunta | 2 | 420 | 25.61 |
Padhraic Smyth | 3 | 7148 | 1451.38 |
Mark Steyvers | 4 | 1980 | 156.87 |