Title
Structure-Aware Visualization of Text Corpora
Abstract
Comprehending the structure and content of large text corpora can be a daunting and often time-consuming task. In this paper, we introduce a novel tool that exploits the structural properties of a corpus to extract and visualize the underlying topics in a given dataset. To this end, we combine latent topic analysis, discriminative feature selection applied on top of the category structure of corpora, and various ranking methods to extract the most representative topics for a given corpus. The visual representation used to depict the outcome of these methods can be chosen based on the context. Such visual representations can be useful for depicting trends, identifying "hot" topics, and discovering interesting patterns in the underlying data. As applications, we create example representations for a variety of corpora obtained from conference proceedings, movie summaries, and newsgroup postings. Our user experiments demonstrate the viability of our approach, with a flower-like visualization inspired by the "wheel of emotion", for generating high-quality representative topics and for unearthing hidden structures and connections in large document corpora.
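The abstract names discriminative feature selection over the category structure as one ingredient but gives no formulas. As a rough illustration only, the sketch below ranks terms for a category by the mutual information between term presence and category membership, a common choice for this kind of selection. It is not taken from the paper: the function names `mutual_information` and `top_terms`, the whitespace tokenizer, and the four-document toy corpus are all assumptions made for this example.

```python
import math


def mutual_information(n11, n10, n01, n00):
    """Mutual information (in bits) between term presence and category membership.

    n11: docs in the category that contain the term
    n10: docs outside the category that contain the term
    n01: docs in the category that lack the term
    n00: docs outside the category that lack the term
    """
    n = n11 + n10 + n01 + n00
    mi = 0.0
    # Each tuple: (joint count, term marginal, category marginal).
    for joint, term_marg, cat_marg in (
        (n11, n11 + n10, n11 + n01),
        (n10, n11 + n10, n10 + n00),
        (n01, n01 + n00, n11 + n01),
        (n00, n01 + n00, n10 + n00),
    ):
        if joint > 0:  # a zero cell contributes nothing to the sum
            mi += (joint / n) * math.log2(n * joint / (term_marg * cat_marg))
    return mi


def top_terms(docs, category, k=3):
    """Rank terms by mutual information with `category`.

    docs is a list of (category, text) pairs. Only terms that are
    positively associated with the category are kept.
    """
    tokenized = [(cat, set(text.lower().split())) for cat, text in docs]
    vocab = set().union(*(tokens for _, tokens in tokenized))
    scores = {}
    for term in vocab:
        n11 = sum(1 for cat, toks in tokenized if cat == category and term in toks)
        n10 = sum(1 for cat, toks in tokenized if cat != category and term in toks)
        n01 = sum(1 for cat, toks in tokenized if cat == category and term not in toks)
        n00 = len(tokenized) - n11 - n10 - n01
        if n11 * n00 > n10 * n01:  # term is over-represented in the category
            scores[term] = mutual_information(n11, n10, n01, n00)
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Hypothetical toy corpus, invented purely for illustration.
docs = [
    ("movies", "the hero saves the city"),
    ("movies", "a hero falls in love"),
    ("newsgroups", "the driver crashes on install"),
    ("newsgroups", "install the driver again"),
]

print(top_terms(docs, "movies", k=1))
print(top_terms(docs, "newsgroups", k=2))
```

In a fuller pipeline, terms selected this way would presumably be combined with latent topic analysis and a ranking step before being handed to the visualization, but how the paper combines them is not stated in the abstract.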
Year
2017
DOI
10.1145/3020165.3020182
Venue
CHIIR
Field
Feature selection, Information retrieval, Ranking, Visualization, Computer science, Text corpus, Exploit, Mutual information, Artificial intelligence, Natural language processing, Topic analysis, Discriminative model
DocType
Conference
Citations
0
PageRank
0.34
References
30
Authors
3
Name                  Order  Citations  PageRank
Jaspreet Singh Suri   1      337        29.90
Sergej Zerr           2      158        15.85
Stefan Siersdorfer    3      643        34.70