Structure-Aware Visualization of Text Corpora. - Citegraph

Paper Info

Title
Structure-Aware Visualization of Text Corpora.

Abstract
Trying to comprehend the structure and content of large text corpora can be a daunting and often time-consuming task. In this paper, we introduce a novel tool that exploits the structural properties of a corpus to extract and visualize the underlying topics in a given dataset. To this end, we make use of a combination of latent topic analysis, discriminative feature selection applied on top of the category structure of corpora, and various ranking methods in order to extract the most representative topics for a given corpus. The visual moniker used to depict the outcome of these methods can be chosen based on the context. Such visual representations can be useful for depicting trends, identifying "hot" topics, and discovering interesting patterns in the underlying data. As applications, we create example representations for a variety of corpora obtained from conference proceedings, movie summaries, and newsgroup postings. Our user experiments demonstrate the viability of our approach, with a flower-like visualization inspired by the "wheel of emotion", for generating high-quality representative topics and for unearthing hidden structures and connections in large document corpora.
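
The discriminative feature selection step mentioned in the abstract could be illustrated roughly as follows. This is only a sketch and not the authors' implementation: it assumes mutual information (which appears among this record's field keywords) as the scoring function, treats each document as a set of tokens with a single category label, and uses hypothetical function names (mutual_information, rank_terms) and toy data invented for illustration.

```python
import math
from collections import Counter


def mutual_information(docs, labels, term, category):
    """Mutual information (in bits) between the events
    'term occurs in the document' and 'document belongs to category'."""
    n = len(docs)
    counts = Counter()
    for tokens, label in zip(docs, labels):
        counts[(term in tokens, label == category)] += 1

    mi = 0.0
    for t in (True, False):
        for c in (True, False):
            n_tc = counts[(t, c)]
            if n_tc == 0:
                continue  # a zero cell contributes nothing to the sum
            n_t = counts[(t, True)] + counts[(t, False)]
            n_c = counts[(True, c)] + counts[(False, c)]
            mi += (n_tc / n) * math.log2(n * n_tc / (n_t * n_c))
    return mi


def rank_terms(docs, labels, category, k=2):
    """Rank the terms that occur inside `category` by mutual information.

    Restricting the vocabulary to terms seen in the category is an
    assumption made here so that terms which are informative only by
    their absence are not returned.
    """
    vocab = sorted({tok for tokens, label in zip(docs, labels)
                    if label == category for tok in tokens})
    scored = [(mutual_information(docs, labels, term, category), term)
              for term in vocab]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in scored[:k]]


# Toy corpus (invented): each document is a set of tokens.
docs = [
    {"topic", "model", "ranking"},
    {"topic", "visualization"},
    {"film", "plot"},
    {"film", "actor", "plot"},
]
labels = ["conference", "conference", "movies", "movies"]

print(rank_terms(docs, labels, "conference"))
```

In this toy corpus the term "topic" occurs in every "conference" document and in no "movies" document, so it receives the maximum score for a balanced binary split and is ranked first. A full pipeline in the spirit of the abstract would combine such scores with latent topic analysis and further ranking before choosing a visualization.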

Year	DOI	Venue
2017	10.1145/3020165.3020182	CHIIR
Field	DocType	Citations
Feature selection,Information retrieval,Ranking,Visualization,Computer science,Text corpus,Exploit,Mutual information,Artificial intelligence,Natural language processing,Topic analysis,Discriminative model	Conference	0
PageRank	References	Authors
0.34	30	3

Authors (3 rows)

Name	Order	Citations	PageRank
Jaspreet Singh Suri	1	337	29.90
Sergej Zerr	2	158	15.85
Stefan Siersdorfer	3	643	34.70

Cited by (0 rows)

References (30 rows)
