Title
History by Diversity: Helping Historians search News Archives.
Abstract
Longitudinal corpora such as newspaper archives are of immense value to historical research, and time, an important factor for historians, strongly influences their search behavior in these archives. When searching for articles published over time, a key preference is to retrieve documents that cover the important aspects from important points in time, which differs from standard search behavior. To support this search strategy, we introduce the notion of a Historical Query Intent to explicitly model a historian's search task and define an aspect-time diversification problem over news archives. We present a novel algorithm, HistDiv, that explicitly models the aspects and important time windows based on a historian's information-seeking behavior. By incorporating temporal priors based on publication times and temporal expressions, we diversify on both the aspect and temporal dimensions. We test our methods by constructing a test collection based on The New York Times Collection with a workload of 30 manually assessed queries of historical intent. We find that HistDiv outperforms all competitors in subtopic recall with a slight loss in precision. We also present results of a qualitative user study to determine whether this drop in precision is detrimental to user experience. Our results show that users still preferred HistDiv's ranking.
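The abstract reports results in terms of subtopic recall. As a rough illustration only, the sketch below computes the standard subtopic recall at rank k: the fraction of all judged subtopics covered by the top-k documents. Treating each subtopic as an (aspect, time window) pair is an assumption made here to mirror the aspect-time framing. The function name, document IDs, and judgments are hypothetical, and the paper's actual evaluation setup may differ.

```python
from typing import Dict, List, Set, Tuple

Subtopic = Tuple[str, str]  # assumed form: (aspect, time window)


def subtopic_recall(ranking: List[str],
                    doc_subtopics: Dict[str, Set[Subtopic]],
                    k: int) -> float:
    """Fraction of all judged subtopics covered by the top-k documents."""
    all_subtopics: Set[Subtopic] = set().union(*doc_subtopics.values())
    if not all_subtopics:
        return 0.0
    covered: Set[Subtopic] = set()
    for doc_id in ranking[:k]:
        # Documents without judgments contribute no subtopics.
        covered |= doc_subtopics.get(doc_id, set())
    return len(covered) / len(all_subtopics)


# Hypothetical judgments, invented purely for illustration.
judgments = {
    "d1": {("election", "1992")},
    "d2": {("election", "1992")},
    "d3": {("scandal", "1998")},
    "d4": {("election", "1996"), ("scandal", "1998")},
}

relevance_ranking = ["d1", "d2", "d3", "d4"]    # redundant at the top
diversified_ranking = ["d1", "d4", "d3", "d2"]  # spreads over aspects and time

print(subtopic_recall(relevance_ranking, judgments, k=2))
print(subtopic_recall(diversified_ranking, judgments, k=2))
```

In this toy example the diversified ranking covers all three subtopics within the top two results, whereas the redundant ranking covers only one. This is the kind of gain in subtopic recall that a diversification method aims for, possibly at some cost in precision.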
Year: 2018
DOI: 10.1145/2854946.2854959
Venue: CHIIR
DocType:
Volume: abs/1810.10251
ISBN: 978-1-4503-3751-9
Journal:
Citations: 14
PageRank: 0.90
References: 22
Authors: 3
Name                 Order  Citations  PageRank
Jaspreet Singh Suri  1      337        29.90
Wolfgang Nejdl       2      6633       556.13
Avishek Anand        3      102        11.61