Title
Exploiting the wisdom of the crowds for characterizing and connecting heterogeneous resources
Abstract
Heterogeneous content is an inherent problem for cross-system search, recommendation and personalization. In this paper we investigate differences in topic coverage and the impact of topics in different kinds of Web services. We use entity extraction and categorization to create fingerprints that allow for meaningful comparison. As a base taxonomy, we use the 23 main categories of the Wikipedia Category Graph, which has been assembled over the years by the wisdom of the crowds. Following a proof of concept of our approach, we analyze differences in topic coverage and topic impact. The results show many differences between Web services like Twitter, Flickr and Delicious, which reflect users' behavior and the usage of each system. The paper concludes with a user study that demonstrates the benefits of fingerprints over traditional textual methods for recommendations of heterogeneous resources.
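The fingerprint idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the category list is a four-item stand-in for the 23 Wikipedia main categories, the `ENTITY_CATEGORIES` lookup is a hypothetical replacement for the entity extraction and categorization steps, and cosine similarity is an assumed comparison measure that the abstract does not name.

```python
from collections import Counter
from math import sqrt

# Illustrative stand-ins only; the paper uses the 23 main categories
# of the Wikipedia Category Graph.
CATEGORIES = ["Arts", "Science", "Sports", "Technology"]

# Hypothetical lookup from an extracted entity to top-level categories.
# In the paper this role is played by entity extraction and categorization.
ENTITY_CATEGORIES = {
    "Python (programming language)": ["Technology", "Science"],
    "Louvre": ["Arts"],
    "FIFA World Cup": ["Sports"],
    "Photography": ["Arts", "Technology"],
}


def fingerprint(entities):
    """Return a normalized category-weight vector for a resource's entities."""
    counts = Counter()
    for entity in entities:
        for category in ENTITY_CATEGORIES.get(entity, []):
            counts[category] += 1
    total = sum(counts.values())
    if total == 0:
        # No known entities: return an all-zero fingerprint.
        return [0.0] * len(CATEGORIES)
    return [counts[category] / total for category in CATEGORIES]


def cosine_similarity(a, b):
    """Cosine similarity of two fingerprints (assumed measure, not from the paper)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Resources from different systems become comparable once mapped
# into the same category space.
tweet = fingerprint(["Python (programming language)", "Photography"])
photo = fingerprint(["Louvre", "Photography"])
bookmark = fingerprint(["FIFA World Cup"])
```

Because every resource ends up in the same fixed category space, a tweet, a photo and a bookmark can be compared directly even when they share no words, which is the property the abstract contrasts with traditional textual methods.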
Year: 2014
DOI: 10.1145/2631775.2631797
Venue: HT
Keywords: comparison, miscellaneous, domain independent, fingerprints, twikime, wikipedia, classification
Field: Crowds, Categorization, Graph, World Wide Web, Information retrieval, Computer science, Proof of concept, Web service, Personalization
DocType: Conference
Citations: 5
PageRank: 0.47
References: 20
Authors: 5
Name                     Order  Citations  PageRank
Ricardo Kawase           1      15         2.72
Patrick Siehndel         2      126        15.69
Bernardo Pereira Nunes   3      185        30.96
Eelco Herder             4      586        55.28
Wolfgang Nejdl           5      6633       556.13