Title |
---|
The impact of user corrections on a crawl-based digital library: A CiteSeerX perspective |
Abstract |
---|
CiteSeerX is a crawl-based digital library search engine providing free access to more than 4 million academic papers. Since metadata in the digital library is obtained through automatic extraction, it is inevitable that errors will occur. CiteSeerX offers a feature allowing registered users to correct paper metadata including titles, authors, abstracts, publication years, venues, etc. We claim that user corrections, as a form of crowd-collaboration, provide a useful and efficient way to improve metadata quality and the impact of the digital library. As evidence to support this claim, we investigate user corrections from the last 5 years and analyze: the nature of the corrections; the quality of the corrections; and the impact of the corrections on downloads. |
Year | DOI | Venue |
---|---|---|
2014 | 10.4108/icst.collaboratecom.2014.257563 | CollaborateCom |
Keywords | Field | DocType |
citeseerx,digital libraries,paper metadata correction,crawl-based digital library search engine,crowd-collaboration,user corrections,meta data,groupware,search engines,history | Metadata quality,Metadata,World Wide Web,Search engine,Information retrieval,Computer science,Digital library | Conference |
Citations | PageRank | References |
0 | 0.34 | 0 |
Authors |
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Jian Wu | 1 | 22 | 6.11 |
Kyle Williams | 2 | 208 | 21.61 |
Madian Khabsa | 3 | 237 | 18.81 |
C. Lee Giles | 4 | 11154 | 1549.48 |