Title | ||
---|---|---|
SeerSuite: developing a scalable and reliable application framework for building digital libraries by crawling the web |
Abstract | ||
---|---|---|
SeerSuite is a framework for scientific and academic digital libraries and search engines built by crawling scientific and academic documents from the web with a focus on providing reliable, robust services. In addition to full text indexing, SeerSuite supports autonomous citation indexing and automatically links references in research articles to facilitate navigation, analysis and evaluation. SeerSuite enables access to extensive document, citation, and author metadata by automatically extracting, storing and indexing metadata. SeerSuite also supports MyCiteSeer, a personal portal that allows users to monitor documents, store user queries, build document portfolios, and interact with the document metadata. We describe the design of SeerSuite and the deployment and usage of CiteSeerX as an instance of SeerSuite. |
Year | Venue | Keywords |
---|---|---|
2010 | WebApps | academic digital library, full text indexing, personal portal, author metadata, reliable application framework, indexing metadata, autonomous citation indexing, academic document, document metadata, extensive document, document portfolio |
Field | DocType | Citations |
---|---|---|
Metadata, World Wide Web, Search engine, Crawling, Software deployment, Information retrieval, Computer science, Citation, Search engine indexing, Digital library, Scalability | Conference | 16 |
PageRank | References | Authors |
---|---|---|
1.05 | 17 | 6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Pradeep B. Teregowda | 1 | 54 | 5.93 |
Isaac G. Councill | 2 | 469 | 27.27 |
R. Juan Pablo Fernández | 3 | 16 | 1.05 |
Madian Khabsa | 4 | 237 | 18.81 |
Shuyi Zheng | 5 | 256 | 11.22 |
C. Lee Giles | 6 | 11154 | 1549.48 |