Title
The term vector database: fast access to indexing terms for Web pages
Abstract
We have built a database that provides term vector information for large numbers of pages (hundreds of millions). The basic operation of the database is to take URLs and return term vectors. Compared to computing vectors by downloading pages via HTTP, the Term Vector Database is several orders of magnitude faster, enabling a large class of applications that would be impractical without such a database. This paper describes the Term Vector Database in detail. It also reports on two applications built on top of the database. The first application is an optimization of connectivity-based topic distillation. The second application is a Web page classifier used to annotate results returned by a Web search engine.
Year
2000
DOI
10.1016/S1389-1286(00)00046-3
Venue
Computer Networks
Keywords
Page classification, Term vectors, Topic distillation, Web connectivity, Web search
Field
Web search engine, Static web page, World Wide Web, Information retrieval, Web page, Web mapping, Computer science, Search engine indexing, Data Web, Database schema, Rewrite engine, Database
DocType
Journal
Volume
33
Issue
1-6
ISSN
1389-1286
Citations
13
PageRank
2.80
References
4
Authors
3
Name               Order  Citations  PageRank
Raymie Stata       1      1968       245.65
Krishna A. Bharat  2      1211       252.86
Farzin Maghoul     3      1198       173.90