Abstract |
---|
A large and growing number of web pages display contextual advertising based on keywords automatically extracted from the text of the page, and this is a substantial source of revenue supporting the web today. Despite the importance of this area, little formal, published research exists. We describe a system that learns how to extract keywords from web pages for advertisement targeting. The system uses a number of features, such as term frequency of each potential keyword, inverse document frequency, presence in meta-data, and how often the term occurs in search query logs. The system is trained with a set of example pages that have been hand-labeled with "relevant" keywords. Based on this training, it can then extract new keywords from previously unseen pages. Accuracy is substantially better than several baseline systems. |
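
The feature types named in the abstract (term frequency, inverse document frequency, presence in meta-data, and frequency in search query logs) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the single-token candidates, the smoothed IDF formula, and the toy inputs are all assumptions made for illustration, and the learning step that turns these features into keyword scores is left out because the abstract does not specify it.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def candidate_features(page_text, meta_text, corpus, query_log):
    """Return {term: feature dict} for every distinct token in page_text.

    Hypothetical helper: candidates are single tokens here, which is a
    simplification chosen for brevity.
    """
    tokens = tokenize(page_text)
    tf = Counter(tokens)
    meta_terms = set(tokenize(meta_text))
    doc_sets = [set(tokenize(doc)) for doc in corpus]
    query_counts = Counter(t for q in query_log for t in tokenize(q))
    n_docs = len(doc_sets)

    features = {}
    for term, count in tf.items():
        # Document frequency: number of corpus documents containing the term.
        df = sum(term in doc for doc in doc_sets)
        features[term] = {
            "tf": count / len(tokens),
            # Smoothed IDF (an assumed variant, avoids division by zero).
            "idf": math.log((1 + n_docs) / (1 + df)) + 1.0,
            "in_meta": term in meta_terms,
            "query_freq": query_counts[term],
        }
    return features


# Toy inputs, invented for this example.
page = "Cheap flights to Paris. Book flights and hotels in Paris today."
meta = "flights, paris, travel deals"
corpus = [page, "Hotels in Rome", "Paris museum guide"]
query_log = ["cheap flights", "paris flights", "rome hotels"]

feats = candidate_features(page, meta, corpus, query_log)
for term in ("flights", "paris", "today"):
    print(term, feats[term])
```

In a trained system of the kind the abstract describes, feature vectors like these would be paired with the hand-labeled "relevant" keywords to fit a model, which would then score candidates on previously unseen pages.
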
Year | DOI | Venue |
---|---|---|
2006 | 10.1145/1135777.1135813 | WWW |
Keywords | Field | DocType |
---|---|---|
advertising keyword,new keyword,contextual advertising,web page,search query log,potential keyword,baseline system,example page,term frequency,inverse document frequency,substantial source,web pages,information extraction,advertising | Static web page,Data mining,Advertising,Web page,Computer science,Keyword extraction,Web query classification,Web search query,World Wide Web,Contextual advertising,Information retrieval,tf–idf,Information extraction | Conference |
ISBN | Citations | PageRank |
---|---|---|
1-59593-323-9 | 173 | 8.89 |
References | Authors |
---|---|
18 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Wen-tau Yih | 1 | 3238 | 204.01 |
Joshua Goodman | 2 | 1079 | 146.02 |
Vitor R. Carvalho | 3 | 672 | 36.38 |