Title
Hourly analysis of a very large topically categorized web query log
Abstract
We review a query log of hundreds of millions of queries that constitute the total query traffic for an entire week of a general-purpose commercial web search service. Previously, query logs have been studied from a single, cumulative view. In contrast, our analysis shows changes in popularity and uniqueness of topically categorized queries across the hours of the day. We examine query traffic on an hourly basis by matching it against lists of queries that have been topically pre-categorized by human editors. The matched queries represent 13% of the query traffic. We show that query traffic from particular topical categories differs both from the query stream as a whole and from other categories. This analysis provides valuable insight for improving retrieval effectiveness and efficiency. It is also relevant to the development of enhanced query disambiguation, routing, and caching algorithms.
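The matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes exact matching after lowercasing and whitespace normalization, and the names `normalize`, `hourly_category_share`, `log`, and `category_lists` are hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime


def normalize(query):
    # Assumption: queries match exactly after lowercasing and collapsing whitespace.
    return " ".join(query.lower().split())


def hourly_category_share(log, category_lists):
    """Return {hour: {category: fraction of that hour's queries in the category}}.

    log: iterable of (datetime, query string) pairs.
    category_lists: dict mapping a category name to a set of editor-labeled queries.
    """
    # Invert the category lists into a query -> categories lookup.
    lookup = defaultdict(set)
    for category, queries in category_lists.items():
        for q in queries:
            lookup[normalize(q)].add(category)

    totals = Counter()              # all queries seen in each hour of the day
    matched = defaultdict(Counter)  # per-hour counts of matched queries by category
    for timestamp, query in log:
        hour = timestamp.hour
        totals[hour] += 1
        for category in lookup.get(normalize(query), ()):
            matched[hour][category] += 1

    # Hours with no matched queries are omitted from the result.
    return {
        hour: {cat: n / totals[hour] for cat, n in cats.items()}
        for hour, cats in matched.items()
    }
```

Comparing the per-hour fractions for one category against those of another, or against the overall hourly volume, gives the kind of hourly view the abstract contrasts with a single cumulative one.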
Year
2004
DOI
10.1145/1008992.1009048
Venue
SIGIR
Keywords
entire week,web query log,hourly analysis,caching algorithm,general-purpose commercial web search,query stream,cumulative view,large topically,query log,enhanced query disambiguation,total query traffic,query traffic,hourly basis,measurement,cumulant
Field
Query optimization,Data mining,Web search query,Query language,RDF query language,Query expansion,Information retrieval,Computer science,Sargable,Web query classification,Spatial query
DocType
Conference
ISBN
1-58113-881-4
Citations
173
PageRank
12.63
References
23
Authors
5
Name, Order, Citations, PageRank
Steven M. Beitzel, 1, 696, 46.72
Eric C. Jensen, 2, 696, 46.72
Abdur Chowdhury, 3, 2013, 160.59
David Grossman, 4, 525, 34.73
Ophir Frieder, 5, 3300, 419.55