Title
Extracting focused locations for web pages
Abstract
Most Web pages contain location information, which can be used to improve the effectiveness of search engines. In this paper, we concentrate on focused locations, that is, the locations most appropriately associated with a Web page. Current algorithms suffer from ambiguity among locations: many different locations share the same name (known as GEO/GEO ambiguity), and some locations share a name with non-geographical entities such as persons (known as GEO/NON-GEO ambiguity). We first propose a new algorithm named GeoRank, which applies an idea similar to PageRank to resolve the GEO/GEO ambiguity. We also introduce heuristic rules to eliminate the GEO/NON-GEO ambiguity. We then present an algorithm with dynamic parameters to determine the focused locations. We conduct experiments on two real datasets to evaluate the performance of our approach. The experimental results show that our algorithm outperforms the state-of-the-art methods in both disambiguation and focused-location determination.
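The abstract does not spell out how GeoRank works, so the following is only a minimal sketch of what a PageRank-style vote among candidate locations could look like; it is not the authors' algorithm. The function name `rank_candidates`, the `related` weighting function, the damping factor of 0.85, and the toy candidate list are all assumptions made for illustration.

```python
def rank_candidates(candidates, related, damping=0.85, iterations=50):
    """Pick one candidate location per ambiguous name by PageRank-style voting.

    candidates: dict mapping a place name to its list of candidate locations.
    related:    hypothetical function (a, b) -> non-negative vote weight.
    """
    nodes = [c for cands in candidates.values() for c in cands]
    owner = {c: name for name, cands in candidates.items() for c in cands}
    n = len(nodes)

    # A candidate only votes for candidates of *other* names on the page.
    weights = {
        a: {b: related(a, b) for b in nodes
            if owner[a] != owner[b] and related(a, b) > 0}
        for a in nodes
    }

    score = {c: 1.0 / n for c in nodes}
    for _ in range(iterations):
        new = {c: (1.0 - damping) / n for c in nodes}
        for a in nodes:
            total = sum(weights[a].values())
            if total == 0:
                # No related candidates: spread this score evenly (dangling node).
                for c in nodes:
                    new[c] += damping * score[a] / n
            else:
                for b, w in weights[a].items():
                    new[b] += damping * score[a] * w / total
        score = new

    # Keep the highest-scoring candidate for each name.
    return {name: max(cands, key=lambda c: score[c])
            for name, cands in candidates.items()}


# Toy example (made-up data): candidates are (city, country) pairs and
# two candidates support each other when they lie in the same country.
page_candidates = {
    "Paris": [("Paris", "France"), ("Paris", "USA")],
    "Lyon": [("Lyon", "France")],
    "Marseille": [("Marseille", "France")],
}
same_country = lambda a, b: 1.0 if a[1] == b[1] else 0.0
resolved = rank_candidates(page_candidates, same_country)
print(resolved["Paris"])
```

In this toy setting the French reading of "Paris" receives votes from Lyon and Marseille while the US reading receives none, so the French candidate ends up with the higher score. A real system would need a gazetteer and a more careful relatedness measure, neither of which is described in this record.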
Year
2011
DOI
10.1007/978-3-642-28635-3_7
Venue
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Keywords
focused location, web page, current algorithm, geo ambiguity, different location, locations determination, new algorithm, appropriate location, non-geo ambiguity, person name
Field
Data mining, PageRank, Heuristic, Search engine, Web page, Information retrieval, Computer science, Ambiguity
DocType
Conference
Volume
7142 LNCS
Issue
(none)
ISSN
1611-3349
Citations
7
PageRank
0.53
References
15
Authors
4
Name, Order, Citations, PageRank
Qingqing Zhang, 1, 102, 14.76
Peiquan Jin, 2, 338, 54.93
Sheng Lin, 3, 9, 1.92
Lihua Yue, 4, 340, 46.44