Abstract |
---|
Geolocalized databases are becoming necessary in a wide variety of application domains. Thus far, the creation of such databases has been a costly, manual process. This drawback has stimulated interest in automating their construction, for example, by mining geographical information from the Web. Here we present and evaluate a new automated technique for creating and enriching a geographical gazetteer, called Gazetiki. Our technique merges disparate information from Wikipedia, Panoramio, and web search engines in order to identify geographical names, categorize these names, find their geographical coordinates and rank them. The information produced in Gazetiki enhances and complements the Geonames database, using a similar domain model. We show that our method provides a richer structure and an improved coverage compared to another known attempt at automatically building a geographic database and, where possible, we compare our Gazetiki to Geonames. |
Year | DOI | Venue |
---|---|---|
2008 | 10.1145/1378889.1378906 | JCDL |
Keywords | Field | DocType |
application domain, geographical gazetteer, improved coverage, geolocalized databases, geographic database, geographical name, automatic creation, geonames database, geographical information, new automated technique, disparate information, domain model, data mining, wikipedia, web search engine, information extraction | Drawback, Automated technique, World Wide Web, Search engine, Information retrieval, Computer science, Geographic coordinate system, Geographic database, Information extraction, Domain model | Conference |
Citations | PageRank | References |
36 | 2.12 | 9 |
Authors |
---|
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Adrian Popescu | 1 | 263 | 20.15 |
Gregory Grefenstette | 2 | 1129 | 147.00 |
Pierre Alain Moëllic | 3 | 61 | 6.24 |