Automatic Meaning Discovery Using Google - Citegraph

Paper Info

Title
Automatic Meaning Discovery Using Google

Abstract
We present a new theory of relative semantics between objects, based on information distance and Kolmogorov complexity. This theory is then applied to construct a method to automatically extract the meaning of words and phrases from the world-wide-web using Google page counts. The approach is novel in its unrestricted problem domain, simplicity of implementation, and manifestly ontological underpinnings. The world-wide-web is the largest database on earth, and the latent semantic context information entered by millions of independent users averages out to provide automatic meaning of useful quality. We give examples to distinguish between colors and numbers, cluster names of paintings by 17th century Dutch masters and names of books by English novelists, the ability to understand emergencies, and primes, and we demonstrate the ability to do a simple automatic English-Spanish translation. Finally, we use the WordNet database as an objective baseline against which to judge the performance of our method. We conduct a massive randomized trial in binary classification using support vector machines to learn categories based on our Google distance, resulting in a mean agreement of 87% with the expert crafted WordNet categories.

Year	Venue	Keywords
2006	Kolmogorov Complexity and Applications	support vector machine, binary classification, world wide web, natural language, randomized trial
Field	DocType	Citations
Normalized Google distance, Ontology, Problem domain, Binary classification, Information retrieval, Kolmogorov complexity, Computer science, Information distance, WordNet, Semantics	Conference	16
PageRank	References	Authors
1.47	13	2

Authors (2 rows)

Name	Order	Citations	PageRank
Rudi Cilibrasi	1	128	13.21
Paul Vitányi	2	2130	287.76
