Abstract | ||
---|---|---|
Acronyms are abbreviations formed from the initial components of words or phrases. Acronym usage is becoming more common in web searches, email, text messages, tweets, blogs and posts. Acronyms are typically ambiguous and often disambiguated by context words. Given either just an acronym as a query or an acronym with a few context words, it is immensely useful for a search engine to know the most likely intended meanings, ranked by their likelihood. To support such online scenarios, we study the offline mining of acronyms and their meanings in this paper. For each acronym, our goal is to discover all distinct meanings and, for each meaning, compute the expanded string, its popularity score and a set of context words that indicate this meaning. Existing approaches are inadequate for this purpose. Our main insight is to leverage "co-clicks" in search engine query click logs to mine expansions of acronyms. There are several technical challenges, such as ensuring a 1:1 mapping between expansions and meanings, handling "tail meanings" and extracting context words. We present a novel, end-to-end solution that addresses the above challenges. We further describe how web search engines can leverage the mined information to predict the intended meaning of queries containing acronyms. Our experiments show that our approach (i) discovers the meanings of acronyms with high precision and recall, (ii) significantly complements existing meanings in Wikipedia and (iii) accurately predicts the intended meaning for online queries with over 90% precision. |
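
The "co-click" idea in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: the log layout `(query, url, clicks)`, the function names, the initial-letter filter, and the `min`-based overlap score are all simplifying assumptions made here for illustration. It treats a query as a candidate expansion when it shares a clicked URL with the acronym query and its word initials spell the acronym, then normalizes the overlap into a rough popularity score.

```python
from collections import defaultdict


def initials(phrase):
    """Return the lowercased first letter of each word in the phrase."""
    return "".join(word[0] for word in phrase.lower().split())


def mine_expansions(click_log, acronym):
    """Rank candidate expansions of `acronym` by co-click overlap.

    click_log: iterable of (query, url, clicks) tuples (assumed layout).
    Returns a list of (expansion, popularity) sorted by popularity.
    """
    acronym = acronym.lower()

    # Group click counts by URL, then by query.
    clicks_by_url = defaultdict(dict)
    for query, url, clicks in click_log:
        q = query.lower()
        clicks_by_url[url][q] = clicks_by_url[url].get(q, 0) + clicks

    # A candidate must share a clicked URL with the acronym query and
    # have word initials that spell the acronym (a crude filter).
    scores = defaultdict(int)
    for per_query in clicks_by_url.values():
        acronym_clicks = per_query.get(acronym, 0)
        if acronym_clicks == 0:
            continue
        for q, c in per_query.items():
            if q != acronym and initials(q) == acronym:
                scores[q] += min(acronym_clicks, c)

    total = sum(scores.values())
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(q, s / total) for q, s in ranked]


# Toy click log with made-up counts, for illustration only.
toy_log = [
    ("cmu", "cmu.edu", 80),
    ("carnegie mellon university", "cmu.edu", 60),
    ("cmu", "cmich.edu", 20),
    ("central michigan university", "cmich.edu", 50),
    ("pittsburgh colleges", "cmu.edu", 10),
]

for expansion, popularity in mine_expansions(toy_log, "CMU"):
    print(f"{expansion}: {popularity:.2f}")
```

The sketch ignores the challenges the abstract names, such as merging several expansion strings that denote one meaning, recovering tail meanings, and extracting context words.
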
Year | DOI | Venue |
---|---|---|
2013 | 10.1145/2488388.2488498 | WWW |
Keywords | Field | DocType |
intended meaning,tail meaning,likely intended meaning,context word,mining acronym expansion,search engine query click,high precision,search engine,web search engine,web search,query click log,distinct meaning,acronym | Acronym,Data mining,World Wide Web,Search engine,Ranking,Information retrieval,Computer science,Popularity,Precision and recall | Conference |
ISBN | Citations | PageRank |
978-1-4503-2035-1 | 4 | 0.42 |
References | Authors | |
14 | 4 | |

Name | Order | Citations | PageRank |
---|---|---|---|
Bilyana Taneva | 1 | 410 | 14.37 |
Tao Cheng | 2 | 238 | 11.50 |
Kaushik Chakrabarti | 3 | 2432 | 299.04 |
Yeye He | 4 | 319 | 20.19 |