Title | ||
---|---|---|
All About Microtext: A Working Definition and a Survey of Current Microtext Research within Artificial Intelligence and Natural Language Processing |
Abstract | ||
---|---|---|
This paper defines a new term, 'microtext', and surveys the most recent and promising research that falls under this new definition. Microtext has three distinct attributes that differentiate it from the traditional free text or unstructured text considered within the AI and NLP communities: it is generally very short in length, semi-structured, and characterized by amorphous or informal grammar and language. Examples of microtext include chatrooms (such as IM, XMPP, and IRC), SMS, voice transcriptions, and micro-blogging such as Twitter™. This paper expands on this definition and provides some characterizations of typical microtext data. Microtext is becoming more prevalent. The thesis of this paper is that the three distinct attributes of microtext yield different results from, and require different techniques than, traditional AI and NLP approaches to long-form free text. By creating a working definition for microtext and providing a survey of the current state of research in the area, this paper aims to create an understanding of microtext within the AI and NLP communities. |
Year | Venue | Keywords |
---|---|---|
2011 | ICAART 2011: Proceedings of the 3rd International Conference on Agents and Artificial Intelligence, Vol. 1 | Microtext, Natural language processing, Text classification, Semi-structured data, Information extraction, Sentiment analysis, Topic summarization |
Field | DocType | Citations |
---|---|---|
Computer science, Natural language processing, Artificial intelligence | Conference | 9 |
PageRank | References | Authors |
---|---|---|
0.60 | 0 | 1 |
Name | Order | Citations | PageRank |
---|---|---|---|
Jeffrey Ellen | 1 | 41 | 4.89 |