Title
All About Microtext: A Working Definition and a Survey of Current Microtext Research Within Artificial Intelligence and Natural Language Processing
Abstract
This paper defines a new term, 'Microtext', and surveys the most recent and promising research that falls under this new definition. Microtext has three distinct attributes that differentiate it from the traditional free text or unstructured text considered within the AI and NLP communities: it is generally very short in length, semi-structured, and characterized by amorphous or informal grammar and language. Examples of microtext include chatrooms (such as IM, XMPP, and IRC), SMS, voice transcriptions, and micro-blogging such as Twitter(tm). This paper expands on this definition and provides some characterizations of typical microtext data. Microtext is becoming more prevalent. It is the thesis of this paper that the three distinct attributes of microtext yield different results and require different techniques than those traditionally applied in AI and NLP to long-form free text. By creating a working definition for microtext and providing a survey of the current state of research in the area, this paper aims to create an understanding of microtext within the AI and NLP communities.
Year: 2011
Venue: ICAART 2011: Proceedings of the 3rd International Conference on Agents and Artificial Intelligence, Vol. 1
Keywords: Microtext, Natural language processing, Text classification, Semi-structured data, Information extraction, Sentiment analysis, Topic summarization
Field: Computer science, Natural language processing, Artificial intelligence
DocType: Conference
Citations: 9
PageRank: 0.60
References: 0
Authors: 1
Name: Jeffrey Ellen
Order: 1
Citations: 41
PageRank: 4.89