Title
ResToRinG CaPitaLiZaTion in #TweeTs
Abstract
The rapid proliferation of microblogs such as Twitter has resulted in a vast quantity of written text becoming available that contains interesting information for NLP tasks. However, the noise level in tweets is so high that standard NLP tools perform poorly. In this paper, we present a statistical truecaser for tweets using a 3-gram language model built with truecased newswire texts and tweets. Our truecasing method shows an improvement in named entity recognition and part-of-speech tagging tasks.
Year
2015
DOI
10.1145/2740908.2743039
Venue
WWW (Companion Volume)
Field
Capitalization, Data mining, World Wide Web, Social media, Truecasing, Computer science, Noise level, Microblogging, Natural language processing, Artificial intelligence, Named-entity recognition, Language model
DocType
Conference
Citations
1
PageRank
0.36
References
16
Authors
3
Name, Order, Citations, PageRank
Kamel Nebhi, 1, 1, 0.36
Kalina Bontcheva, 2, 2538, 211.33
Genevieve Gorrell, 3, 266, 22.00