Title
The Readability of Tweets and their Geographic Correlation with Education.
Abstract
Twitter has rapidly emerged as one of the largest worldwide venues for written communication. Thanks to the ease with which vast quantities of tweets can be mined, Twitter has also become a source for studying modern linguistic style. The readability of text has long provided a simple method to characterize the complexity of language and the ease with which documents may be understood by readers. In this note we apply a modified version of the Flesch Reading Ease formula to a corpus of 17.4 million tweets. We find that tweets have characteristically more difficult readability scores than other short-format communication, such as SMS or chat. This linguistic difference is insensitive to the presence of "hashtags" within tweets. By utilizing geographic data provided by 2% of users, joined with "ZIP Code Tabulation Area" (ZCTA) level education data from the U.S. Census, we find an intriguing correlation between the average readability and the college graduation rate within a ZCTA. This points towards a difference in either the underlying language or the type of content being tweeted in these areas.
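For orientation, the standard (unmodified) Flesch Reading Ease score is 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words), with higher scores indicating easier text. The abstract does not specify the authors' modification, so the following is only a minimal sketch of the standard formula. The syllable counter, the tokenization rules, and the function names `count_syllables` and `flesch_reading_ease` are illustrative assumptions, not the method used in the paper.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): count runs of vowels, minimum one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Standard Flesch Reading Ease; NOT the paper's modified variant."""
    # Sentences are split on runs of terminal punctuation (a simplification).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words are runs of letters, optionally with one internal apostrophe.
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


if __name__ == "__main__":
    print(flesch_reading_ease("The cat sat on the mat."))
```

A vowel-run syllable counter is crude (it miscounts silent "e", for example), so the scores from this sketch are approximate and are best used to compare texts with each other.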
Year: 2014
Venue: CoRR
Field: Data mining, World Wide Web, Computer science, Readability, Correlation
DocType: Journal
Volume: abs/1401.6058
Citations: 5
PageRank: 0.67
References: 4
Authors: 2
Name                    Order  Citations  PageRank
James R. A. Davenport   1      5          0.67
Robert DeLine           2      2957       210.35