Title
Word usage and posting behaviors: modeling blogs with unobtrusive data collection methods
Abstract
We present a large-scale analysis of the content of weblogs dating back to the release of the Blogger program in 1999. Over one million blogs were analyzed from their conception through June 2006. These data were submitted to the Text Analysis: Word Counts program [12], which conducted a word-count analysis using Linguistic Inquiry and Word Count (LIWC) dictionaries [20] to provide and analyze a representative sample of blogger word usage. Covariation among LIWC dictionaries suggests that blogs vary along five psychologically relevant linguistic dimensions: Melancholy, Socialness, Ranting, Metaphysicality, and Work-Relatedness. These variables and others were subjected to a cluster analysis in an attempt to extract natural usage groups that could inform the design of blogging systems; the results were mixed.
Year
2008
DOI
10.1145/1357054.1357230
Venue
CHI
Keywords
liwc dictionary,large-scale analysis,word-count analysis,word counts program,blogger program,unobtrusive data collection method,natural usage group,million blogs,blogger word usage,word counts,cluster analysis,personas,information design,user modeling,text analysis,user model,behavior modeling,pca,data collection
Field
Word usage,Data collection,World Wide Web,Computer science,Persona,User modeling,Artificial intelligence,Natural language processing
DocType
Conference
Citations
12
PageRank
1.37
References
13
Authors
2
Name, Order, Citations, PageRank
Adam D. I. Kramer, 1, 204, 13.91
Kerry Rodden, 2, 694, 72.11