Title
Probabilistic quality assessment based on article's revision history
Abstract
The collaborative efforts of users in social media services such as Wikipedia have led to an explosion in user-generated content, and automatically tagging the quality of that content is now a pressing concern. Each article typically undergoes a series of revision phases, and articles of different quality classes exhibit distinct revision cycle patterns. We propose to Assess Quality based on Revision History (AQRH) for a specific domain, as follows. First, we use a Hidden Markov Model (HMM) to turn each article's revision history into a revision state sequence. Then, for each quality class, the revision cycle patterns are extracted and clustered into quality corpora. Finally, an article's quality is gauged by comparing its state sequence with the patterns of pre-classified documents in a probabilistic sense. We conduct experiments on a set of Wikipedia articles, and the results demonstrate that our method can accurately and objectively capture a web article's quality.
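The final step, comparing an article's state sequence with pre-classified documents in a probabilistic sense, can be illustrated with a much simpler stand-in than the method the abstract describes. The sketch below is not AQRH: it skips the HMM decoding and the clustering of revision cycle patterns, and instead fits one smoothed first-order Markov chain per quality class, then assigns a new sequence to the class under which it is most likely. The state names (`build`, `polish`, `dispute`), the class labels (`high`, `low`), the training sequences, and all function names are hypothetical, chosen only for illustration.

```python
import math

def fit_chain(sequences, states, alpha=1.0):
    """Estimate first-order transition probabilities with additive smoothing.

    `alpha` is an assumed smoothing constant so unseen transitions
    never get probability zero.
    """
    counts = {s: {t: alpha for t in states} for s in states}
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    probs = {}
    for s in states:
        total = sum(counts[s].values())
        probs[s] = {t: counts[s][t] / total for t in states}
    return probs

def log_likelihood(seq, probs):
    """Log-probability of the transitions in `seq` under one class model."""
    return sum(math.log(probs[a][b]) for a, b in zip(seq, seq[1:]))

def assess(seq, models):
    """Return the quality class whose chain makes `seq` most likely."""
    return max(models, key=lambda c: log_likelihood(seq, models[c]))

# Hypothetical revision states and pre-classified state sequences.
STATES = ["build", "polish", "dispute"]
TRAINING = {
    "high": [["build", "build", "polish", "polish"],
             ["build", "polish", "polish", "polish"]],
    "low": [["build", "dispute", "build", "dispute"],
            ["dispute", "dispute", "build", "dispute"]],
}
MODELS = {label: fit_chain(seqs, STATES) for label, seqs in TRAINING.items()}

if __name__ == "__main__":
    print(assess(["build", "polish", "polish"], MODELS))
    print(assess(["dispute", "build", "dispute"], MODELS))
```

Working in log space keeps long revision histories from underflowing, and the additive smoothing means a single unseen transition lowers a class's score without ruling the class out.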
Year: 2011
DOI: 10.1007/978-3-642-23091-2_50
Venue: DEXA (2)
Keywords: revision state sequence, revision history, quality class, revision phase, revision cycle pattern, different quality class, quality corpus, specific revision cycle pattern, web article, probabilistic quality assessment, wikipedia article
Field: User-generated content, Data mining, Social media, Information retrieval, State sequence, Computer science, Probabilistic logic, Latent semantic analysis, Hidden Markov model, Database
DocType: Conference
Citations: 3
PageRank: 0.39
References: 10
Authors: 3
Name, Order, Citations, PageRank
Jingyu Han, 1, 16, 4.67
Chuandong Wang, 2, 5, 0.78
Dawei Jiang, 3, 380, 21.67