Title
Automatic construction of polarity-tagged corpus from HTML documents
Abstract
This paper proposes a novel method of building a polarity-tagged corpus from HTML documents. The method is fully automatic and can be applied to arbitrary HTML documents. The idea behind it is to exploit certain layout structures and linguistic patterns, which allow sentences that express opinions to be extracted automatically. In our experiment, the method constructed a corpus of 126,610 sentences.
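The abstract does not spell out which layout structures or linguistic patterns are used, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that opinion sentences appear as list items or table cells under a heading that acts as a polarity cue. The cue words in `CUES`, the tag sets, and the names `PolarityExtractor` and `extract_polar_sentences` are all hypothetical and chosen purely for illustration.

```python
from html.parser import HTMLParser

# Hypothetical cue words (assumption): the abstract does not list the real cues.
CUES = {
    "pros": "positive",
    "good points": "positive",
    "cons": "negative",
    "bad points": "negative",
}

# Hypothetical layout structures (assumption): headings that may carry a cue,
# and item-like elements that may carry an opinion sentence.
HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6", "dt", "th"}
ITEM_TAGS = {"li", "dd", "td"}


class PolarityExtractor(HTMLParser):
    """Collect (polarity, sentence) pairs from items that follow a cue heading."""

    def __init__(self):
        super().__init__()
        self.polarity = None  # polarity implied by the most recent heading
        self.open_tag = None  # heading or item tag currently being read
        self.buffer = []      # text fragments of the open tag
        self.pairs = []       # extracted (polarity, sentence) pairs

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS or tag in ITEM_TAGS:
            self.open_tag = tag
            self.buffer = []

    def handle_data(self, data):
        if self.open_tag is not None:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self.open_tag:
            return
        # Normalize whitespace in the collected text.
        text = " ".join("".join(self.buffer).split())
        if tag in HEADING_TAGS:
            # A cue heading sets the polarity; any other heading clears it.
            self.polarity = CUES.get(text.lower().rstrip(":"))
        elif text and self.polarity is not None:
            self.pairs.append((self.polarity, text))
        self.open_tag = None


def extract_polar_sentences(html):
    """Return a list of (polarity, sentence) pairs found in an HTML string."""
    parser = PolarityExtractor()
    parser.feed(html)
    parser.close()
    return parser.pairs


if __name__ == "__main__":
    sample = (
        "<h3>Pros</h3><ul><li>The screen is bright.</li></ul>"
        "<h3>Cons</h3><ul><li>The battery drains fast.</li></ul>"
    )
    for polarity, sentence in extract_polar_sentences(sample):
        print(polarity, sentence)
```

Items under a heading that is not a cue are skipped, which keeps the sketch conservative: it only labels text whose polarity is implied by the surrounding layout.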
Year
2006
Venue
ACL
Keywords
novel method, automatic construction, express opinion, html document, arbitrary html document, polarity-tagged corpus, linguistic pattern, certain layout structure
Field
Information retrieval, Computer science, Artificial intelligence, Natural language processing
DocType
Conference
Volume
P06-2
Citations
26
PageRank
3.28
References
15
Authors
2
Name                  Order  Citations  PageRank
Nobuhiro Kaji         1      257        21.71
Masaru Kitsuregawa    2      3188       831.46