Title
A Semantic Triplet Based Story Classifier
Abstract
A story is defined as “an actor(s) taking action(s) that culminates in a resolution(s).” In this paper, we investigate the utility of standard keyword-based features, statistical features based on shallow parsing (such as the density of POS tags and named entities), and a new set of semantic features for developing a story classifier. The classifier is trained to label a paragraph as a “story” if the paragraph consists mostly of story content. The training data is a collection of expert-coded story and non-story paragraphs drawn from the RSS feeds of a list of extremist web sites. Our proposed semantic features are based on suitable aggregation and generalization of <Subject, Verb, Object> triplets that can be extracted using a parser. Experimental results show that a model combining statistical features with memory-based semantic linguistic features achieves the best accuracy with a Support Vector Machine (SVM) classifier.
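To make the "density of POS tags" feature mentioned in the abstract concrete, here is a minimal sketch. It assumes that density means each tag's share of the tokens in a paragraph and that tagging has already been done by some external shallow parser. The function name `pos_tag_density` and the sample paragraph are hypothetical illustrations, not the authors' implementation, and the abstract does not give the exact feature definitions.

```python
from collections import Counter


def pos_tag_density(tagged_tokens):
    """Return each POS tag's share of the tokens in one paragraph.

    `tagged_tokens` is a list of (token, tag) pairs produced elsewhere;
    this sketch does not perform tagging itself.
    """
    if not tagged_tokens:
        return {}
    counts = Counter(tag for _, tag in tagged_tokens)
    total = len(tagged_tokens)
    return {tag: n / total for tag, n in counts.items()}


# Hypothetical, hand-tagged fragment (Penn Treebank style tags assumed).
paragraph = [
    ("The", "DT"), ("villagers", "NNS"), ("rebuilt", "VBD"),
    ("the", "DT"), ("bridge", "NN"),
]
features = pos_tag_density(paragraph)
```

A dictionary like `features` could then be turned into a fixed-length vector (one slot per tag) and passed to a classifier such as an SVM, alongside whatever other features are in use.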
Year
2012
DOI
10.1109/ASONAM.2012.97
Venue
Advances in Social Networks Analysis and Mining
Keywords
support vector machine, semantic triplet, pos tag, proposed semantic feature, best accuracy, expert-coded story, semantic feature, non-story paragraph, memory-based semantic linguistic feature, statistical feature, story classifier, feature extraction, accuracy, linguistics, organizations, support vector machines, semantics, statistical analysis, literature, artificial intelligence, grammars
Field
Rule-based machine translation, Data mining, Computer science, Paragraph, Artificial intelligence, Natural language processing, Classifier (linguistics), Support vector machine, Feature extraction, Parsing, RSS, Machine learning, Semantics
DocType
Conference
ISBN
978-1-4673-2497-7
Citations
5
PageRank
0.45
References
21
Authors
5
Name | Order | Citations | PageRank
Betul Ceran | 1 | 42 | 2.65
Ravi Karad | 2 | 5 | 0.45
Ajay Mandvekar | 3 | 5 | 0.45
Steven R. Corman | 4 | 97 | 8.72
Hasan Davulcu | 5 | 584 | 86.85