Title
Employing topic models for pattern-based semantic class discovery
Abstract
A semantic class is a collection of items (words or phrases) that stand in a semantic peer or sibling relationship. This paper studies the use of topic models to automatically construct semantic classes, taking as the source data a collection of raw semantic classes (RASCs) extracted by applying predefined patterns to web pages. The primary requirement (and challenge) here is dealing with multi-membership: an item may belong to multiple semantic classes, and we need to discover as many as possible of the different semantic classes the item belongs to. To adopt topic models, we treat RASCs as "documents", items as "words", and the final semantic classes as "topics". Appropriate preprocessing and postprocessing are performed to improve result quality, to reduce computation cost, and to tackle the fixed-k constraint of a typical topic model. Experiments conducted on 40 million web pages show that our approach could yield better results than alternative approaches.
Year
2009
Venue
ACL/IJCNLP
Keywords
pattern-based semantic class discovery, different semantic class, employing topic model, web page, million web page, multiple semantic class, final semantic class, results quality, topic model, semantic class, typical topic model, raw semantic class, information retrieval, web pages
Field
Semantic similarity, Web page, Semantic Web Stack, Information retrieval, Computer science, Artificial intelligence, Semantic grid, Natural language processing, Topic model, Social Semantic Web, Semantic computing, Semantic role labeling
DocType
Conference
Volume
P09-1
Citations
16
PageRank
0.69
References
21
Authors
4
Name | Order | Citations | PageRank
Huibin Zhang | 1 | 55 | 3.20
Mingjie Zhu | 2 | 89 | 4.32
Shuming Shi | 3 | 620 | 58.27
Ji-Rong Wen | 4 | 4431 | 265.98