Title
Utilizing the World Wide Web as an Encyclopedia: Extracting Term Descriptions from Semi-Structured Texts
Abstract
In this paper, we propose a method to extract descriptions of technical terms from Web pages in order to utilize the World Wide Web as an encyclopedia. We use linguistic patterns and HTML text structures to extract text fragments containing term descriptions. We also use a language model to discard extraneous descriptions, and a clustering method to summarize resultant descriptions. We show the effectiveness of our method by way of experiments.
Year
2000
DOI
10.3115/1075218.1075280
Venue
Meeting of the Association for Computational Linguistics
Keywords
technical term, resultant description, semi-structured text, term description, language model, html text structure, world wide web, extraneous description, clustering method, web page, linguistic pattern, web pages
DocType
Conference
Volume
cs.CL/0011001
ISSN
Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics (ACL-2000), pp.488-495, Oct. 2000
Citations
21
PageRank
2.40
References
8
Authors
2
Name              Order  Citations  PageRank
Atsushi Fujii     1      486        59.25
Tetsuya Ishikawa  2      226        30.46