Paper Info

Title
Research on Automatic Chinese Multi-word Term Extraction Based on Integration of Web Information and Term Component

Abstract
This paper presents an automatic method for extracting Chinese multi-word terms that integrates Web information with term components. Candidate terms are extracted by identifying delimiters, and invalid candidates are filtered out by checking the context terms in the result pages that Google returns when the candidate term is submitted as a search query. Term components are taken into account to estimate termhood. Inspired by the economical law of term generating, we propose two measures of how likely a candidate is to be a true term: the first is based on the domain speciality of the term, and the second on the similarity between the candidate and a template that contains structured information about terms. Experiments on the IT and medicine domains show that the method is effective and portable across domains.
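
The delimiter-based candidate extraction step mentioned in the abstract might look roughly like the sketch below. It is only an illustration under stated assumptions, not the authors' implementation: the delimiter list, the function name `extract_candidates`, and the `min_len` threshold are all hypothetical, since the abstract does not give the actual delimiter inventory or any length criterion. The Web-based filtering and termhood measures are not covered here.

```python
import re

# Hypothetical delimiter list: punctuation plus a few common function words.
# The paper's actual delimiter inventory is not given in the abstract.
DELIMITERS = ["，", "。", "、", "的", "是", "和", "在", "了"]


def extract_candidates(text, delimiters=DELIMITERS, min_len=2):
    """Split text at delimiters and keep segments of at least min_len characters.

    Character length is used as a crude stand-in for "multi-word";
    this is an assumption made for the sketch.
    """
    # Try longer delimiters first so they are not shadowed by shorter ones.
    ordered = sorted(delimiters, key=len, reverse=True)
    pattern = "|".join(re.escape(d) for d in ordered)
    segments = re.split(pattern, text)
    return [s.strip() for s in segments if len(s.strip()) >= min_len]


if __name__ == "__main__":
    # Example sentence: "The operating system is the core software of a computer."
    for candidate in extract_candidates("操作系统是计算机的核心软件。"):
        print(candidate)
```

Each surviving segment would then be a candidate term to pass on to the filtering and scoring stages described in the abstract.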

Year	DOI	Venue
2009	10.1109/WI-IAT.2009.279	Web Intelligence/IAT Workshops
Keywords	Field	DocType
it domain,term component,candidate term,web,term generating,automatic terminology extraction,chinese terminology,medicine domain show,web information,true term,invalid term,different domain,context term,termhood,automatic chinese multi-word term,terminology,statistics,computational intelligence,computational linguistics,educational technology,data mining,control systems,intelligent agent	Data mining,Intelligent agent,Information retrieval,Computational intelligence,Terminology,Computer science,Computational linguistics,Control system,Index term,Delimiter,Compound term processing	Conference
Volume	ISBN	Citations
3	978-1-4244-5331-3	1
PageRank	References	Authors
0.43	10	3

Authors (3 rows)

Name	Order	Citations	PageRank
Wei Kang	1	360	88.51
Zhifang Sui	2	172	39.06
Yao Liu	3	2	6.19
