Title
An IPC-based vector space model for patent retrieval
Abstract
Determining requirements when searching for and retrieving relevant information suited to a user's needs has become increasingly important and difficult, partly due to the explosive growth of electronic documents. The vector space model (VSM) is a popular retrieval method. However, a weakness of the traditional VSM is that the indexing vocabulary changes whenever the document set, the indexing vocabulary selection algorithm, or the algorithm's parameters change, or when wording evolves. The major objective of this research is to design a method that solves these problems for patent retrieval. The proposed method uses a special characteristic of patent documents, their International Patent Classification (IPC) codes, to generate the indexing vocabulary for representing all patent documents. The advantage of the generated indexing vocabulary is that it remains unchanged even if the document set, selection algorithm, or parameters change, or if wording evolves. The proposed method is compared with two traditional methods (entropy and chi-square) in manual and automatic evaluations to verify its feasibility and validity. The results indicate that the IPC-based indexing vocabulary selection method achieves higher accuracy and is more satisfactory.
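The idea of a fixed indexing vocabulary can be illustrated with a minimal sketch. The abstract does not say how terms are derived from the IPC codes, so the vocabulary below is a made-up placeholder, and the term-frequency weighting, cosine similarity, and the names FIXED_VOCAB, to_vector, cosine, and rank are assumptions for illustration rather than the authors' method. The point it shows is only that the vector dimensions stay the same no matter which documents or queries arrive.

```python
import math
from collections import Counter

# Hypothetical fixed vocabulary. In the paper it is generated from IPC codes;
# these terms are invented for illustration only.
FIXED_VOCAB = ["engine", "piston", "valve", "antenna", "signal", "circuit"]
INDEX = {term: i for i, term in enumerate(FIXED_VOCAB)}


def to_vector(text):
    """Term-frequency vector over the fixed vocabulary; other words are ignored."""
    counts = Counter(w for w in text.lower().split() if w in INDEX)
    return [counts[term] for term in FIXED_VOCAB]


def cosine(u, v):
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank(query, docs):
    """Return (score, doc_index) pairs, best match first."""
    q = to_vector(query)
    scored = [(cosine(q, to_vector(d)), i) for i, d in enumerate(docs)]
    return sorted(scored, key=lambda s: (-s[0], s[1]))


docs = [
    "piston engine with a valve",
    "antenna for signal reception",
    "circuit driving an antenna",
]
ranking = rank("engine valve", docs)
```

Because every document is projected onto the same predefined terms, adding a document with unseen words changes neither the vector length nor the vectors of existing documents, which is the stability property the abstract describes.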
Year: 2011
DOI: 10.1016/j.ipm.2010.06.001
Venue: Inf. Process. Manage.
Keywords: traditional method, IPC-based vector space model, wording evolution, patent retrieval, vector space model (VSM), patent mining, indexing vocabulary, patent document, document set, IPC-based indexing vocabulary selection, indexing vocabulary change, indexing vocabulary selection algorithm, popular method, indexation, vector space model
Field: Data mining, International Patent Classification, Information retrieval, Computer science, Patent retrieval, Search engine indexing, Vector space model, Vocabulary, Patent mining
DocType: Journal
Information Processing and Management
Volume: 47
Issue: 3
ISSN:
Citations: 23
PageRank: 0.85
References: 23
Authors: 2
Name             Order  Citations  PageRank
Yen-Liang Chen   1      1361       73.85
Yu-Ting Chiu     2      55         8.52