Title
Simple pre and post processing strategies for patent searching in CLEF intellectual property track 2009
Abstract
The objective of the 2009 CLEF-IP Track was to find documents that constitute prior art for a given patent. We explored a wide range of simple pre-processing and post-processing strategies, using Mean Average Precision (MAP) for evaluation purposes. Once the best document representation had been determined, we tuned a classical Information Retrieval engine to perform the retrieval step. Finally, we explored two different post-processing strategies. The first was to filter results by IPC code: in our experiments, using the complete IPC codes led to greater improvements than using 4-digit IPC codes. The second was to exploit the citations of retrieved patents in order to boost the scores of cited patents. Combining all selected strategies, we computed optimal runs that reached a MAP of 0.122 for the training set, and a MAP of 0.129 for the official 2009 CLEF-IP XL set.
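Since the abstract reports all results as Mean Average Precision, a minimal sketch of how that measure is commonly computed may help in reading the 0.122 and 0.129 figures. This is the standard textbook definition, not code from the paper; the function names, the `runs` and `qrels` dictionaries, and the toy patent identifiers are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list against its set of relevant documents.

    Assumes `ranked` contains no duplicate document identifiers.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at this rank, counted only where a relevant document appears.
            precision_sum += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Dict[str, List[str]], qrels: Dict[str, Set[str]]
) -> float:
    """Mean of the per-topic average precision over every topic in `qrels`."""
    if not qrels:
        return 0.0
    total = sum(
        average_precision(runs.get(topic, []), relevant)
        for topic, relevant in qrels.items()
    )
    return total / len(qrels)


# Toy data (hypothetical): two topic patents with their ranked candidates.
runs = {"topic1": ["p1", "p2", "p3"], "topic2": ["p7", "p8"]}
qrels = {"topic1": {"p1", "p3"}, "topic2": {"p8"}}
score = mean_average_precision(runs, qrels)
```

In this sketch a topic with no retrieved documents scores zero rather than being skipped, so the mean is always taken over every topic that has relevance judgments.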
Year: 2009
DOI: 10.1007/978-3-642-15754-7_53
Venue: CLEF (1)
Keywords: different post-processing strategy,4-digits ipc code,post-processing strategy,classical information retrieval engine,clef-ip xl set,clef intellectual property track,training set,simple pre,complete ipc code,clef-ip track,best document representation,mean average precision,post processing strategy,information retrieval,intellectual property
Field: European patent office,Training set,Data mining,Information retrieval,Computer science,Filter (signal processing),Exploit,Document representation,Preprocessor,Intellectual property,Clef
DocType: Conference
Volume: 6241
ISSN: 0302-9743
ISBN: 3-642-15753-X
Citations: 8
PageRank: 0.64
References: 5
Authors: 4
Name, Order, Citations, PageRank
Julien Gobeill, 1, 302, 30.42
Emilie Pasche, 2, 99, 15.93
Douglas Teodoro, 3, 68, 10.46
P Ruch, 4, 650, 38.72