Title | ||
---|---|---|
Identifying long tail term from large-scale candidate pairs for big data-oriented patent analysis. |
Abstract | ||
---|---|---|
Patents are an important and valuable type of scientific and technical big data. This paper presents how to mine massive patent texts to obtain valuable information and knowledge from the large-scale candidate sets they yield. First, we propose a patent term extraction method that uses co-occurrence in the abstract and first-claim sections of patent records. It has three steps: (1) we extract candidate strings according to our definition of a term; (2) we propose an assumption to verify whether a candidate string is a qualified term by using the co-occurrence of terms in the abstract and first claim; and (3) we use term frequency-inverse document frequency (TF-IDF) or mutual information to rank and select candidate terms. Second, we propose a new method to obtain valuable long tail terms from patents: (1) we build long tail term-common term pairs as the candidate set; (2) we evaluate the value of each candidate pair; and (3) we demonstrate the method with an example from our results. This study provides a new perspective on extracting terms from the free text of patent records and also proposes a new method to obtain valuable long tail terms to aid information analysis with massive patent texts. Copyright © 2016 John Wiley & Sons, Ltd. |
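
The three extraction steps in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give their term definition, verification assumption, or scoring details. The sketch assumes candidate strings are already extracted per section, treats "co-occurrence" as simple presence in both the abstract and the first claim, and uses a plain summed TF-IDF score. The function names `cooccurring_candidates` and `rank_by_tfidf` are hypothetical.

```python
import math
from collections import Counter


def cooccurring_candidates(abstract_terms, claim_terms):
    """Assumed filter: keep candidates found in both abstract and first claim."""
    return set(abstract_terms) & set(claim_terms)


def rank_by_tfidf(records):
    """Rank candidate terms by summed TF-IDF over all patent records.

    records: list of (abstract_terms, claim_terms) pairs, one per patent,
    where each element is a list of already-extracted candidate strings.
    Returns (term, score) pairs, highest score first, ties broken by name.
    """
    n_docs = len(records)
    doc_freq = Counter()   # number of patents in which each kept term occurs
    term_freqs = []        # per-patent counts of kept terms

    for abstract_terms, claim_terms in records:
        kept = cooccurring_candidates(abstract_terms, claim_terms)
        tf = Counter(
            t for t in list(abstract_terms) + list(claim_terms) if t in kept
        )
        term_freqs.append(tf)
        doc_freq.update(tf.keys())

    scores = Counter()
    for tf in term_freqs:
        total = sum(tf.values())
        for term, count in tf.items():
            idf = math.log(n_docs / doc_freq[term])
            scores[term] += (count / total) * idf

    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

With this scoring, a term that passes the filter in every patent gets an inverse document frequency of zero and sinks to the bottom of the ranking, while terms concentrated in few patents rise. Mutual information, the alternative the abstract names, would replace only the scoring loop.
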
Year | DOI | Venue |
---|---|---|
2016 | 10.1002/cpe.3792 | Concurrency and Computation: Practice and Experience |
Keywords | Field | DocType |
term extraction,long tail term,patent analysis,scientific big data | Data mining,Information retrieval,Computer science,Mutual information,Patent analysis,Big data | Journal |
Volume | Issue | ISSN |
28 | 15 | 1532-0626 |
Citations | PageRank | References |
3 | 0.40 | 18 |
Authors | ||
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Peng Qu | 1 | 10 | 2.52 |
Junsheng Zhang | 2 | 203 | 25.16 |
Changqing Yao | 3 | 22 | 6.71 |
Wen Zeng | 4 | 9 | 2.85 |