Abstract |
---|
Measuring document relatedness with unsupervised, co-occurrence-based word relatedness methods is costly in both processing time and memory. This paper applies compact data structures to the efficient computation of word relatedness from corpus statistics. The data structure is used to efficiently look up (1) the corpus statistics for the Common Word Relatedness Approach and (2) the pairwise word relatedness for the Algorithm Specific Word Relatedness Approach. Both approaches significantly reduce the processing time of word relatedness methods and the space cost of storing co-occurrence statistics in memory, making text mining tasks such as classification and clustering based on word relatedness practical. |

Year | DOI | Venue |
---|---|---|
2015 | 10.1145/2682571.2797088 | DocEng |

Field | DocType | Citations |
---|---|---|
Data structure, Pairwise comparison, Text mining, Computer science, Co-occurrence, Natural language processing, Artificial intelligence, Cluster analysis, Computation | Conference | 1 |

PageRank | References | Authors |
---|---|---|
0.36 | 8 | 7 |

Name | Order | Citations | PageRank |
---|---|---|---|
Jie Mei | 1 | 1 | 3.06 |
Xinxin Kou | 2 | 1 | 0.36 |
Zhimin Yao | 3 | 1 | 0.36 |
Andrew Rau-Chaplin | 4 | 638 | 61.65 |
Aminul Islam | 5 | 328 | 31.16 |
Abidalrahman Moh'd | 6 | 38 | 8.92 |
Evangelos E. Milios | 7 | 290 | 41.22 |