Abstract | ||
---|---|---|
Detection of discriminant terms allows us to improve the performance of natural language processing systems. The goal is to estimate the contribution of each term in a given corpus and then to use the high-contribution terms to represent the corpus. In this paper we present various experiments that use elliptic curves to discover the discriminant terms of a given textual corpus. Different experiments led us to use the mean and variance of the corpus terms to determine the parameters of a reduced Weierstrass equation (elliptic curve). We use the elliptic curves to visualize graphically the behavior of the corpus vocabulary. We then use the elliptic curve parameters to cluster terms that share characteristics. These clusters are then used as discriminant terms to represent the original document collection. Finally, we evaluate all these corpus representations to determine the terms that best discriminate each document. |
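
The abstract does not say how the mean and variance map onto the curve parameters, so the following is only a minimal sketch under stated assumptions: each term gets a reduced Weierstrass curve y^2 = x^3 + ax + b with a taken as the mean and b as the population variance of its per-document counts, and terms are grouped by rounded (a, b) as a simple stand-in for the paper's clustering step. The function names and the toy corpus are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pvariance


def term_curve_params(docs):
    """Map each term to (a, b), ASSUMED to be the (mean, variance) of its per-document counts."""
    vocab = sorted({term for doc in docs for term in doc})
    params = {}
    for term in vocab:
        counts = [doc.count(term) for doc in docs]
        params[term] = (mean(counts), pvariance(counts))
    return params


def discriminant(a, b):
    """Discriminant of y^2 = x^3 + a*x + b; zero means the curve is singular."""
    return -16 * (4 * a**3 + 27 * b**2)


def group_by_params(params, ndigits=2):
    """Group terms whose rounded (a, b) coincide (a stand-in for real clustering)."""
    groups = defaultdict(list)
    for term, (a, b) in params.items():
        groups[(round(a, ndigits), round(b, ndigits))].append(term)
    return dict(groups)


# Hypothetical toy corpus: each document is a list of tokens.
docs = [
    "elliptic curve term curve".split(),
    "term corpus".split(),
    "corpus term vocabulary".split(),
]
params = term_curve_params(docs)
groups = group_by_params(params)
```

In this toy corpus, "term" occurs exactly once in every document, so its variance is zero, while "elliptic" and "vocabulary" each occur once in a single document and therefore share the same (a, b) pair and land in the same group.
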
Year | DOI | Venue |
---|---|---|
2011 | 10.1007/978-3-642-21587-2_37 | MCPR |
Keywords | Field | DocType |
original document collection,discriminant term,elliptic curve,high contribution,corpus vocabulary,corpus representation,textual corpus,corpus term,term discrimination,elliptic curve parameter,possible term contribution | Discriminant,Computer science,Arithmetic,Term Discrimination,Elliptic curve cryptography,Vocabulary,Elliptic curve | Conference |
Citations | PageRank | References |
0 | 0.34 | 3 |
Authors | ||
6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Darnes Vilariño | 1 | 43 | 19.68 |
David Pinto | 2 | 26 | 7.99 |
Carlos Balderas | 3 | 6 | 2.12 |
Mireya Tovar | 4 | 31 | 15.59 |
Beatriz Beltrán | 5 | 12 | 11.33 |
Sofia Paniagua | 6 | 0 | 0.34 |