Abstract |
---|
In this paper we present a modified cosine similarity metric that makes features more discriminative. The new metric is defined via linear transformations of the original feature space into a space in which the samples are better separated. These transformations are learned from a set of constraints representing available domain knowledge by solving related optimization problems. We present results on two natural language call routing datasets that show significant improvements of 3% to 5% absolute in the purity of clusters obtained in an unsupervised fashion. |

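
The abstract does not state the metric's formula or how the transformations are learned. As a minimal sketch, assuming the modified similarity is the ordinary cosine computed after applying one linear map to both feature vectors, the idea might look like the following. The function names, the toy vectors, and the hand-picked matrix `stretch` are hypothetical illustrations, not the authors' learned transformations.

```python
import math


def transform(matrix, vector):
    """Apply a linear map (given as a list of rows) to a feature vector."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


def cosine(u, v):
    """Plain cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def transformed_cosine(matrix, x, y):
    """Cosine similarity measured after mapping both vectors with `matrix`."""
    return cosine(transform(matrix, x), transform(matrix, y))


# Hypothetical toy feature vectors (for example, TF-IDF weights of two utterances).
x = [1.0, 1.0]
y = [1.0, 0.0]

identity = [[1.0, 0.0], [0.0, 1.0]]  # leaves the feature space unchanged
stretch = [[1.0, 0.0], [0.0, 3.0]]   # assumed map that up-weights the second feature

plain = transformed_cosine(identity, x, y)    # 1/sqrt(2), about 0.7071
modified = transformed_cosine(stretch, x, y)  # 1/sqrt(10), about 0.3162
```

In this toy case the assumed map lowers the similarity of two vectors that differ in the second feature, which is the sense in which a suitable transformation could separate samples better. In the paper the transformations are learned from constraints rather than chosen by hand.
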
Year | Venue | Keywords |
---|---|---|
2011 | 12th Annual Conference of the International Speech Communication Association 2011 (INTERSPEECH 2011), Vols 1-5 | constrained clustering, cosine metric, SVM, TF-IDF |

Field | DocType | Citations |
---|---|---|
Pattern recognition, Computer science, Cosine Distance, Speech recognition, Artificial intelligence, Cluster analysis | Conference | 0 |

PageRank | References | Authors |
---|---|---|
0.34 | 1 | 4 |

Name | Order | Citations | PageRank |
---|---|---|---|
Leonid Rachevsky | 1 | 19 | 2.54 |
Dimitri Kanevsky | 2 | 477 | 54.37 |
Ruhi Sarikaya | 3 | 698 | 64.49 |
Bhuvana Ramabhadran | 4 | 1779 | 153.83 |