Abstract | ||
---|---|---|
In this paper we describe the winning model for the performance measure "lowest ranked homologous sequence" (RKL), a subtask of the Protein Homology Prediction task of the KDD Cup 2004. The goal of the task was to predict protein homology under several performance metrics. The given data was organized in blocks, each of which corresponds to a specific native sequence. Two of the metrics, average precision (APR) and RKL, explicitly make use of this block structure. Our solution consists of two parts. The first is a global classification SVM that is not aware of the block structure. The second is a k-nearest-neighbor scheme for block similarity, used to train ranking SVMs on the fly. Furthermore, we sketch our approach to optimizing the root mean squared error and report some alternative solutions that turned out to be suboptimal. |
Year | DOI | Venue |
---|---|---|
2004 | 10.1145/1046456.1046477 | SIGKDD Explorations |
Keywords | Field | DocType |
kdd cup,different performance metrics,block structure,protein homology prediction task,global classification,alternative solution,specific native sequence,block similarity,performance measure,metrics average precision,protein homology task,root mean square error | Data mining,Block structure,Ranking,Computer science,Support vector machine,On the fly,Artificial intelligence,Homology (biology),Artificial neural network,Machine learning,Sketch | Journal |
Volume | Issue | Citations |
6 | 2 | 1 |
PageRank | References | Authors |
0.36 | 3 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Christophe Foussette | 1 | 5 | 1.10 |
Daniel Hakenjos | 2 | 1 | 0.36 |
Martin Scholz | 3 | 544 | 45.31 |
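
The abstract notes that APR and RKL depend on the block structure of the data. As a rough illustration of what that means, the sketch below computes both measures per block and then averages over blocks. It assumes that RKL is the rank of the lowest-ranked homolog within a block and that APR is the usual average precision within a block; the function names, the `(block_id, score, label)` row layout, and the tie handling (stable sort) are assumptions made for this example and may differ from the official evaluation.

```python
from collections import defaultdict


def rank_of_last_homolog(scores, labels):
    """Rank (1 = best) of the lowest-ranked positive example in one block.

    Assumes the block contains at least one positive (label == 1).
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [rank for rank, i in enumerate(order, start=1) if labels[i] == 1]
    return max(ranks)


def average_precision(scores, labels):
    """Average precision of one block: mean precision at each positive's rank.

    Assumes the block contains at least one positive (label == 1).
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits


def blockwise_mean(metric, rows):
    """Group (block_id, score, label) rows by block and average the metric."""
    blocks = defaultdict(lambda: ([], []))
    for block_id, score, label in rows:
        blocks[block_id][0].append(score)
        blocks[block_id][1].append(label)
    values = [metric(s, l) for s, l in blocks.values()]
    return sum(values) / len(values)


# Toy data (made up for illustration): two blocks, labels 1 = homolog.
rows = [
    ("a", 0.9, 1), ("a", 0.8, 0), ("a", 0.1, 1),
    ("b", 0.7, 0), ("b", 0.6, 1),
]
mean_rkl = blockwise_mean(rank_of_last_homolog, rows)
mean_apr = blockwise_mean(average_precision, rows)
```

Because both measures are computed inside each block, only the relative order of scores within a block matters. Under that assumption, a per-block ranking model can improve them even when its scores are not comparable across blocks.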