Abstract |
---|
Motivated by the enormous amounts of data collected in a large IT service provider organization, this paper presents a method for quickly and automatically summarizing the data and extracting meaningful insights from it. Termed Clustered Subset Selection (CSS), our method enables program-guided exploration of high-dimensional data matrices. CSS combines clustering and subset selection into a coherent and intuitive method for data analysis. In addition to a general framework, we introduce a family of CSS algorithms with different clustering components, such as k-means and Close-to-Rank-One (CRO) clustering, and different subset selection components, such as best rank-one approximation and Rank-Revealing QR (RRQR) decomposition. Empirically, we show that CSS achieves significant improvements over existing subset selection methods in terms of approximation error. Compared to existing subset selection techniques, CSS also provides additional insight into clusters and cluster representatives. Finally, we present a case study of program-guided data exploration using CSS on a large collection of IT service delivery data. |
Year | DOI | Venue |
---|---|---|
2008 | 10.1145/1458082.1458162 | CIKM |
Keywords | Field | DocType |
css algorithm,clustered subset selection,data analysis,subset selection technique,program-guided data,subset selection component,program-guided data exploration,subset selection method,high-dimensional data matrix,service metrics,it service delivery data,termed clustered subset selection,approximation error,clustering,k means,data collection,service provider,automatic summarization,high dimensional data,service delivery | Data collection,Data mining,Information retrieval,Matrix (mathematics),Computer science,Service provider,Cluster analysis,Service delivery framework | Conference |
Citations | PageRank | References |
3 | 0.42 | 23 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Christos Boutsidis | 1 | 610 | 33.37 |
Jimeng Sun | 2 | 4729 | 240.91 |
Nikos Anerousis | 3 | 240 | 34.23 |
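
The abstract describes CSS as clustering followed by subset selection. As a rough illustration only, the sketch below shows one possible reading of that idea: cluster the columns of a matrix with k-means, then keep one representative column per cluster based on the cluster's best rank-one approximation. All function names, the cosine-alignment scoring rule, and the projection-based error measure are assumptions made for this example. They are not taken from the paper, and the CRO and RRQR variants are not shown.

```python
import numpy as np

def kmeans_columns(A, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means over the columns of A; returns one label per column."""
    rng = np.random.default_rng(seed)
    X = np.asarray(A, dtype=float).T  # one row per column of A
    centers = X[rng.choice(X.shape[0], size=k, replace=False)]
    labels = np.zeros(X.shape[0], dtype=int)
    for _ in range(n_iter):
        # Squared distance from every point to every center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return labels

def rank_one_representative(A, cols):
    """Assumed rule: pick the column in `cols` best aligned (by cosine)
    with the top left singular vector of the cluster's submatrix."""
    sub = A[:, cols]
    u = np.linalg.svd(sub, full_matrices=False)[0][:, 0]
    norms = np.linalg.norm(sub, axis=0)
    norms[norms == 0] = 1.0  # avoid dividing by zero for all-zero columns
    scores = np.abs(u @ sub) / norms
    return int(cols[int(scores.argmax())])

def clustered_subset_selection(A, k, seed=0):
    """Cluster the columns of A, then select one representative per non-empty cluster."""
    A = np.asarray(A, dtype=float)
    labels = kmeans_columns(A, k, seed=seed)
    reps = []
    for j in range(k):
        cols = np.flatnonzero(labels == j)
        if cols.size:
            reps.append(rank_one_representative(A, cols))
    return sorted(reps), labels

def approximation_error(A, reps):
    """Frobenius-norm error of projecting A onto the span of the selected columns."""
    A = np.asarray(A, dtype=float)
    C = A[:, reps]
    return float(np.linalg.norm(A - C @ np.linalg.pinv(C) @ A, "fro"))
```

Calling `clustered_subset_selection(A, k)` returns the selected column indices together with the cluster label of every column, which is the kind of extra information about clusters and their representatives that the abstract mentions. `approximation_error(A, reps)` then gives a single number for comparing different selections.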