Title | ||
---|---|---|
CUSTODES: automatic spreadsheet cell clustering and smell detection using strong and weak features. |
Abstract | ||
---|---|---|
Various techniques have been proposed to detect smells in spreadsheets, which are susceptible to errors. These techniques typically detect spreadsheet smells through a mechanism based on a fixed set of patterns or metric thresholds. Unlike conventional programs, tabulation styles vary greatly across spreadsheets. Smell detection based on fixed patterns or metric thresholds, which are insensitive to the varying tabulation styles, can miss many smells in one spreadsheet while reporting many spurious smells in another. In this paper, we propose CUSTODES to effectively cluster spreadsheet cells and detect smells in these clusters. The clustering mechanism can automatically adapt to the tabulation styles of each spreadsheet using strong and weak features. These strong and weak features capture the invariant and variant parts of tabulation styles, respectively. As smelly cells in a spreadsheet are normally in the minority, they can be mechanically detected as clusters' outliers in feature spaces. We implemented and applied CUSTODES to 70 spreadsheet files randomly sampled from the EUSES corpus. These spreadsheets contain 1,610 formula cell clusters. Experimental results confirmed that CUSTODES is effective. It successfully detected harmful smells that can induce computation anomalies in spreadsheets with an F-measure of 0.72, outperforming state-of-the-art techniques. |
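The abstract's observation that smelly cells are a minority within their cluster can be illustrated with a minimal Python sketch. This is not the CUSTODES algorithm: the function name `minority_outliers`, the 0.5 threshold, the single categorical feature (an R1C1-style formula string), and the sample cells are all assumptions made for illustration only.

```python
from collections import Counter


def minority_outliers(cluster, threshold=0.5):
    """Return addresses of cells whose feature value is shared by fewer
    than `threshold` of the cells in the cluster.

    `cluster` maps a cell address to one categorical feature value.
    Hypothetical rule for illustration, not the paper's detector.
    """
    counts = Counter(cluster.values())
    total = len(cluster)
    return sorted(
        addr for addr, feat in cluster.items()
        if counts[feat] / total < threshold
    )


# Hypothetical cluster: one column of formulas in R1C1-style relative form.
cluster = {
    "D2": "RC[-2]*RC[-1]",
    "D3": "RC[-2]*RC[-1]",
    "D4": "RC[-2]*RC[-1]",
    "D5": "RC[-2]+RC[-1]",  # differs from the rest of the column
    "D6": "RC[-2]*RC[-1]",
}

smelly = minority_outliers(cluster)
print(smelly)  # ['D5']
```

A fixed threshold like the one above is exactly the kind of rigid rule the abstract criticizes; the sketch only shows the "outlier within a cluster" idea, not how the clusters or feature spaces are built.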
Year | DOI | Venue |
---|---|---|
2016 | 10.1145/2884781.2884796 | ICSE |
Keywords | Field | DocType |
Spreadsheets, cell clustering, smell detection, feature modeling, end-user programming | Data mining, Computer science, Outlier, Feature extraction, Software, Cluster analysis, Spurious relationship, Feature modeling, Computation | Conference |
ISSN | ISBN | Citations |
0270-5257 | 978-1-4503-3900-1 | 16 |
PageRank | References | Authors |
0.56 | 32 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
S. C. Cheung | 1 | 2657 | 162.89 |
Wanjun Chen | 2 | 16 | 0.56 |
Yepang Liu | 3 | 415 | 24.58 |
Chang Xu | 4 | 487 | 36.94 |